Re: [VOTE] The first HBase 1.4.2 release candidate (RC0) is available

2018-02-18 Thread Chia-Ping Tsai
> It will be a month until 1.4.3. Can this wait? Sounds bad but we already have 
> two releases out there where this didn't come up? Maybe nobody is making use 
> of BufferedMutator and 1.4 yet?
Got it. Let us concentrate on 1.4.2 rc.

On 2018/02/18 19:10:16, Andrew Purtell  wrote: 
> It will be a month until 1.4.3. Can this wait? Sounds bad but we already have 
> two releases out there where this didn't come up? Maybe nobody is making use 
> of BufferedMutator and 1.4 yet?
> 
> If you would like to veto this RC for this reason that's no problem, in which 
> case I will pick up the latest changes to branch-1.4 and try again with RC1 
> later this week. 
> 
> 
> > On Feb 18, 2018, at 1:54 AM, Chia-Ping Tsai  wrote:
> > 
> > hi Andrew,
> > 
> > Could we add HBASE-20017 to 1.4.2? The issue is related to data loss.
> > 
> >> On 2018/02/16 21:46:17, Andrew Purtell  wrote: 
> >> The first HBase 1.4.2 release candidate (RC0) is available for download at
> >> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.4.2RC0/ and Maven
> >> artifacts are available in the temporary repository
> >> https://repository.apache.org/content/repositories/orgapachehbase-1195/ .
> >> 
> >> The git tag corresponding to the candidate is '1.4.2RC0' (9519ec2ead).
> >> 
> >> A detailed source and binary compatibility report for this release is
> >> available for your review at
> >> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.4.2RC0/compat-check-report.html .
> >> 
> >> A list of the 19 issues resolved in this release can be found at
> >> https://s.apache.org/aGcb .
> >> 
> >> Please try out the candidate and vote +1/0/-1.
> >> 
> >> This vote will be open for at least 72 hours. Unless there are objections, I
> >> will try to close it Friday February 23, 2018 if we have sufficient votes.
> >> 
> >> Prior to making this announcement I made the following preflight checks:
> >> 
> >>   - RAT check passes (7u80)
> >>   - Unit test suite passes (8u131)
> >>   - LTT load 1M rows with 100% verification and 20% updates (8u131)
> >>   - PE sequentialWrite, sequentialRead, randomWrite, randomRead,
> >>   scanRange100 (8u131)
> >>   - ITBLL Loop 1 100M rows (8u131)
> >> 
> >> 
> >> -- 
> >> Best regards,
> >> Andrew
> >> 
> 


[jira] [Created] (HBASE-20018) Safe online META repair

2018-02-18 Thread Andrew Purtell (JIRA)
Andrew Purtell created HBASE-20018:
--

 Summary: Safe online META repair
 Key: HBASE-20018
 URL: https://issues.apache.org/jira/browse/HBASE-20018
 Project: HBase
  Issue Type: New Feature
  Components: hbck
Reporter: Andrew Purtell


HBCK is a tank, or a giant shotgun, or choose the battlefield metaphor you feel 
is most appropriate. It rolls onto the field and leaves problems crushed in its 
wake, but if you point it in the wrong direction, it will crush your production 
data too. As such it is a means of last resort to fix an ailing cluster. It is 
also imperative that user request traffic, writes in particular, is stopped 
before attempting a number of the fixes. It is unlikely the default "-repair" 
option is what you want: it turns on too many fixes to risk at one time. There 
are a large number of command line switches for individual checks and fixes, 
which are very useful but also error prone when cobbling together a command 
line for a cluster fix under pressure. An operations team might hesitate to 
employ hbck to fix some accumulating bad state, because of the disruption its 
use requires and the risk of compounding the problem if it is not done 
carefully. That of course would be bad, because the accumulating bad state will 
eventually have an availability impact. 

It should be safer to use hbck, but changing hbck also carries risk. We can 
leave it be as the useful (but dangerous) tool it is and focus on making a 
subset of its functionality safer.

There is a class of META corruptions of mild to moderate severity which could 
in theory be handled more safely in an online manner without requiring a 
suspension of user traffic. Some things hbck does are safe enough to use 
directly for this. Others need tweaks to do more preflight checks (like 
checking region states) first. Develop these as a separate tool, maybe even a 
new HMaster or Admin component.

Look for opportunities to share code with existing hbck, via refactor into a 
shared library. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [VOTE] The first HBase 1.4.2 release candidate (RC0) is available

2018-02-18 Thread Andrew Purtell
It will be a month until 1.4.3. Can this wait? Sounds bad but we already have 
two releases out there where this didn't come up? Maybe nobody is making use of 
BufferedMutator and 1.4 yet?

If you would like to veto this RC for this reason that's no problem, in which 
case I will pick up the latest changes to branch-1.4 and try again with RC1 
later this week. 


> On Feb 18, 2018, at 1:54 AM, Chia-Ping Tsai  wrote:
> 
> hi Andrew,
> 
> Could we add HBASE-20017 to 1.4.2? The issue is related to data loss.
> 
>> On 2018/02/16 21:46:17, Andrew Purtell  wrote: 
>> The first HBase 1.4.2 release candidate (RC0) is available for download at
>> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.4.2RC0/ and Maven
>> artifacts are available in the temporary repository
>> https://repository.apache.org/content/repositories/orgapachehbase-1195/ .
>> 
>> The git tag corresponding to the candidate is '1.4.2RC0' (9519ec2ead).
>> 
>> A detailed source and binary compatibility report for this release is
>> available for your review at
>> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.4.2RC0/compat-check-report.html .
>> 
>> A list of the 19 issues resolved in this release can be found at
>> https://s.apache.org/aGcb .
>> 
>> Please try out the candidate and vote +1/0/-1.
>> 
>> This vote will be open for at least 72 hours. Unless there are objections, I
>> will try to close it Friday February 23, 2018 if we have sufficient votes.
>> 
>> Prior to making this announcement I made the following preflight checks:
>> 
>>   - RAT check passes (7u80)
>>   - Unit test suite passes (8u131)
>>   - LTT load 1M rows with 100% verification and 20% updates (8u131)
>>   - PE sequentialWrite, sequentialRead, randomWrite, randomRead,
>>   scanRange100 (8u131)
>>   - ITBLL Loop 1 100M rows (8u131)
>> 
>> 
>> -- 
>> Best regards,
>> Andrew
>> 


Re: [VOTE] The first HBase 1.4.2 release candidate (RC0) is available

2018-02-18 Thread Chia-Ping Tsai
hi Andrew,

Could we add HBASE-20017 to 1.4.2? The issue is related to data loss.

On 2018/02/16 21:46:17, Andrew Purtell  wrote: 
> The first HBase 1.4.2 release candidate (RC0) is available for download at
> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.4.2RC0/ and Maven
> artifacts are available in the temporary repository
> https://repository.apache.org/content/repositories/orgapachehbase-1195/ .
> 
> The git tag corresponding to the candidate is '1.4.2RC0' (9519ec2ead).
> 
> A detailed source and binary compatibility report for this release is
> available for your review at
> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.4.2RC0/compat-check-report.html .
> 
> A list of the 19 issues resolved in this release can be found at
> https://s.apache.org/aGcb .
> 
> Please try out the candidate and vote +1/0/-1.
> 
> This vote will be open for at least 72 hours. Unless there are objections, I
> will try to close it Friday February 23, 2018 if we have sufficient votes.
> 
> Prior to making this announcement I made the following preflight checks:
> 
>   - RAT check passes (7u80)
>   - Unit test suite passes (8u131)
>   - LTT load 1M rows with 100% verification and 20% updates (8u131)
>   - PE sequentialWrite, sequentialRead, randomWrite, randomRead,
>     scanRange100 (8u131)
>   - ITBLL Loop 1 100M rows (8u131)
> 
> 
> -- 
> Best regards,
> Andrew
> 


[jira] [Created] (HBASE-20017) BufferedMutatorImpl submit the same mutation repeatedly

2018-02-18 Thread Chia-Ping Tsai (JIRA)
Chia-Ping Tsai created HBASE-20017:
--

 Summary: BufferedMutatorImpl submit the same mutation repeatedly
 Key: HBASE-20017
 URL: https://issues.apache.org/jira/browse/HBASE-20017
 Project: HBase
  Issue Type: Bug
  Components: Client
Reporter: Chia-Ping Tsai
Assignee: Chia-Ping Tsai
 Fix For: 2.0.0, 1.5.0, 1.4.3


BufferedMutatorImpl (BMI) passes an iterator over its inner buffer to 
AsyncProcess so that it can take the unprocessed mutations. AsyncProcess calls 
iter#next to get a mutation and then calls iter#remove to delete that mutation 
from the inner buffer. The two calls are not atomic, so there is a good chance 
that the same mutation is processed repeatedly when several threads are running 
the flush at the same time.
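
The next-then-remove race described above can be reproduced in isolation. The 
following is only a sketch, not HBase code: it assumes the inner buffer behaves 
like a java.util.concurrent.ConcurrentLinkedQueue, uses strings in place of 
mutations, and replays the two-thread interleaving on a single thread so the 
outcome is repeatable. The class and method names are made up for illustration, 
and the poll-based variant is shown only as one possible alternative, not 
necessarily the fix adopted for this issue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical stand-in for the buffer handling described in this issue.
final class DuplicateSubmitSketch {

    // Racy pattern: each flusher holds its own iterator over the shared buffer
    // and takes an element with next() followed by remove().
    static List<String> takeWithTwoIterators(ConcurrentLinkedQueue<String> buffer) {
        List<String> submitted = new ArrayList<>();
        Iterator<String> flusherA = buffer.iterator();
        Iterator<String> flusherB = buffer.iterator();

        String a = flusherA.next(); // A reads the head of the buffer
        String b = flusherB.next(); // B reads the same head before A removes it
        flusherA.remove();          // A deletes the element
        flusherB.remove();          // the element is already gone, nothing changes

        submitted.add(a);
        submitted.add(b);
        return submitted;
    }

    // Alternative: poll() reads and removes the head in one atomic step,
    // so two flushers can never be handed the same element.
    static List<String> takeWithPoll(ConcurrentLinkedQueue<String> buffer) {
        List<String> submitted = new ArrayList<>();
        String a = buffer.poll(); // taken by flusher A
        String b = buffer.poll(); // taken by flusher B
        if (a != null) {
            submitted.add(a);
        }
        if (b != null) {
            submitted.add(b);
        }
        return submitted;
    }

    public static void main(String[] args) {
        List<String> pending = Arrays.asList("put-1", "put-2");
        System.out.println(takeWithTwoIterators(new ConcurrentLinkedQueue<>(pending)));
        System.out.println(takeWithPoll(new ConcurrentLinkedQueue<>(pending)));
    }
}
```

In this replay the iterator variant submits "put-1" twice and leaves "put-2" in 
the buffer, while the poll variant submits each entry exactly once.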


