[jira] [Commented] (HBASE-21259) [amv2] Revived deadservers; recreated serverstatenode

2018-10-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16646004#comment-16646004
 ] 

Hadoop QA commented on HBASE-21259:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-2.1 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
11s{color} | {color:green} branch-2.1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
47s{color} | {color:green} branch-2.1 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
12s{color} | {color:green} branch-2.1 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
32s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
54s{color} | {color:green} branch-2.1 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} branch-2.1 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
42s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
19s{color} | {color:red} hbase-server: The patch generated 1 new + 113 
unchanged - 1 fixed = 114 total (was 114) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
49s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
9m 35s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 
or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}115m 
23s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}152m 44s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:42ca976 |
| JIRA Issue | HBASE-21259 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12943348/HBASE-21259.branch-2.1.006.patch
 |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  
shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 90356ce0625b 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 
10:45:36 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | branch-2.1 / e283963533 |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14642/artifact/patchprocess/diff-checkstyle-hbase-server.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14642/testReport/ |
| Max. process+thread count | 4672 (vs. ulimit of 1) |
| modules | C: hbase-server U: 

[jira] [Commented] (HBASE-21246) Introduce WALIdentity interface

2018-10-10 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16646005#comment-16646005
 ] 

stack commented on HBASE-21246:
---

I thought WALIdentity would be an opaque class that we'd pass around. It'd be 
its own specific Type. Here it seems WALIdentity is a String.

If its private, it can't be evolving

24  @InterfaceAudience.Private
25  @InterfaceStability.Evolving

How does a WALIdentity float free of a WAL Instance? You get one from the 
WALProvider? What do you do with it then? What is the wal param we pass? Why I 
need that?




> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: HBASE-20952
>
> Attachments: 21246.003.patch, 21246.HBASE-20952.001.patch, 
> 21246.HBASE-20952.002.patch, 21246.HBASE-20952.004.patch, 
> 21246.HBASE-20952.005.patch
>
>
> We are introducing WALIdentity interface so that the WAL representation can 
> be decoupled from distributed filesystem.
> The interface provides getName method whose return value can represent 
> filename in distributed filesystem environment or, the name of the stream 
> when the WAL is backed by log stream.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21288) HostingServer in UnassignProcedure is not accurate

2018-10-10 Thread Allan Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allan Yang updated HBASE-21288:
---
Description: 
We have a case that a region shows status OPEN on a already dead server in meta 
table(it is hard to trace how this happen), meaning this region is actually not 
online. But balance came and scheduled a MoveReionProcedure for this region, 
which created a mess:
The balancer 'thought' this region was on the server which has the same 
address(but with different startcode). So it schedules a MRP from this online 
server to another, but the UnassignProcedure dispatch the unassign call to the 
dead server according to regionstate, which then found the server dead and 
schedule a SCP for the dead server. But since the UnassignProcedure's 
hostingServer is not accurate, the SCP can't interrupt it.
So, in the end, the SCP can't finish since the UnassignProcedure has the 
region' lock, the UnassignProcedure can not finish since no one wake it, thus 
stuck.

Here is log, notice that the server of the UnassignProcedure is 
'hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539153278584' but it was dispatch 
to 'hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964'

{code}
2018-10-10 14:34:50,011 INFO  [PEWorker-4] 
assignment.RegionTransitionProcedure(252): Dispatch pid=13, ppid=12, 
state=RUNNABLE:REGION_TRANSITION_DISPATCH, hasLock=true; UnassignProcedure 
table=hbase:acl, region=267335c85766c62479fb4a5f18a1e95f, 
server=hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539153278584; rit=CLOSING, 
location=hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964
2018-10-10 14:34:50,011 WARN  [PEWorker-4] 
assignment.RegionTransitionProcedure(230): Remote call failed 
hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964; pid=13, ppid=12, 
state=RUNNABLE:REGION_TRANSITION_DISPATCH, hasLock=true; UnassignProcedure 
table=hbase:acl, region=267335c85766c62479fb4a5f18a1e95f, 
server=hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539153278584; rit=CLOSING, 
location=hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964; 
exception=NoServerDispatchException
org.apache.hadoop.hbase.procedure2.NoServerDispatchException: 
hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964; pid=13, ppid=12, 
state=RUNNABLE:REGION_TRANSITION_DISPATCH, hasLock=true; UnassignProcedure 
table=hbase:acl, region=267335c85766c62479fb4a5f18a1e95f, 
server=hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539153278584

//Then a SCP was scheduled
2018-10-10 14:34:50,012 WARN  [PEWorker-4] master.ServerManager(635): 
Expiration of hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964 but 
server not online
2018-10-10 14:34:50,012 INFO  [PEWorker-4] master.ServerManager(615): 
Processing expiration of hb-uf6oyi699w8h700f0-003.hbase.rds. 
,16020,1539076734964 on hb-uf6oyi699w8h700f0-001.hbase.rds. ,16000,1539088156164
2018-10-10 14:34:50,017 DEBUG [PEWorker-4] procedure2.ProcedureExecutor(1089): 
Stored pid=14, state=RUNNABLE:SERVER_CRASH_START, hasLock=false; 
ServerCrashProcedure server=hb-uf6oyi699w8h700f0-003.hbase.rds. 
,16020,1539076734964, splitWal=true, meta=false

//The SCP did not interrupt the UnassignProcedure but schedule new 
AssignProcedure for this region
2018-10-10 14:34:50,043 DEBUG [PEWorker-6] procedure.ServerCrashProcedure(250): 
Done splitting WALs pid=14, state=RUNNABLE:SERVER_CRASH_SPLIT_LOGS, 
hasLock=true; ServerCrashProcedure server=hb-uf6oyi699w8h700f0-003.hbase.rds. 
,16020,1539076734964, splitWal=true, meta=false
2018-10-10 14:34:50,054 INFO  [PEWorker-8] procedure2.ProcedureExecutor(1691): 
Initialized subprocedures=[{pid=15, ppid=14, 
state=RUNNABLE:REGION_TRANSITION_QUEUE, hasLock=false; AssignProcedure 
table=hbase:acl, region=267335c85766c62479fb4a5f18a1e95f}, {pid=16, ppid=14, 
state=RUNNABLE:REGION_TRANSITION_QUEUE, hasLock=false; AssignProcedure 
table=hbase:req_intercept_rule, region=460481706415d776b3742f428a6f579b}, 
{pid=17, ppid=14, state=RUNNABLE:REGION_TRANSITION_QUEUE, hasLock=false; 
AssignProcedure table=hbase:namespace, region=ec7a965e7302840120a5d8289947c40b}]
{code}


Here I also added a safe fence in balancer, if such regions are found, 
balancing is skipped for safe.It should do no harm.

  was:
We have a case that a region shows status OPEN on a already dead server in meta 
table(it is hard to trace how this happen), meaning this region is actually not 
online. But balance came and scheduled a MoveReionProcedure for this region, 
which created a mess:
The balancer 'thought' this region was on the server which has the same 
address(but with different startcode). So it schedules a MRP from this online 
server to another, but the UnassignProcedure dispatch the unassign call to the 
dead server according to regionstate, which then found the server dead and 
schedulre a SCP for the dead server. But since the UnassignProcedure's 
hostingServer is not accurate, the SCP can't interrupt it.
So, in the 

[jira] [Commented] (HBASE-21288) HostingServer in UnassignProcedure is not accurate

2018-10-10 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645999#comment-16645999
 ] 

Allan Yang commented on HBASE-21288:


{quote}
should we return false directly from here?
{quote}
I want to print all the abnormal regions here.

> HostingServer in UnassignProcedure is not accurate
> --
>
> Key: HBASE-21288
> URL: https://issues.apache.org/jira/browse/HBASE-21288
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2, Balancer
>Affects Versions: 2.1.0, 2.0.2
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-21288.branch-2.0.001.patch
>
>
> We have a case that a region shows status OPEN on a already dead server in 
> meta table(it is hard to trace how this happen), meaning this region is 
> actually not online. But balance came and scheduled a MoveReionProcedure for 
> this region, which created a mess:
> The balancer 'thought' this region was on the server which has the same 
> address(but with different startcode). So it schedules a MRP from this online 
> server to another, but the UnassignProcedure dispatch the unassign call to 
> the dead server according to regionstate, which then found the server dead 
> and schedulre a SCP for the dead server. But since the UnassignProcedure's 
> hostingServer is not accurate, the SCP can't interrupt it.
> So, in the end, the SCP can't finish since the UnassignProcedure has the 
> region' lock, the UnassignProcedure can finish since no one wake it, thus 
> stuck.
> Here is log, notice that the server of the UnassignProcedure is 
> 'hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539153278584' but it was 
> dispatch to 'hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964'
> {code}
> 2018-10-10 14:34:50,011 INFO  [PEWorker-4] 
> assignment.RegionTransitionProcedure(252): Dispatch pid=13, ppid=12, 
> state=RUNNABLE:REGION_TRANSITION_DISPATCH, hasLock=true; UnassignProcedure 
> table=hbase:acl, region=267335c85766c62479fb4a5f18a1e95f, 
> server=hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539153278584; rit=CLOSING, 
> location=hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964
> 2018-10-10 14:34:50,011 WARN  [PEWorker-4] 
> assignment.RegionTransitionProcedure(230): Remote call failed 
> hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964; pid=13, ppid=12, 
> state=RUNNABLE:REGION_TRANSITION_DISPATCH, hasLock=true; UnassignProcedure 
> table=hbase:acl, region=267335c85766c62479fb4a5f18a1e95f, 
> server=hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539153278584; rit=CLOSING, 
> location=hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964; 
> exception=NoServerDispatchException
> org.apache.hadoop.hbase.procedure2.NoServerDispatchException: 
> hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964; pid=13, ppid=12, 
> state=RUNNABLE:REGION_TRANSITION_DISPATCH, hasLock=true; UnassignProcedure 
> table=hbase:acl, region=267335c85766c62479fb4a5f18a1e95f, 
> server=hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539153278584
> //Then a SCP was scheduled
> 2018-10-10 14:34:50,012 WARN  [PEWorker-4] master.ServerManager(635): 
> Expiration of hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964 but 
> server not online
> 2018-10-10 14:34:50,012 INFO  [PEWorker-4] master.ServerManager(615): 
> Processing expiration of hb-uf6oyi699w8h700f0-003.hbase.rds. 
> ,16020,1539076734964 on hb-uf6oyi699w8h700f0-001.hbase.rds. 
> ,16000,1539088156164
> 2018-10-10 14:34:50,017 DEBUG [PEWorker-4] 
> procedure2.ProcedureExecutor(1089): Stored pid=14, 
> state=RUNNABLE:SERVER_CRASH_START, hasLock=false; ServerCrashProcedure 
> server=hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964, 
> splitWal=true, meta=false
> //The SCP did not interrupt the UnassignProcedure but schedule new 
> AssignProcedure for this region
> 2018-10-10 14:34:50,043 DEBUG [PEWorker-6] 
> procedure.ServerCrashProcedure(250): Done splitting WALs pid=14, 
> state=RUNNABLE:SERVER_CRASH_SPLIT_LOGS, hasLock=true; ServerCrashProcedure 
> server=hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964, 
> splitWal=true, meta=false
> 2018-10-10 14:34:50,054 INFO  [PEWorker-8] 
> procedure2.ProcedureExecutor(1691): Initialized subprocedures=[{pid=15, 
> ppid=14, state=RUNNABLE:REGION_TRANSITION_QUEUE, hasLock=false; 
> AssignProcedure table=hbase:acl, region=267335c85766c62479fb4a5f18a1e95f}, 
> {pid=16, ppid=14, state=RUNNABLE:REGION_TRANSITION_QUEUE, hasLock=false; 
> AssignProcedure table=hbase:req_intercept_rule, 
> region=460481706415d776b3742f428a6f579b}, {pid=17, ppid=14, 
> state=RUNNABLE:REGION_TRANSITION_QUEUE, hasLock=false; AssignProcedure 
> table=hbase:namespace, region=ec7a965e7302840120a5d8289947c40b}]
> {code}
> Here I also added a safe fence in balancer, if such regions are found, 
> 

[jira] [Commented] (HBASE-21288) HostingServer in UnassignProcedure is not accurate

2018-10-10 Thread Xu Cang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645985#comment-16645985
 ] 

Xu Cang commented on HBASE-21288:
-

{quote}regionNotOnOnlineServer++;
{quote}
 

should we return false directly from here?

> HostingServer in UnassignProcedure is not accurate
> --
>
> Key: HBASE-21288
> URL: https://issues.apache.org/jira/browse/HBASE-21288
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2, Balancer
>Affects Versions: 2.1.0, 2.0.2
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-21288.branch-2.0.001.patch
>
>
> We have a case that a region shows status OPEN on a already dead server in 
> meta table(it is hard to trace how this happen), meaning this region is 
> actually not online. But balance came and scheduled a MoveReionProcedure for 
> this region, which created a mess:
> The balancer 'thought' this region was on the server which has the same 
> address(but with different startcode). So it schedules a MRP from this online 
> server to another, but the UnassignProcedure dispatch the unassign call to 
> the dead server according to regionstate, which then found the server dead 
> and schedulre a SCP for the dead server. But since the UnassignProcedure's 
> hostingServer is not accurate, the SCP can't interrupt it.
> So, in the end, the SCP can't finish since the UnassignProcedure has the 
> region' lock, the UnassignProcedure can finish since no one wake it, thus 
> stuck.
> Here is log, notice that the server of the UnassignProcedure is 
> 'hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539153278584' but it was 
> dispatch to 'hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964'
> {code}
> 2018-10-10 14:34:50,011 INFO  [PEWorker-4] 
> assignment.RegionTransitionProcedure(252): Dispatch pid=13, ppid=12, 
> state=RUNNABLE:REGION_TRANSITION_DISPATCH, hasLock=true; UnassignProcedure 
> table=hbase:acl, region=267335c85766c62479fb4a5f18a1e95f, 
> server=hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539153278584; rit=CLOSING, 
> location=hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964
> 2018-10-10 14:34:50,011 WARN  [PEWorker-4] 
> assignment.RegionTransitionProcedure(230): Remote call failed 
> hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964; pid=13, ppid=12, 
> state=RUNNABLE:REGION_TRANSITION_DISPATCH, hasLock=true; UnassignProcedure 
> table=hbase:acl, region=267335c85766c62479fb4a5f18a1e95f, 
> server=hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539153278584; rit=CLOSING, 
> location=hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964; 
> exception=NoServerDispatchException
> org.apache.hadoop.hbase.procedure2.NoServerDispatchException: 
> hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964; pid=13, ppid=12, 
> state=RUNNABLE:REGION_TRANSITION_DISPATCH, hasLock=true; UnassignProcedure 
> table=hbase:acl, region=267335c85766c62479fb4a5f18a1e95f, 
> server=hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539153278584
> //Then a SCP was scheduled
> 2018-10-10 14:34:50,012 WARN  [PEWorker-4] master.ServerManager(635): 
> Expiration of hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964 but 
> server not online
> 2018-10-10 14:34:50,012 INFO  [PEWorker-4] master.ServerManager(615): 
> Processing expiration of hb-uf6oyi699w8h700f0-003.hbase.rds. 
> ,16020,1539076734964 on hb-uf6oyi699w8h700f0-001.hbase.rds. 
> ,16000,1539088156164
> 2018-10-10 14:34:50,017 DEBUG [PEWorker-4] 
> procedure2.ProcedureExecutor(1089): Stored pid=14, 
> state=RUNNABLE:SERVER_CRASH_START, hasLock=false; ServerCrashProcedure 
> server=hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964, 
> splitWal=true, meta=false
> //The SCP did not interrupt the UnassignProcedure but schedule new 
> AssignProcedure for this region
> 2018-10-10 14:34:50,043 DEBUG [PEWorker-6] 
> procedure.ServerCrashProcedure(250): Done splitting WALs pid=14, 
> state=RUNNABLE:SERVER_CRASH_SPLIT_LOGS, hasLock=true; ServerCrashProcedure 
> server=hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964, 
> splitWal=true, meta=false
> 2018-10-10 14:34:50,054 INFO  [PEWorker-8] 
> procedure2.ProcedureExecutor(1691): Initialized subprocedures=[{pid=15, 
> ppid=14, state=RUNNABLE:REGION_TRANSITION_QUEUE, hasLock=false; 
> AssignProcedure table=hbase:acl, region=267335c85766c62479fb4a5f18a1e95f}, 
> {pid=16, ppid=14, state=RUNNABLE:REGION_TRANSITION_QUEUE, hasLock=false; 
> AssignProcedure table=hbase:req_intercept_rule, 
> region=460481706415d776b3742f428a6f579b}, {pid=17, ppid=14, 
> state=RUNNABLE:REGION_TRANSITION_QUEUE, hasLock=false; AssignProcedure 
> table=hbase:namespace, region=ec7a965e7302840120a5d8289947c40b}]
> {code}
> Here I also added a safe fence in balancer, if such regions are found, 
> balancing is skipped for 

[jira] [Commented] (HBASE-21278) TestMergeTableRegionsProcedure is flaky

2018-10-10 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645981#comment-16645981
 ] 

stack commented on HBASE-21278:
---

bq. It does not make sense to rollback a successful procedure right?

Agree.

bq. so we do not need to rollback the sub procedures...

How to do this, prevent PE calling rollback on subprocedures? TRSP just ignores 
the calls?

The 'throw new UnsupportedOperationException(this + " unhandled state=" + 
state);' needs changing too?



> TestMergeTableRegionsProcedure is flaky
> ---
>
> Key: HBASE-21278
> URL: https://issues.apache.org/jira/browse/HBASE-21278
> Project: HBase
>  Issue Type: Bug
>Reporter: Duo Zhang
>Priority: Major
> Attachments: 
> org.apache.hadoop.hbase.master.assignment.TestMergeTableRegionsProcedure-output.txt
>
>
> https://builds.apache.org/job/HBase-Flaky-Tests/job/master/1235/artifact/hbase-server/target/surefire-reports/org.apache.hadoop.hbase.master.assignment.TestMergeTableRegionsProcedure-output.txt/*view*/
> I think the problem is
> {noformat}
> 2018-10-08 03:44:30,315 INFO  [PEWorker-1] 
> procedure.MasterProcedureScheduler(689): pid=43, ppid=42, state=SUCCESS, 
> hasLock=false; TransitRegionStateProcedure 
> table=testRollbackAndDoubleExecution, 
> region=9bac7c539ac0cff6dc5706ed375a3bfb, UNASSIGN checking lock on 
> 9bac7c539ac0cff6dc5706ed375a3bfb
> 2018-10-08 03:44:30,320 ERROR [PEWorker-1] helpers.MarkerIgnoringBase(159): 
> CODE-BUG: Uncaught runtime exception for pid=43, ppid=42, state=SUCCESS, 
> hasLock=true; TransitRegionStateProcedure 
> table=testRollbackAndDoubleExecution, 
> region=9bac7c539ac0cff6dc5706ed375a3bfb, UNASSIGN
> java.lang.UnsupportedOperationException
>   at 
> org.apache.hadoop.hbase.master.assignment.TransitRegionStateProcedure.rollbackState(TransitRegionStateProcedure.java:458)
>   at 
> org.apache.hadoop.hbase.master.assignment.TransitRegionStateProcedure.rollbackState(TransitRegionStateProcedure.java:97)
>   at 
> org.apache.hadoop.hbase.procedure2.StateMachineProcedure.rollback(StateMachineProcedure.java:208)
>   at 
> org.apache.hadoop.hbase.procedure2.Procedure.doRollback(Procedure.java:957)
>   at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1605)
>   at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1567)
>   at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1446)
>   at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$900(ProcedureExecutor.java:76)
> {noformat}
> Typically there is no rollback for TRSP. Need to dig more.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21288) HostingServer in UnassignProcedure is not accurate

2018-10-10 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645979#comment-16645979
 ] 

stack commented on HBASE-21288:
---

bq. UnassignProcedure dispatch the unassign call to the dead server according 
to regionstate, which then found the server dead and schedulre a SCP for the 
dead server. But since the UnassignProcedure's hostingServer is not accurate, 
the SCP can't interrupt it.

Interesting. I saw similar where the server was long gone and an unassign would 
fail. We'd schedule the SCP which would create a ServerStateNode only the 
unassigned Procedure was not 'registered' on the newly-created SSN so we'd not 
cleanup the stuck UP. Over in HBASE-21259, it will not schedule the SCP if the 
server seems to be unknown -- not online, not in deadservers, not in fs -- and 
in this case the UP goes to finish.

Maybe I was experiencing some of what you describe here also.

Good find [~allan163].

bq. So, in the end, the SCP can't finish since the UnassignProcedure has the 
region' lock, the UnassignProcedure can finish since no one wake it, thus stuck.

hbck2 can fix this now. The region is 'suspended'. You can call bypass on it to 
clear it and release the lock. Let me push my fixes so you can use the tool too.

Regards the patch, looks good to me. Should we purge holdingServer since we go 
to RegionStateNode always now to get actual hosting server?

> HostingServer in UnassignProcedure is not accurate
> --
>
> Key: HBASE-21288
> URL: https://issues.apache.org/jira/browse/HBASE-21288
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2, Balancer
>Affects Versions: 2.1.0, 2.0.2
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-21288.branch-2.0.001.patch
>
>
> We have a case that a region shows status OPEN on a already dead server in 
> meta table(it is hard to trace how this happen), meaning this region is 
> actually not online. But balance came and scheduled a MoveReionProcedure for 
> this region, which created a mess:
> The balancer 'thought' this region was on the server which has the same 
> address(but with different startcode). So it schedules a MRP from this online 
> server to another, but the UnassignProcedure dispatch the unassign call to 
> the dead server according to regionstate, which then found the server dead 
> and schedulre a SCP for the dead server. But since the UnassignProcedure's 
> hostingServer is not accurate, the SCP can't interrupt it.
> So, in the end, the SCP can't finish since the UnassignProcedure has the 
> region' lock, the UnassignProcedure can finish since no one wake it, thus 
> stuck.
> Here is log, notice that the server of the UnassignProcedure is 
> 'hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539153278584' but it was 
> dispatch to 'hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964'
> {code}
> 2018-10-10 14:34:50,011 INFO  [PEWorker-4] 
> assignment.RegionTransitionProcedure(252): Dispatch pid=13, ppid=12, 
> state=RUNNABLE:REGION_TRANSITION_DISPATCH, hasLock=true; UnassignProcedure 
> table=hbase:acl, region=267335c85766c62479fb4a5f18a1e95f, 
> server=hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539153278584; rit=CLOSING, 
> location=hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964
> 2018-10-10 14:34:50,011 WARN  [PEWorker-4] 
> assignment.RegionTransitionProcedure(230): Remote call failed 
> hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964; pid=13, ppid=12, 
> state=RUNNABLE:REGION_TRANSITION_DISPATCH, hasLock=true; UnassignProcedure 
> table=hbase:acl, region=267335c85766c62479fb4a5f18a1e95f, 
> server=hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539153278584; rit=CLOSING, 
> location=hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964; 
> exception=NoServerDispatchException
> org.apache.hadoop.hbase.procedure2.NoServerDispatchException: 
> hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964; pid=13, ppid=12, 
> state=RUNNABLE:REGION_TRANSITION_DISPATCH, hasLock=true; UnassignProcedure 
> table=hbase:acl, region=267335c85766c62479fb4a5f18a1e95f, 
> server=hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539153278584
> //Then a SCP was scheduled
> 2018-10-10 14:34:50,012 WARN  [PEWorker-4] master.ServerManager(635): 
> Expiration of hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964 but 
> server not online
> 2018-10-10 14:34:50,012 INFO  [PEWorker-4] master.ServerManager(615): 
> Processing expiration of hb-uf6oyi699w8h700f0-003.hbase.rds. 
> ,16020,1539076734964 on hb-uf6oyi699w8h700f0-001.hbase.rds. 
> ,16000,1539088156164
> 2018-10-10 14:34:50,017 DEBUG [PEWorker-4] 
> procedure2.ProcedureExecutor(1089): Stored pid=14, 
> state=RUNNABLE:SERVER_CRASH_START, hasLock=false; ServerCrashProcedure 
> 

[jira] [Commented] (HBASE-21278) TestMergeTableRegionsProcedure is flaky

2018-10-10 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645978#comment-16645978
 ] 

Duo Zhang commented on HBASE-21278:
---

For MergeTableRegionsProcedure, we will schedule new TRSPs to bring the two 
regions online, you can see the code. This is natural I think, as the original 
TRSPs have been finished successfully. It does not make sense to rollback a 
successful procedure right? Most developers will not consider that the a 
successful procedure can still be rolled back...

> TestMergeTableRegionsProcedure is flaky
> ---
>
> Key: HBASE-21278
> URL: https://issues.apache.org/jira/browse/HBASE-21278
> Project: HBase
>  Issue Type: Bug
>Reporter: Duo Zhang
>Priority: Major
> Attachments: 
> org.apache.hadoop.hbase.master.assignment.TestMergeTableRegionsProcedure-output.txt
>
>
> https://builds.apache.org/job/HBase-Flaky-Tests/job/master/1235/artifact/hbase-server/target/surefire-reports/org.apache.hadoop.hbase.master.assignment.TestMergeTableRegionsProcedure-output.txt/*view*/
> I think the problem is
> {noformat}
> 2018-10-08 03:44:30,315 INFO  [PEWorker-1] 
> procedure.MasterProcedureScheduler(689): pid=43, ppid=42, state=SUCCESS, 
> hasLock=false; TransitRegionStateProcedure 
> table=testRollbackAndDoubleExecution, 
> region=9bac7c539ac0cff6dc5706ed375a3bfb, UNASSIGN checking lock on 
> 9bac7c539ac0cff6dc5706ed375a3bfb
> 2018-10-08 03:44:30,320 ERROR [PEWorker-1] helpers.MarkerIgnoringBase(159): 
> CODE-BUG: Uncaught runtime exception for pid=43, ppid=42, state=SUCCESS, 
> hasLock=true; TransitRegionStateProcedure 
> table=testRollbackAndDoubleExecution, 
> region=9bac7c539ac0cff6dc5706ed375a3bfb, UNASSIGN
> java.lang.UnsupportedOperationException
>   at 
> org.apache.hadoop.hbase.master.assignment.TransitRegionStateProcedure.rollbackState(TransitRegionStateProcedure.java:458)
>   at 
> org.apache.hadoop.hbase.master.assignment.TransitRegionStateProcedure.rollbackState(TransitRegionStateProcedure.java:97)
>   at 
> org.apache.hadoop.hbase.procedure2.StateMachineProcedure.rollback(StateMachineProcedure.java:208)
>   at 
> org.apache.hadoop.hbase.procedure2.Procedure.doRollback(Procedure.java:957)
>   at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1605)
>   at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1567)
>   at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1446)
>   at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$900(ProcedureExecutor.java:76)
> {noformat}
> Typically there is no rollback for TRSP. Need to dig more.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HBASE-21103) nightly test cache of yetus install needs to be more thorough in verification

2018-10-10 Thread Sean Busbey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey reassigned HBASE-21103:
---

Assignee: Sean Busbey

> nightly test cache of yetus install needs to be more thorough in verification
> -
>
> Key: HBASE-21103
> URL: https://issues.apache.org/jira/browse/HBASE-21103
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>Priority: Major
>  Labels: beginner
>
> branch-1.2 nightly failed because it couldn't find yetus:
> https://builds.apache.org/job/HBase%20Nightly/job/branch-1.2/443/
> walking through steps, we checked and thought we had an existing install:
> https://builds.apache.org/blue/organizations/jenkins/HBase%20Nightly/detail/branch-1.2/443/pipeline/12
> {code}
> [HBase_Nightly_branch-1.2-AEAO7LJNYKPIH3O4GB2IR4TEIHRS6EXOJCJPGWYXWO6BFPUEZT3Q]
>  Running shell script
> Ensure we have a copy of Apache Yetus.
> Checking for Yetus 0.6.0 in 
> '/home/jenkins/jenkins-slave/workspace/HBase_Nightly_branch-1.2-AEAO7LJNYKPIH3O4GB2IR4TEIHRS6EXOJCJPGWYXWO6BFPUEZT3Q/yetus-0.6.0'
> Reusing cached download of Apache Yetus version 0.6.0.
> {code}
> So we stashed and then tried to use it.
> Examining the workspace present before the stash shows the directory we look 
> for was present, but the contents were garbage:
> https://builds.apache.org/job/HBase%20Nightly/job/branch-1.2/443/execution/node/3/ws/
> {code}
> $ unzip -l yetus-0.6.0.zip 
> Archive:  yetus-0.6.0.zip
>   Length  DateTimeName
> -  -- -   
> 0  00-00-1980 04:08   yetus-0.6.0/lib/precommit/qbt.sh
> - ---
> 0 1 file
> {code}
> we should probably check for an executable that will successfully give us a 
> version or something like that.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work started] (HBASE-21103) nightly test cache of yetus install needs to be more thorough in verification

2018-10-10 Thread Sean Busbey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HBASE-21103 started by Sean Busbey.
---
> nightly test cache of yetus install needs to be more thorough in verification
> -
>
> Key: HBASE-21103
> URL: https://issues.apache.org/jira/browse/HBASE-21103
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>Priority: Major
>  Labels: beginner
>
> branch-1.2 nightly failed because it couldn't find yetus:
> https://builds.apache.org/job/HBase%20Nightly/job/branch-1.2/443/
> walking through steps, we checked and thought we had an existing install:
> https://builds.apache.org/blue/organizations/jenkins/HBase%20Nightly/detail/branch-1.2/443/pipeline/12
> {code}
> [HBase_Nightly_branch-1.2-AEAO7LJNYKPIH3O4GB2IR4TEIHRS6EXOJCJPGWYXWO6BFPUEZT3Q]
>  Running shell script
> Ensure we have a copy of Apache Yetus.
> Checking for Yetus 0.6.0 in 
> '/home/jenkins/jenkins-slave/workspace/HBase_Nightly_branch-1.2-AEAO7LJNYKPIH3O4GB2IR4TEIHRS6EXOJCJPGWYXWO6BFPUEZT3Q/yetus-0.6.0'
> Reusing cached download of Apache Yetus version 0.6.0.
> {code}
> So we stashed and then tried to use it.
> Examining the workspace present before the stash shows the directory we look 
> for was present, but the contents were garbage:
> https://builds.apache.org/job/HBase%20Nightly/job/branch-1.2/443/execution/node/3/ws/
> {code}
> $ unzip -l yetus-0.6.0.zip 
> Archive:  yetus-0.6.0.zip
>   Length  DateTimeName
> -  -- -   
> 0  00-00-1980 04:08   yetus-0.6.0/lib/precommit/qbt.sh
> - ---
> 0 1 file
> {code}
> we should probably check for an executable that will successfully give us a 
> version or something like that.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21288) HostingServer in UnassignProcedure is not accurate

2018-10-10 Thread Allan Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allan Yang updated HBASE-21288:
---
Attachment: HBASE-21288.branch-2.0.001.patch

> HostingServer in UnassignProcedure is not accurate
> --
>
> Key: HBASE-21288
> URL: https://issues.apache.org/jira/browse/HBASE-21288
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2, Balancer
>Affects Versions: 2.1.0, 2.0.2
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-21288.branch-2.0.001.patch
>
>
> We have a case that a region shows status OPEN on a already dead server in 
> meta table(it is hard to trace how this happen), meaning this region is 
> actually not online. But balance came and scheduled a MoveReionProcedure for 
> this region, which created a mess:
> The balancer 'thought' this region was on the server which has the same 
> address(but with different startcode). So it schedules a MRP from this online 
> server to another, but the UnassignProcedure dispatch the unassign call to 
> the dead server according to regionstate, which then found the server dead 
> and schedulre a SCP for the dead server. But since the UnassignProcedure's 
> hostingServer is not accurate, the SCP can't interrupt it.
> So, in the end, the SCP can't finish since the UnassignProcedure has the 
> region' lock, the UnassignProcedure can finish since no one wake it, thus 
> stuck.
> Here is log, notice that the server of the UnassignProcedure is 
> 'hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539153278584' but it was 
> dispatch to 'hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964'
> {code}
> 2018-10-10 14:34:50,011 INFO  [PEWorker-4] 
> assignment.RegionTransitionProcedure(252): Dispatch pid=13, ppid=12, 
> state=RUNNABLE:REGION_TRANSITION_DISPATCH, hasLock=true; UnassignProcedure 
> table=hbase:acl, region=267335c85766c62479fb4a5f18a1e95f, 
> server=hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539153278584; rit=CLOSING, 
> location=hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964
> 2018-10-10 14:34:50,011 WARN  [PEWorker-4] 
> assignment.RegionTransitionProcedure(230): Remote call failed 
> hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964; pid=13, ppid=12, 
> state=RUNNABLE:REGION_TRANSITION_DISPATCH, hasLock=true; UnassignProcedure 
> table=hbase:acl, region=267335c85766c62479fb4a5f18a1e95f, 
> server=hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539153278584; rit=CLOSING, 
> location=hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964; 
> exception=NoServerDispatchException
> org.apache.hadoop.hbase.procedure2.NoServerDispatchException: 
> hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964; pid=13, ppid=12, 
> state=RUNNABLE:REGION_TRANSITION_DISPATCH, hasLock=true; UnassignProcedure 
> table=hbase:acl, region=267335c85766c62479fb4a5f18a1e95f, 
> server=hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539153278584
> //Then a SCP was scheduled
> 2018-10-10 14:34:50,012 WARN  [PEWorker-4] master.ServerManager(635): 
> Expiration of hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964 but 
> server not online
> 2018-10-10 14:34:50,012 INFO  [PEWorker-4] master.ServerManager(615): 
> Processing expiration of hb-uf6oyi699w8h700f0-003.hbase.rds. 
> ,16020,1539076734964 on hb-uf6oyi699w8h700f0-001.hbase.rds. 
> ,16000,1539088156164
> 2018-10-10 14:34:50,017 DEBUG [PEWorker-4] 
> procedure2.ProcedureExecutor(1089): Stored pid=14, 
> state=RUNNABLE:SERVER_CRASH_START, hasLock=false; ServerCrashProcedure 
> server=hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964, 
> splitWal=true, meta=false
> //The SCP did not interrupt the UnassignProcedure but schedule new 
> AssignProcedure for this region
> 2018-10-10 14:34:50,043 DEBUG [PEWorker-6] 
> procedure.ServerCrashProcedure(250): Done splitting WALs pid=14, 
> state=RUNNABLE:SERVER_CRASH_SPLIT_LOGS, hasLock=true; ServerCrashProcedure 
> server=hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964, 
> splitWal=true, meta=false
> 2018-10-10 14:34:50,054 INFO  [PEWorker-8] 
> procedure2.ProcedureExecutor(1691): Initialized subprocedures=[{pid=15, 
> ppid=14, state=RUNNABLE:REGION_TRANSITION_QUEUE, hasLock=false; 
> AssignProcedure table=hbase:acl, region=267335c85766c62479fb4a5f18a1e95f}, 
> {pid=16, ppid=14, state=RUNNABLE:REGION_TRANSITION_QUEUE, hasLock=false; 
> AssignProcedure table=hbase:req_intercept_rule, 
> region=460481706415d776b3742f428a6f579b}, {pid=17, ppid=14, 
> state=RUNNABLE:REGION_TRANSITION_QUEUE, hasLock=false; AssignProcedure 
> table=hbase:namespace, region=ec7a965e7302840120a5d8289947c40b}]
> {code}
> Here I also added a safe fence in balancer, if such regions are found, 
> balancing is skipped for safe.It should do no harm.



--
This message was sent by Atlassian JIRA

[jira] [Updated] (HBASE-21288) HostingServer in UnassignProcedure is not accurate

2018-10-10 Thread Allan Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allan Yang updated HBASE-21288:
---
Status: Patch Available  (was: Open)

> HostingServer in UnassignProcedure is not accurate
> --
>
> Key: HBASE-21288
> URL: https://issues.apache.org/jira/browse/HBASE-21288
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2, Balancer
>Affects Versions: 2.0.2, 2.1.0
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-21288.branch-2.0.001.patch
>
>
> We have a case that a region shows status OPEN on a already dead server in 
> meta table(it is hard to trace how this happen), meaning this region is 
> actually not online. But balance came and scheduled a MoveReionProcedure for 
> this region, which created a mess:
> The balancer 'thought' this region was on the server which has the same 
> address(but with different startcode). So it schedules a MRP from this online 
> server to another, but the UnassignProcedure dispatch the unassign call to 
> the dead server according to regionstate, which then found the server dead 
> and schedulre a SCP for the dead server. But since the UnassignProcedure's 
> hostingServer is not accurate, the SCP can't interrupt it.
> So, in the end, the SCP can't finish since the UnassignProcedure has the 
> region' lock, the UnassignProcedure can finish since no one wake it, thus 
> stuck.
> Here is log, notice that the server of the UnassignProcedure is 
> 'hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539153278584' but it was 
> dispatch to 'hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964'
> {code}
> 2018-10-10 14:34:50,011 INFO  [PEWorker-4] 
> assignment.RegionTransitionProcedure(252): Dispatch pid=13, ppid=12, 
> state=RUNNABLE:REGION_TRANSITION_DISPATCH, hasLock=true; UnassignProcedure 
> table=hbase:acl, region=267335c85766c62479fb4a5f18a1e95f, 
> server=hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539153278584; rit=CLOSING, 
> location=hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964
> 2018-10-10 14:34:50,011 WARN  [PEWorker-4] 
> assignment.RegionTransitionProcedure(230): Remote call failed 
> hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964; pid=13, ppid=12, 
> state=RUNNABLE:REGION_TRANSITION_DISPATCH, hasLock=true; UnassignProcedure 
> table=hbase:acl, region=267335c85766c62479fb4a5f18a1e95f, 
> server=hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539153278584; rit=CLOSING, 
> location=hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964; 
> exception=NoServerDispatchException
> org.apache.hadoop.hbase.procedure2.NoServerDispatchException: 
> hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964; pid=13, ppid=12, 
> state=RUNNABLE:REGION_TRANSITION_DISPATCH, hasLock=true; UnassignProcedure 
> table=hbase:acl, region=267335c85766c62479fb4a5f18a1e95f, 
> server=hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539153278584
> //Then a SCP was scheduled
> 2018-10-10 14:34:50,012 WARN  [PEWorker-4] master.ServerManager(635): 
> Expiration of hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964 but 
> server not online
> 2018-10-10 14:34:50,012 INFO  [PEWorker-4] master.ServerManager(615): 
> Processing expiration of hb-uf6oyi699w8h700f0-003.hbase.rds. 
> ,16020,1539076734964 on hb-uf6oyi699w8h700f0-001.hbase.rds. 
> ,16000,1539088156164
> 2018-10-10 14:34:50,017 DEBUG [PEWorker-4] 
> procedure2.ProcedureExecutor(1089): Stored pid=14, 
> state=RUNNABLE:SERVER_CRASH_START, hasLock=false; ServerCrashProcedure 
> server=hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964, 
> splitWal=true, meta=false
> //The SCP did not interrupt the UnassignProcedure but schedule new 
> AssignProcedure for this region
> 2018-10-10 14:34:50,043 DEBUG [PEWorker-6] 
> procedure.ServerCrashProcedure(250): Done splitting WALs pid=14, 
> state=RUNNABLE:SERVER_CRASH_SPLIT_LOGS, hasLock=true; ServerCrashProcedure 
> server=hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964, 
> splitWal=true, meta=false
> 2018-10-10 14:34:50,054 INFO  [PEWorker-8] 
> procedure2.ProcedureExecutor(1691): Initialized subprocedures=[{pid=15, 
> ppid=14, state=RUNNABLE:REGION_TRANSITION_QUEUE, hasLock=false; 
> AssignProcedure table=hbase:acl, region=267335c85766c62479fb4a5f18a1e95f}, 
> {pid=16, ppid=14, state=RUNNABLE:REGION_TRANSITION_QUEUE, hasLock=false; 
> AssignProcedure table=hbase:req_intercept_rule, 
> region=460481706415d776b3742f428a6f579b}, {pid=17, ppid=14, 
> state=RUNNABLE:REGION_TRANSITION_QUEUE, hasLock=false; AssignProcedure 
> table=hbase:namespace, region=ec7a965e7302840120a5d8289947c40b}]
> {code}
> Here I also added a safe fence in balancer, if such regions are found, 
> balancing is skipped for safe.It should do no harm.



--
This message was sent by Atlassian JIRA

[jira] [Created] (HBASE-21288) HostingServer in UnassignProcedure is not accurate

2018-10-10 Thread Allan Yang (JIRA)
Allan Yang created HBASE-21288:
--

 Summary: HostingServer in UnassignProcedure is not accurate
 Key: HBASE-21288
 URL: https://issues.apache.org/jira/browse/HBASE-21288
 Project: HBase
  Issue Type: Sub-task
  Components: amv2, Balancer
Affects Versions: 2.0.2, 2.1.0
Reporter: Allan Yang
Assignee: Allan Yang


We have a case that a region shows status OPEN on a already dead server in meta 
table(it is hard to trace how this happen), meaning this region is actually not 
online. But balance came and scheduled a MoveReionProcedure for this region, 
which created a mess:
The balancer 'thought' this region was on the server which has the same 
address(but with different startcode). So it schedules a MRP from this online 
server to another, but the UnassignProcedure dispatch the unassign call to the 
dead server according to regionstate, which then found the server dead and 
schedulre a SCP for the dead server. But since the UnassignProcedure's 
hostingServer is not accurate, the SCP can't interrupt it.
So, in the end, the SCP can't finish since the UnassignProcedure has the 
region' lock, the UnassignProcedure can finish since no one wake it, thus stuck.

Here is log, notice that the server of the UnassignProcedure is 
'hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539153278584' but it was dispatch 
to 'hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964'

{code}
2018-10-10 14:34:50,011 INFO  [PEWorker-4] 
assignment.RegionTransitionProcedure(252): Dispatch pid=13, ppid=12, 
state=RUNNABLE:REGION_TRANSITION_DISPATCH, hasLock=true; UnassignProcedure 
table=hbase:acl, region=267335c85766c62479fb4a5f18a1e95f, 
server=hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539153278584; rit=CLOSING, 
location=hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964
2018-10-10 14:34:50,011 WARN  [PEWorker-4] 
assignment.RegionTransitionProcedure(230): Remote call failed 
hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964; pid=13, ppid=12, 
state=RUNNABLE:REGION_TRANSITION_DISPATCH, hasLock=true; UnassignProcedure 
table=hbase:acl, region=267335c85766c62479fb4a5f18a1e95f, 
server=hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539153278584; rit=CLOSING, 
location=hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964; 
exception=NoServerDispatchException
org.apache.hadoop.hbase.procedure2.NoServerDispatchException: 
hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964; pid=13, ppid=12, 
state=RUNNABLE:REGION_TRANSITION_DISPATCH, hasLock=true; UnassignProcedure 
table=hbase:acl, region=267335c85766c62479fb4a5f18a1e95f, 
server=hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539153278584

//Then a SCP was scheduled
2018-10-10 14:34:50,012 WARN  [PEWorker-4] master.ServerManager(635): 
Expiration of hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964 but 
server not online
2018-10-10 14:34:50,012 INFO  [PEWorker-4] master.ServerManager(615): 
Processing expiration of hb-uf6oyi699w8h700f0-003.hbase.rds. 
,16020,1539076734964 on hb-uf6oyi699w8h700f0-001.hbase.rds. ,16000,1539088156164
2018-10-10 14:34:50,017 DEBUG [PEWorker-4] procedure2.ProcedureExecutor(1089): 
Stored pid=14, state=RUNNABLE:SERVER_CRASH_START, hasLock=false; 
ServerCrashProcedure server=hb-uf6oyi699w8h700f0-003.hbase.rds. 
,16020,1539076734964, splitWal=true, meta=false

//The SCP did not interrupt the UnassignProcedure but schedule new 
AssignProcedure for this region
2018-10-10 14:34:50,043 DEBUG [PEWorker-6] procedure.ServerCrashProcedure(250): 
Done splitting WALs pid=14, state=RUNNABLE:SERVER_CRASH_SPLIT_LOGS, 
hasLock=true; ServerCrashProcedure server=hb-uf6oyi699w8h700f0-003.hbase.rds. 
,16020,1539076734964, splitWal=true, meta=false
2018-10-10 14:34:50,054 INFO  [PEWorker-8] procedure2.ProcedureExecutor(1691): 
Initialized subprocedures=[{pid=15, ppid=14, 
state=RUNNABLE:REGION_TRANSITION_QUEUE, hasLock=false; AssignProcedure 
table=hbase:acl, region=267335c85766c62479fb4a5f18a1e95f}, {pid=16, ppid=14, 
state=RUNNABLE:REGION_TRANSITION_QUEUE, hasLock=false; AssignProcedure 
table=hbase:req_intercept_rule, region=460481706415d776b3742f428a6f579b}, 
{pid=17, ppid=14, state=RUNNABLE:REGION_TRANSITION_QUEUE, hasLock=false; 
AssignProcedure table=hbase:namespace, region=ec7a965e7302840120a5d8289947c40b}]
{code}


Here I also added a safe fence in balancer, if such regions are found, 
balancing is skipped for safe.It should do no harm.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20952) Re-visit the WAL API

2018-10-10 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645971#comment-16645971
 ] 

Hudson commented on HBASE-20952:


Results for branch HBASE-20952
[build #14 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20952/14/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20952/14//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20952/14//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20952/14//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Re-visit the WAL API
> 
>
> Key: HBASE-20952
> URL: https://issues.apache.org/jira/browse/HBASE-20952
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Josh Elser
>Priority: Major
> Attachments: 20952.v1.txt
>
>
> Take a step back from the current WAL implementations and think about what an 
> HBase WAL API should look like. What are the primitive calls that we require 
> to guarantee durability of writes with a high degree of performance?
> The API needs to take the current implementations into consideration. We 
> should also have a mind for what is happening in the Ratis LogService (but 
> the LogService should not dictate what HBase's WAL API looks like RATIS-272).
> Other "systems" inside of HBase that use WALs are replication and 
> backup Replication has the use-case for "tail"'ing the WAL which we 
> should provide via our new API. B doesn't do anything fancy (IIRC). We 
> should make sure all consumers are generally going to be OK with the API we 
> create.
> The API may be "OK" (or OK in a part). We need to also consider other methods 
> which were "bolted" on such as {{AbstractFSWAL}} and 
> {{WALFileLengthProvider}}. Other corners of "WAL use" (like the 
> {{WALSplitter}} should also be looked at to use WAL-APIs only).
> We also need to make sure that adequate interface audience and stability 
> annotations are chosen.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21278) TestMergeTableRegionsProcedure is flaky

2018-10-10 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645957#comment-16645957
 ] 

stack commented on HBASE-21278:
---

bq. In the current design, when rolling back a procedure, we will rollback the 
sub procedures first. At least for MergeTableRegionsProcedure, this does not 
make sense.

... because? Is it be cause "There is no rollback for TRSP"?

Has the TRSP completed successfully? Can it 'ignore' the rollback request? Or, 
can we not schedule new TRSPs to do the MergeTableRegionsProcedure rollback?

We should do a writeup on rollback, and try and design how it should work?

> TestMergeTableRegionsProcedure is flaky
> ---
>
> Key: HBASE-21278
> URL: https://issues.apache.org/jira/browse/HBASE-21278
> Project: HBase
>  Issue Type: Bug
>Reporter: Duo Zhang
>Priority: Major
> Attachments: 
> org.apache.hadoop.hbase.master.assignment.TestMergeTableRegionsProcedure-output.txt
>
>
> https://builds.apache.org/job/HBase-Flaky-Tests/job/master/1235/artifact/hbase-server/target/surefire-reports/org.apache.hadoop.hbase.master.assignment.TestMergeTableRegionsProcedure-output.txt/*view*/
> I think the problem is
> {noformat}
> 2018-10-08 03:44:30,315 INFO  [PEWorker-1] 
> procedure.MasterProcedureScheduler(689): pid=43, ppid=42, state=SUCCESS, 
> hasLock=false; TransitRegionStateProcedure 
> table=testRollbackAndDoubleExecution, 
> region=9bac7c539ac0cff6dc5706ed375a3bfb, UNASSIGN checking lock on 
> 9bac7c539ac0cff6dc5706ed375a3bfb
> 2018-10-08 03:44:30,320 ERROR [PEWorker-1] helpers.MarkerIgnoringBase(159): 
> CODE-BUG: Uncaught runtime exception for pid=43, ppid=42, state=SUCCESS, 
> hasLock=true; TransitRegionStateProcedure 
> table=testRollbackAndDoubleExecution, 
> region=9bac7c539ac0cff6dc5706ed375a3bfb, UNASSIGN
> java.lang.UnsupportedOperationException
>   at 
> org.apache.hadoop.hbase.master.assignment.TransitRegionStateProcedure.rollbackState(TransitRegionStateProcedure.java:458)
>   at 
> org.apache.hadoop.hbase.master.assignment.TransitRegionStateProcedure.rollbackState(TransitRegionStateProcedure.java:97)
>   at 
> org.apache.hadoop.hbase.procedure2.StateMachineProcedure.rollback(StateMachineProcedure.java:208)
>   at 
> org.apache.hadoop.hbase.procedure2.Procedure.doRollback(Procedure.java:957)
>   at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1605)
>   at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1567)
>   at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1446)
>   at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$900(ProcedureExecutor.java:76)
> {noformat}
> Typically there is no rollback for TRSP. Need to dig more.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21254) Need to find a way to limit the number of proc wal files

2018-10-10 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645953#comment-16645953
 ] 

stack commented on HBASE-21254:
---

bq. stack I've replied on RB, so any other concerns? I think this is an 
important improvement.

I responded. Please make new patch with the nice explanations you made for me 
folded into the patch and commit that. +1. I'll try testing it post commit 
(Does it log loudly enough when it is forcing rewrite of a stuck WAL edit? 
That'd be handy).

Thanks.

> Need to find a way to limit the number of proc wal files
> 
>
> Key: HBASE-21254
> URL: https://issues.apache.org/jira/browse/HBASE-21254
> Project: HBase
>  Issue Type: Sub-task
>  Components: proc-v2
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-21254-v1.patch, HBASE-21254-v2.patch, 
> HBASE-21254.patch
>
>
> For regionserver, we have a max wal file limitation, if we reach the 
> limitation, we will trigger a flush on specific regions so that we can delete 
> old wal files. But for proc wals, we do not have this mechanism, and it will 
> be worse after HBASE-21233, as if there is an old procedure which can not 
> make progress and do not persist its state, we need to keep the old proc wal 
> file for ever...



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21278) TestMergeTableRegionsProcedure is flaky

2018-10-10 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645951#comment-16645951
 ] 

Duo Zhang commented on HBASE-21278:
---

OK I found the problem. When rolling back a procedure, we also need to acquire 
the lock. And for TRSP, we need to wait until meta loaded, but when the 
procedures is woken up by the meta loaded event, we will add it into the 
scheduler and try to execute it, not rollback it...

I think the first thing here is to decide what is the correct behavior. In the 
current design, when rolling back a procedure, we will rollback the sub 
procedures first. At least for MergeTableRegionsProcedure, this does not make 
sense. There is no rollback for TRSP, and also, we will schedule new TRSPs to 
rollback the state when rolling back the MergeTableRegionsProcedure, so we do 
not need to rollback the sub procedures...



> TestMergeTableRegionsProcedure is flaky
> ---
>
> Key: HBASE-21278
> URL: https://issues.apache.org/jira/browse/HBASE-21278
> Project: HBase
>  Issue Type: Bug
>Reporter: Duo Zhang
>Priority: Major
> Attachments: 
> org.apache.hadoop.hbase.master.assignment.TestMergeTableRegionsProcedure-output.txt
>
>
> https://builds.apache.org/job/HBase-Flaky-Tests/job/master/1235/artifact/hbase-server/target/surefire-reports/org.apache.hadoop.hbase.master.assignment.TestMergeTableRegionsProcedure-output.txt/*view*/
> I think the problem is
> {noformat}
> 2018-10-08 03:44:30,315 INFO  [PEWorker-1] 
> procedure.MasterProcedureScheduler(689): pid=43, ppid=42, state=SUCCESS, 
> hasLock=false; TransitRegionStateProcedure 
> table=testRollbackAndDoubleExecution, 
> region=9bac7c539ac0cff6dc5706ed375a3bfb, UNASSIGN checking lock on 
> 9bac7c539ac0cff6dc5706ed375a3bfb
> 2018-10-08 03:44:30,320 ERROR [PEWorker-1] helpers.MarkerIgnoringBase(159): 
> CODE-BUG: Uncaught runtime exception for pid=43, ppid=42, state=SUCCESS, 
> hasLock=true; TransitRegionStateProcedure 
> table=testRollbackAndDoubleExecution, 
> region=9bac7c539ac0cff6dc5706ed375a3bfb, UNASSIGN
> java.lang.UnsupportedOperationException
>   at 
> org.apache.hadoop.hbase.master.assignment.TransitRegionStateProcedure.rollbackState(TransitRegionStateProcedure.java:458)
>   at 
> org.apache.hadoop.hbase.master.assignment.TransitRegionStateProcedure.rollbackState(TransitRegionStateProcedure.java:97)
>   at 
> org.apache.hadoop.hbase.procedure2.StateMachineProcedure.rollback(StateMachineProcedure.java:208)
>   at 
> org.apache.hadoop.hbase.procedure2.Procedure.doRollback(Procedure.java:957)
>   at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1605)
>   at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1567)
>   at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1446)
>   at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$900(ProcedureExecutor.java:76)
> {noformat}
> Typically there is no rollback for TRSP. Need to dig more.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21282) Upgrade Jetty dependencies to latest in major-line

2018-10-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645948#comment-16645948
 ] 

Hadoop QA commented on HBASE-21282:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:orange}-0{color} | {color:orange} test4tests {color} | {color:orange}  
0m  0s{color} | {color:orange} The patch doesn't appear to include any new or 
modified tests. Please justify why no new tests are needed for this patch. Also 
please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} branch-2.0 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
42s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 
54s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
54s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
26s{color} | {color:green} branch-2.0 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
34s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
8m  6s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.5 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
21s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}215m 49s{color} 
| {color:red} root in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
40s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}274m 23s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.replication.TestMasterReplication |
|   | hadoop.hbase.client.TestAsyncTableGetMultiThreaded |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:6f01af0 |
| JIRA Issue | HBASE-21282 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12943327/HBASE-21282.001.branch-2.0.patch
 |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  shadedjars  
hadoopcheck  xml  compile  |
| uname | Linux f8395a7745af 4.4.0-133-generic #159-Ubuntu SMP Fri Aug 10 
07:31:43 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | branch-2.0 / ed80fc5d6c |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| unit | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14640/artifact/patchprocess/patch-unit-root.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14640/testReport/ |
| Max. process+thread count | 4347 (vs. ulimit of 1) |
| modules | C: . U: . |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14640/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Upgrade Jetty dependencies to latest in major-line
> --
>
> Key: HBASE-21282
> URL: 

[jira] [Commented] (HBASE-17229) Backport of purge ThreadLocals

2018-10-10 Thread ramkrishna.s.vasudevan (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-17229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645947#comment-16645947
 ] 

ramkrishna.s.vasudevan commented on HBASE-17229:


OOO for personal reasons. No access to official emails during this period.


> Backport of purge ThreadLocals
> --
>
> Key: HBASE-17229
> URL: https://issues.apache.org/jira/browse/HBASE-17229
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Priority: Critical
> Fix For: 1.2.9
>
>
> Backport HBASE-17072 and HBASE-16146. The former needs to be backported to 
> 1.3 ([~mantonov]) and 1.2 ([~busbey]). The latter is already in 1.3.  Needs 
> to be backported to 1.2.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21013) Backport "read part" of HBASE-18754 to all active 1.x branches

2018-10-10 Thread Sean Busbey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-21013:

Fix Version/s: (was: 1.2.8)
   1.2.9

> Backport "read part" of HBASE-18754 to all active 1.x branches
> --
>
> Key: HBASE-21013
> URL: https://issues.apache.org/jira/browse/HBASE-21013
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Chia-Ping Tsai
>Assignee: Mingdao Yang
>Priority: Critical
> Fix For: 1.5.0, 1.3.3, 1.4.9, 1.2.9
>
>
> The hfiles impacted by HBASE-18754 will have bytes of proto.TimeRangeTracker. 
> It makes all 1.x branches failed to read the hfile since all 1.x branches 
> can't deserialize the proto.TimeRangeTracker.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-17229) Backport of purge ThreadLocals

2018-10-10 Thread Sean Busbey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-17229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-17229:

Fix Version/s: (was: 1.2.8)
   1.2.9

> Backport of purge ThreadLocals
> --
>
> Key: HBASE-17229
> URL: https://issues.apache.org/jira/browse/HBASE-17229
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Priority: Critical
> Fix For: 1.2.9
>
>
> Backport HBASE-17072 and HBASE-16146. The former needs to be backported to 
> 1.3 ([~mantonov]) and 1.2 ([~busbey]). The latter is already in 1.3.  Needs 
> to be backported to 1.2.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20694) Consolidate warning on SecureBulkLoad directory permissions

2018-10-10 Thread Sean Busbey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-20694:

Fix Version/s: (was: 1.2.8)
   1.2.9

> Consolidate warning on SecureBulkLoad directory permissions
> ---
>
> Key: HBASE-20694
> URL: https://issues.apache.org/jira/browse/HBASE-20694
> Project: HBase
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Minor
> Fix For: 3.0.0, 1.5.0, 1.3.3, 2.2.0, 1.4.9, 1.2.9
>
>
> Follow-on from HBASE-20605:
> HBase 1.x has a check which ignores a directory permission check if you're 
> using a specific filesystem which we think doesnt' do security properly.
> HBase 2.x dropped this check.
> Since the security of bulk-loaded data is dependent upon this directory 
> permission (and thus the capabilities of the FileSystem), it would be better 
> to have a consistent warning across branches.
> [~busbey] suggested that we make a WARN message which points admins to our 
> Book (and write such a section if we don't have something sufficient already) 
> and our supported filesystems, coupled with an option to disable that warning 
> message.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21254) Need to find a way to limit the number of proc wal files

2018-10-10 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645906#comment-16645906
 ] 

Duo Zhang commented on HBASE-21254:
---

[~stack] I've replied on RB, so any other concerns? I think this is an 
important improvement.

> Need to find a way to limit the number of proc wal files
> 
>
> Key: HBASE-21254
> URL: https://issues.apache.org/jira/browse/HBASE-21254
> Project: HBase
>  Issue Type: Sub-task
>  Components: proc-v2
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-21254-v1.patch, HBASE-21254-v2.patch, 
> HBASE-21254.patch
>
>
> For regionserver, we have a max wal file limitation, if we reach the 
> limitation, we will trigger a flush on specific regions so that we can delete 
> old wal files. But for proc wals, we do not have this mechanism, and it will 
> be worse after HBASE-21233, as if there is an old procedure which can not 
> make progress and do not persist its state, we need to keep the old proc wal 
> file for ever...



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21259) [amv2] Revived deadservers; recreated serverstatenode

2018-10-10 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21259:
--
Attachment: HBASE-21259.branch-2.1.006.patch

> [amv2] Revived deadservers; recreated serverstatenode
> -
>
> Key: HBASE-21259
> URL: https://issues.apache.org/jira/browse/HBASE-21259
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Affects Versions: 2.1.0
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Fix For: 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-21259.branch-2.1.001.patch, 
> HBASE-21259.branch-2.1.002.patch, HBASE-21259.branch-2.1.003.patch, 
> HBASE-21259.branch-2.1.004.patch, HBASE-21259.branch-2.1.005.patch, 
> HBASE-21259.branch-2.1.006.patch
>
>
> On startup, I see servers being revived; i.e. their serverstatenode is 
> getting marked online even though its just been processed by 
> ServerCrashProcedure. It looks like this (in a patched server that reports on 
> whenever a serverstatenode is created):
> {code}
> 2018-09-29 03:45:40,963 INFO 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Finished pid=3982597, 
> state=SUCCESS; ServerCrashProcedure 
> server=vb1442.halxg.cloudera.com,22101,1536675314426, splitWal=true, 
> meta=false in 1.0130sec
> ...
> 2018-09-29 03:45:43,733 INFO 
> org.apache.hadoop.hbase.master.assignment.RegionStates: CREATING! 
> vb1442.halxg.cloudera.com,22101,1536675314426
> java.lang.RuntimeException: WHERE AM I?
> at 
> org.apache.hadoop.hbase.master.assignment.RegionStates.getOrCreateServer(RegionStates.java:1116)
> at 
> org.apache.hadoop.hbase.master.assignment.RegionStates.addRegionToServer(RegionStates.java:1143)
> at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.markRegionAsClosing(AssignmentManager.java:1464)
> at 
> org.apache.hadoop.hbase.master.assignment.UnassignProcedure.updateTransition(UnassignProcedure.java:200)
> at 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:369)
> at 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:97)
> at 
> org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:953)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1716)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1494)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$900(ProcedureExecutor.java:75)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:2022)
> {code}
> See how we've just finished a SCP which will have removed the 
> serverstatenode... but then we come across an unassign that references the 
> server that was just processed. The unassign will attempt to update the 
> serverstatenode and therein we create one if one not present. We shouldn't be 
> creating one.
> I think I see this a lot because I am scheduling unassigns with hbck2. The 
> servers crash and then come up with SCPs doing cleanup of old server and 
> unassign procedures in the procedure executor queue to be processed still 
>  but could happen at any time on cluster should an unassign happen get 
> scheduled near an SCP.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21259) [amv2] Revived deadservers; recreated serverstatenode

2018-10-10 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645898#comment-16645898
 ] 

stack commented on HBASE-21259:
---

.006 fixes checkstyle and the unit test (I made it so we are more exclusive 
about when we decide to not schedule a SCP...).

> [amv2] Revived deadservers; recreated serverstatenode
> -
>
> Key: HBASE-21259
> URL: https://issues.apache.org/jira/browse/HBASE-21259
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Affects Versions: 2.1.0
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Fix For: 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-21259.branch-2.1.001.patch, 
> HBASE-21259.branch-2.1.002.patch, HBASE-21259.branch-2.1.003.patch, 
> HBASE-21259.branch-2.1.004.patch, HBASE-21259.branch-2.1.005.patch
>
>
> On startup, I see servers being revived; i.e. their serverstatenode is 
> getting marked online even though its just been processed by 
> ServerCrashProcedure. It looks like this (in a patched server that reports on 
> whenever a serverstatenode is created):
> {code}
> 2018-09-29 03:45:40,963 INFO 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Finished pid=3982597, 
> state=SUCCESS; ServerCrashProcedure 
> server=vb1442.halxg.cloudera.com,22101,1536675314426, splitWal=true, 
> meta=false in 1.0130sec
> ...
> 2018-09-29 03:45:43,733 INFO 
> org.apache.hadoop.hbase.master.assignment.RegionStates: CREATING! 
> vb1442.halxg.cloudera.com,22101,1536675314426
> java.lang.RuntimeException: WHERE AM I?
> at 
> org.apache.hadoop.hbase.master.assignment.RegionStates.getOrCreateServer(RegionStates.java:1116)
> at 
> org.apache.hadoop.hbase.master.assignment.RegionStates.addRegionToServer(RegionStates.java:1143)
> at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.markRegionAsClosing(AssignmentManager.java:1464)
> at 
> org.apache.hadoop.hbase.master.assignment.UnassignProcedure.updateTransition(UnassignProcedure.java:200)
> at 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:369)
> at 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:97)
> at 
> org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:953)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1716)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1494)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$900(ProcedureExecutor.java:75)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:2022)
> {code}
> See how we've just finished a SCP which will have removed the 
> serverstatenode... but then we come across an unassign that references the 
> server that was just processed. The unassign will attempt to update the 
> serverstatenode and therein we create one if one not present. We shouldn't be 
> creating one.
> I think I see this a lot because I am scheduling unassigns with hbck2. The 
> servers crash and then come up with SCPs doing cleanup of old server and 
> unassign procedures in the procedure executor queue to be processed still 
>  but could happen at any time on cluster should an unassign happen get 
> scheduled near an SCP.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21246) Introduce WALIdentity interface

2018-10-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645897#comment-16645897
 ] 

Hadoop QA commented on HBASE-21246:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} HBASE-20952 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 
10s{color} | {color:green} HBASE-20952 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  
6s{color} | {color:green} HBASE-20952 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
22s{color} | {color:green} HBASE-20952 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
20s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
1s{color} | {color:green} HBASE-20952 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
39s{color} | {color:green} HBASE-20952 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
13s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 4 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
41s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
11m 38s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}174m 
40s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
44s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}224m 24s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21246 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12943324/21246.HBASE-20952.005.patch
 |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  
shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux ed82d1e6a72b 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | HBASE-20952 / 86cb8e48ad |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14638/artifact/patchprocess/whitespace-eol.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14638/testReport/ |
| Max. process+thread count | 4835 (vs. ulimit of 1) |
| modules | C: hbase-server U: 

[jira] [Commented] (HBASE-21287) JVMClusterUtil Master initialization wait time not configurable

2018-10-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645895#comment-16645895
 ] 

Hadoop QA commented on HBASE-21287:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:orange}-0{color} | {color:orange} test4tests {color} | {color:orange}  
0m  0s{color} | {color:orange} The patch doesn't appear to include any new or 
modified tests. Please justify why no new tests are needed for this patch. Also 
please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
31s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
51s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
13s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
31s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
15s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
34s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
12s{color} | {color:green} hbase-server: The patch generated 0 new + 10 
unchanged - 1 fixed = 10 total (was 11) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
13s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
11m 15s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}178m 
12s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
29s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}222m  3s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21287 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12943325/HBASE-21287.patch |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  
shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 21bd24b19d89 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 
10:45:36 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 72552301ab |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14639/testReport/ |
| Max. process+thread count | 5245 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | 

[jira] [Commented] (HBASE-21259) [amv2] Revived deadservers; recreated serverstatenode

2018-10-10 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645889#comment-16645889
 ] 

stack commented on HBASE-21259:
---

Thanks [~allan163]

> [amv2] Revived deadservers; recreated serverstatenode
> -
>
> Key: HBASE-21259
> URL: https://issues.apache.org/jira/browse/HBASE-21259
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Affects Versions: 2.1.0
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Fix For: 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-21259.branch-2.1.001.patch, 
> HBASE-21259.branch-2.1.002.patch, HBASE-21259.branch-2.1.003.patch, 
> HBASE-21259.branch-2.1.004.patch, HBASE-21259.branch-2.1.005.patch
>
>
> On startup, I see servers being revived; i.e. their serverstatenode is 
> getting marked online even though its just been processed by 
> ServerCrashProcedure. It looks like this (in a patched server that reports on 
> whenever a serverstatenode is created):
> {code}
> 2018-09-29 03:45:40,963 INFO 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Finished pid=3982597, 
> state=SUCCESS; ServerCrashProcedure 
> server=vb1442.halxg.cloudera.com,22101,1536675314426, splitWal=true, 
> meta=false in 1.0130sec
> ...
> 2018-09-29 03:45:43,733 INFO 
> org.apache.hadoop.hbase.master.assignment.RegionStates: CREATING! 
> vb1442.halxg.cloudera.com,22101,1536675314426
> java.lang.RuntimeException: WHERE AM I?
> at 
> org.apache.hadoop.hbase.master.assignment.RegionStates.getOrCreateServer(RegionStates.java:1116)
> at 
> org.apache.hadoop.hbase.master.assignment.RegionStates.addRegionToServer(RegionStates.java:1143)
> at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.markRegionAsClosing(AssignmentManager.java:1464)
> at 
> org.apache.hadoop.hbase.master.assignment.UnassignProcedure.updateTransition(UnassignProcedure.java:200)
> at 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:369)
> at 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:97)
> at 
> org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:953)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1716)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1494)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$900(ProcedureExecutor.java:75)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:2022)
> {code}
> See how we've just finished a SCP which will have removed the 
> serverstatenode... but then we come across an unassign that references the 
> server that was just processed. The unassign will attempt to update the 
> serverstatenode and therein we create one if one not present. We shouldn't be 
> creating one.
> I think I see this a lot because I am scheduling unassigns with hbck2. The 
> servers crash and then come up with SCPs doing cleanup of old server and 
> unassign procedures in the procedure executor queue to be processed still 
>  but could happen at any time on cluster should an unassign happen get 
> scheduled near an SCP.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21259) [amv2] Revived deadservers; recreated serverstatenode

2018-10-10 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645851#comment-16645851
 ] 

Allan Yang commented on HBASE-21259:


+1 for the 005 patch

> [amv2] Revived deadservers; recreated serverstatenode
> -
>
> Key: HBASE-21259
> URL: https://issues.apache.org/jira/browse/HBASE-21259
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Affects Versions: 2.1.0
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Fix For: 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-21259.branch-2.1.001.patch, 
> HBASE-21259.branch-2.1.002.patch, HBASE-21259.branch-2.1.003.patch, 
> HBASE-21259.branch-2.1.004.patch, HBASE-21259.branch-2.1.005.patch
>
>
> On startup, I see servers being revived; i.e. their serverstatenode is 
> getting marked online even though its just been processed by 
> ServerCrashProcedure. It looks like this (in a patched server that reports on 
> whenever a serverstatenode is created):
> {code}
> 2018-09-29 03:45:40,963 INFO 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Finished pid=3982597, 
> state=SUCCESS; ServerCrashProcedure 
> server=vb1442.halxg.cloudera.com,22101,1536675314426, splitWal=true, 
> meta=false in 1.0130sec
> ...
> 2018-09-29 03:45:43,733 INFO 
> org.apache.hadoop.hbase.master.assignment.RegionStates: CREATING! 
> vb1442.halxg.cloudera.com,22101,1536675314426
> java.lang.RuntimeException: WHERE AM I?
> at 
> org.apache.hadoop.hbase.master.assignment.RegionStates.getOrCreateServer(RegionStates.java:1116)
> at 
> org.apache.hadoop.hbase.master.assignment.RegionStates.addRegionToServer(RegionStates.java:1143)
> at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.markRegionAsClosing(AssignmentManager.java:1464)
> at 
> org.apache.hadoop.hbase.master.assignment.UnassignProcedure.updateTransition(UnassignProcedure.java:200)
> at 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:369)
> at 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:97)
> at 
> org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:953)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1716)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1494)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$900(ProcedureExecutor.java:75)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:2022)
> {code}
> See how we've just finished a SCP which will have removed the 
> serverstatenode... but then we come across an unassign that references the 
> server that was just processed. The unassign will attempt to update the 
> serverstatenode and therein we create one if one not present. We shouldn't be 
> creating one.
> I think I see this a lot because I am scheduling unassigns with hbck2. The 
> servers crash and then come up with SCPs doing cleanup of old server and 
> unassign procedures in the procedure executor queue to be processed still 
>  but could happen at any time on cluster should an unassign happen get 
> scheduled near an SCP.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21259) [amv2] Revived deadservers; recreated serverstatenode

2018-10-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645843#comment-16645843
 ] 

Hadoop QA commented on HBASE-21259:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-2.1 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
27s{color} | {color:green} branch-2.1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
54s{color} | {color:green} branch-2.1 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
16s{color} | {color:green} branch-2.1 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
48s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
1s{color} | {color:green} branch-2.1 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} branch-2.1 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
43s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
14s{color} | {color:red} hbase-server: The patch generated 9 new + 113 
unchanged - 1 fixed = 122 total (was 114) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
36s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
8m 28s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 
or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 21m  2s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 57m 40s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.master.TestMasterWALManager |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:42ca976 |
| JIRA Issue | HBASE-21259 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12943331/HBASE-21259.branch-2.1.005.patch
 |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  
shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 98ca7bfa7c80 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | branch-2.1 / e283963533 |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14641/artifact/patchprocess/diff-checkstyle-hbase-server.txt
 |
| unit | 

[jira] [Commented] (HBASE-21285) Enhanced TableSnapshotInputFormat to allow a size based splitting

2018-10-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645817#comment-16645817
 ] 

Hadoop QA commented on HBASE-21285:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
12s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} branch-1.4 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
35s{color} | {color:green} branch-1.4 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
33s{color} | {color:green} branch-1.4 passed with JDK v1.8.0_181 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
37s{color} | {color:green} branch-1.4 passed with JDK v1.7.0_191 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
18s{color} | {color:green} branch-1.4 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  2m 
36s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} branch-1.4 passed with JDK v1.8.0_181 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
34s{color} | {color:green} branch-1.4 passed with JDK v1.7.0_191 {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed with JDK v1.8.0_181 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed with JDK v1.7.0_191 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
17s{color} | {color:red} hbase-server: The patch generated 7 new + 19 unchanged 
- 1 fixed = 26 total (was 20) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  2m 
39s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
6m 27s{color} | {color:green} Patch does not cause any errors with Hadoop 2.4.1 
2.5.2 2.6.5 2.7.4. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed with JDK v1.8.0_181 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed with JDK v1.7.0_191 {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}113m 45s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
26s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}136m 54s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.mapreduce.TestTableSnapshotInputFormat |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:2cf636a |
| JIRA Issue | HBASE-21285 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12943321/HBASE-21285.branch-1.4.002.patch
 |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  
shadedjars  hadoopcheck  hbaseanti  

[jira] [Commented] (HBASE-21282) Upgrade Jetty dependencies to latest in major-line

2018-10-10 Thread Mike Drob (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645755#comment-16645755
 ] 

Mike Drob commented on HBASE-21282:
---

Oh, right, they're ancient so this compat issue won't matter there. LGTM.

> Upgrade Jetty dependencies to latest in major-line
> --
>
> Key: HBASE-21282
> URL: https://issues.apache.org/jira/browse/HBASE-21282
> Project: HBase
>  Issue Type: Task
>  Components: dependencies
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.0.3, 2.1.2
>
> Attachments: HBASE-21282.001.branch-2.0.patch
>
>
> Looks like we have dependencies on both jetty 9.2 and 9.3, but we're lagging 
> pretty far behind in both. We can upgrade both of these to the latest (august 
> 2018).
>  
> I'll also have to take a look at why we're using two separate versions (maybe 
> we didn't want to switch from jetty-jsp to apache-jsp on 9.2->9.3?). Not sure 
> if there's a good reason for this.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21281) Update bouncycastle dependency.

2018-10-10 Thread Mike Drob (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645757#comment-16645757
 ] 

Mike Drob commented on HBASE-21281:
---

+1

> Update bouncycastle dependency.
> ---
>
> Key: HBASE-21281
> URL: https://issues.apache.org/jira/browse/HBASE-21281
> Project: HBase
>  Issue Type: Task
>  Components: dependencies, test
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.0.3, 2.1.2
>
> Attachments: HBASE-21281.001.branch-2.0.patch
>
>
> Looks like we still depend on bcprov-jdk16 for some x509 certificate 
> generation in our tests. Bouncycastle has moved beyond this in 1.47, changing 
> the artifact names.
> [http://www.bouncycastle.org/wiki/display/JA1/Porting+from+earlier+BC+releases+to+1.47+and+later]
> There are some API changes too, but it looks like we don't use any of these.
> It seems like we also have vestiges in the POMs from when we were depending 
> on a specific BC version that came in from Hadoop. We now have a 
> KeyStoreTestUtil class in HBase, which makes me think we can also clean up 
> some dependencies.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21281) Update bouncycastle dependency.

2018-10-10 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645747#comment-16645747
 ] 

Ted Yu commented on HBASE-21281:


lgtm

> Update bouncycastle dependency.
> ---
>
> Key: HBASE-21281
> URL: https://issues.apache.org/jira/browse/HBASE-21281
> Project: HBase
>  Issue Type: Task
>  Components: dependencies, test
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.0.3, 2.1.2
>
> Attachments: HBASE-21281.001.branch-2.0.patch
>
>
> Looks like we still depend on bcprov-jdk16 for some x509 certificate 
> generation in our tests. Bouncycastle has moved beyond this in 1.47, changing 
> the artifact names.
> [http://www.bouncycastle.org/wiki/display/JA1/Porting+from+earlier+BC+releases+to+1.47+and+later]
> There are some API changes too, but it looks like we don't use any of these.
> It seems like we also have vestiges in the POMs from when we were depending 
> on a specific BC version that came in from Hadoop. We now have a 
> KeyStoreTestUtil class in HBase, which makes me think we can also clean up 
> some dependencies.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21259) [amv2] Revived deadservers; recreated serverstatenode

2018-10-10 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21259:
--
Attachment: HBASE-21259.branch-2.1.005.patch

> [amv2] Revived deadservers; recreated serverstatenode
> -
>
> Key: HBASE-21259
> URL: https://issues.apache.org/jira/browse/HBASE-21259
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Affects Versions: 2.1.0
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Fix For: 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-21259.branch-2.1.001.patch, 
> HBASE-21259.branch-2.1.002.patch, HBASE-21259.branch-2.1.003.patch, 
> HBASE-21259.branch-2.1.004.patch, HBASE-21259.branch-2.1.005.patch
>
>
> On startup, I see servers being revived; i.e. their serverstatenode is 
> getting marked online even though its just been processed by 
> ServerCrashProcedure. It looks like this (in a patched server that reports on 
> whenever a serverstatenode is created):
> {code}
> 2018-09-29 03:45:40,963 INFO 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Finished pid=3982597, 
> state=SUCCESS; ServerCrashProcedure 
> server=vb1442.halxg.cloudera.com,22101,1536675314426, splitWal=true, 
> meta=false in 1.0130sec
> ...
> 2018-09-29 03:45:43,733 INFO 
> org.apache.hadoop.hbase.master.assignment.RegionStates: CREATING! 
> vb1442.halxg.cloudera.com,22101,1536675314426
> java.lang.RuntimeException: WHERE AM I?
> at 
> org.apache.hadoop.hbase.master.assignment.RegionStates.getOrCreateServer(RegionStates.java:1116)
> at 
> org.apache.hadoop.hbase.master.assignment.RegionStates.addRegionToServer(RegionStates.java:1143)
> at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.markRegionAsClosing(AssignmentManager.java:1464)
> at 
> org.apache.hadoop.hbase.master.assignment.UnassignProcedure.updateTransition(UnassignProcedure.java:200)
> at 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:369)
> at 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:97)
> at 
> org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:953)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1716)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1494)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$900(ProcedureExecutor.java:75)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:2022)
> {code}
> See how we've just finished a SCP which will have removed the 
> serverstatenode... but then we come across an unassign that references the 
> server that was just processed. The unassign will attempt to update the 
> serverstatenode and therein we create one if one not present. We shouldn't be 
> creating one.
> I think I see this a lot because I am scheduling unassigns with hbck2. The 
> servers crash and then come up with SCPs doing cleanup of old server and 
> unassign procedures in the procedure executor queue to be processed still 
>  but could happen at any time on cluster should an unassign happen get 
> scheduled near an SCP.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21287) JVMClusterUtil Master initialization wait time not configurable

2018-10-10 Thread Mike Drob (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645733#comment-16645733
 ] 

Mike Drob commented on HBASE-21287:
---

Yea, lots of follow on here...

Make it take env variables or easier to expose on tests/command line. Use 
Futures and proper timeout mechanisms. Will commit after QA comes back, thanks 
for review Stack!

> JVMClusterUtil Master initialization wait time not configurable
> ---
>
> Key: HBASE-21287
> URL: https://issues.apache.org/jira/browse/HBASE-21287
> Project: HBase
>  Issue Type: Task
>  Components: test
>Reporter: Mike Drob
>Assignee: Mike Drob
>Priority: Major
> Attachments: HBASE-21287.patch
>
>
> We can configure how long the local cluster threads will wait for master to 
> come up and become active, but not how long we allow initialization to take. 
> Being able to tune this would improve my test loop on some experiment I am 
> running.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21246) Introduce WALIdentity interface

2018-10-10 Thread Josh Elser (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser updated HBASE-21246:
---
Fix Version/s: HBASE-20952

> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: HBASE-20952
>
> Attachments: 21246.003.patch, 21246.HBASE-20952.001.patch, 
> 21246.HBASE-20952.002.patch, 21246.HBASE-20952.004.patch, 
> 21246.HBASE-20952.005.patch
>
>
> We are introducing WALIdentity interface so that the WAL representation can 
> be decoupled from distributed filesystem.
> The interface provides getName method whose return value can represent 
> filename in distributed filesystem environment or, the name of the stream 
> when the WAL is backed by log stream.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21282) Upgrade Jetty dependencies to latest in major-line

2018-10-10 Thread Josh Elser (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645729#comment-16645729
 ] 

Josh Elser commented on HBASE-21282:


{quote}You tested both Hadoop 2 and Hadoop 3? If so, I'm +1
{quote}
Didn't explicitly recheck Hadoop 2 (saw that was still on mortbay jetty 6.1 
:eyeroll: ). I can do that though. I assume 2.7.7 is best (what we have in the 
pom)?

> Upgrade Jetty dependencies to latest in major-line
> --
>
> Key: HBASE-21282
> URL: https://issues.apache.org/jira/browse/HBASE-21282
> Project: HBase
>  Issue Type: Task
>  Components: dependencies
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.0.3, 2.1.2
>
> Attachments: HBASE-21282.001.branch-2.0.patch
>
>
> Looks like we have dependencies on both jetty 9.2 and 9.3, but we're lagging 
> pretty far behind in both. We can upgrade both of these to the latest (august 
> 2018).
>  
> I'll also have to take a look at why we're using two separate versions (maybe 
> we didn't want to switch from jetty-jsp to apache-jsp on 9.2->9.3?). Not sure 
> if there's a good reason for this.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21287) JVMClusterUtil Master initialization wait time not configurable

2018-10-10 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645727#comment-16645727
 ] 

stack commented on HBASE-21287:
---

Seems good. +1.

Yeah, we do this in loads of places. Could become more generic and used 
everywhere... Follow-on.

> JVMClusterUtil Master initialization wait time not configurable
> ---
>
> Key: HBASE-21287
> URL: https://issues.apache.org/jira/browse/HBASE-21287
> Project: HBase
>  Issue Type: Task
>  Components: test
>Reporter: Mike Drob
>Assignee: Mike Drob
>Priority: Major
> Attachments: HBASE-21287.patch
>
>
> We can configure how long the local cluster threads will wait for master to 
> come up and become active, but not how long we allow initialization to take. 
> Being able to tune this would improve my test loop on some experiment I am 
> running.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21282) Upgrade Jetty dependencies to latest in major-line

2018-10-10 Thread Mike Drob (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645722#comment-16645722
 ] 

Mike Drob commented on HBASE-21282:
---

You tested both Hadoop 2 and Hadoop 3? If so, I'm +1

> Upgrade Jetty dependencies to latest in major-line
> --
>
> Key: HBASE-21282
> URL: https://issues.apache.org/jira/browse/HBASE-21282
> Project: HBase
>  Issue Type: Task
>  Components: dependencies
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.0.3, 2.1.2
>
> Attachments: HBASE-21282.001.branch-2.0.patch
>
>
> Looks like we have dependencies on both jetty 9.2 and 9.3, but we're lagging 
> pretty far behind in both. We can upgrade both of these to the latest (august 
> 2018).
>  
> I'll also have to take a look at why we're using two separate versions (maybe 
> we didn't want to switch from jetty-jsp to apache-jsp on 9.2->9.3?). Not sure 
> if there's a good reason for this.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21246) Introduce WALIdentity interface

2018-10-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645719#comment-16645719
 ] 

Hadoop QA commented on HBASE-21246:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
20s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} HBASE-20952 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
58s{color} | {color:green} HBASE-20952 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
53s{color} | {color:green} HBASE-20952 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
10s{color} | {color:green} HBASE-20952 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
11s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
9s{color} | {color:green} HBASE-20952 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
40s{color} | {color:green} HBASE-20952 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 3s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 5 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
 2s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
9m 47s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 
or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}195m 
59s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
30s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}239m 18s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21246 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12943290/21246.HBASE-20952.004.patch
 |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  
shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux df9902c4dc22 4.4.0-133-generic #159-Ubuntu SMP Fri Aug 10 
07:31:43 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | HBASE-20952 / 86cb8e48ad |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14634/artifact/patchprocess/whitespace-eol.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14634/testReport/ |
| Max. process+thread count | 4771 (vs. ulimit of 1) |
| modules | C: hbase-server U: 

[jira] [Commented] (HBASE-21282) Upgrade Jetty dependencies to latest in major-line

2018-10-10 Thread Josh Elser (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645711#comment-16645711
 ] 

Josh Elser commented on HBASE-21282:


Oh, and in case it helps to explicitly state it: there are known issues filed 
against these Jetty versions which static analysis tools have/will catch. I 
expected them to follow suit (eventually).

> Upgrade Jetty dependencies to latest in major-line
> --
>
> Key: HBASE-21282
> URL: https://issues.apache.org/jira/browse/HBASE-21282
> Project: HBase
>  Issue Type: Task
>  Components: dependencies
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.0.3, 2.1.2
>
> Attachments: HBASE-21282.001.branch-2.0.patch
>
>
> Looks like we have dependencies on both jetty 9.2 and 9.3, but we're lagging 
> pretty far behind in both. We can upgrade both of these to the latest (august 
> 2018).
>  
> I'll also have to take a look at why we're using two separate versions (maybe 
> we didn't want to switch from jetty-jsp to apache-jsp on 9.2->9.3?). Not sure 
> if there's a good reason for this.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21282) Upgrade Jetty dependencies to latest in major-line

2018-10-10 Thread Josh Elser (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser updated HBASE-21282:
---
Status: Patch Available  (was: Open)

.001 bumps up to 9.2 and 9.3 latest versions of Jetty for our build. Not 
noticing any problems with Hadoop 3.1.1 (which has the same version). WebUIs 
are up, MapReduce jobs work, and the REST server works.

You have other kinds of testing in mind, [~mdrob]?

> Upgrade Jetty dependencies to latest in major-line
> --
>
> Key: HBASE-21282
> URL: https://issues.apache.org/jira/browse/HBASE-21282
> Project: HBase
>  Issue Type: Task
>  Components: dependencies
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.0.3, 2.1.2
>
> Attachments: HBASE-21282.001.branch-2.0.patch
>
>
> Looks like we have dependencies on both jetty 9.2 and 9.3, but we're lagging 
> pretty far behind in both. We can upgrade both of these to the latest (august 
> 2018).
>  
> I'll also have to take a look at why we're using two separate versions (maybe 
> we didn't want to switch from jetty-jsp to apache-jsp on 9.2->9.3?). Not sure 
> if there's a good reason for this.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21282) Upgrade Jetty dependencies to latest in major-line

2018-10-10 Thread Josh Elser (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser updated HBASE-21282:
---
Attachment: HBASE-21282.001.branch-2.0.patch

> Upgrade Jetty dependencies to latest in major-line
> --
>
> Key: HBASE-21282
> URL: https://issues.apache.org/jira/browse/HBASE-21282
> Project: HBase
>  Issue Type: Task
>  Components: dependencies
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.0.3, 2.1.2
>
> Attachments: HBASE-21282.001.branch-2.0.patch
>
>
> Looks like we have dependencies on both jetty 9.2 and 9.3, but we're lagging 
> pretty far behind in both. We can upgrade both of these to the latest (august 
> 2018).
>  
> I'll also have to take a look at why we're using two separate versions (maybe 
> we didn't want to switch from jetty-jsp to apache-jsp on 9.2->9.3?). Not sure 
> if there's a good reason for this.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21287) JVMClusterUtil Master initialization wait time not configurable

2018-10-10 Thread Mike Drob (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-21287:
--
Status: Patch Available  (was: Open)

> JVMClusterUtil Master initialization wait time not configurable
> ---
>
> Key: HBASE-21287
> URL: https://issues.apache.org/jira/browse/HBASE-21287
> Project: HBase
>  Issue Type: Task
>  Components: test
>Reporter: Mike Drob
>Assignee: Mike Drob
>Priority: Major
> Attachments: HBASE-21287.patch
>
>
> We can configure how long the local cluster threads will wait for master to 
> come up and become active, but not how long we allow initialization to take. 
> Being able to tune this would improve my test loop on some experiment I am 
> running.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21287) JVMClusterUtil Master initialization wait time not configurable

2018-10-10 Thread Mike Drob (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-21287:
--
Attachment: HBASE-21287.patch

> JVMClusterUtil Master initialization wait time not configurable
> ---
>
> Key: HBASE-21287
> URL: https://issues.apache.org/jira/browse/HBASE-21287
> Project: HBase
>  Issue Type: Task
>  Components: test
>Reporter: Mike Drob
>Assignee: Mike Drob
>Priority: Major
> Attachments: HBASE-21287.patch
>
>
> We can configure how long the local cluster threads will wait for master to 
> come up and become active, but not how long we allow initialization to take. 
> Being able to tune this would improve my test loop on some experiment I am 
> running.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-21287) JVMClusterUtil Master initialization wait time not configurable

2018-10-10 Thread Mike Drob (JIRA)
Mike Drob created HBASE-21287:
-

 Summary: JVMClusterUtil Master initialization wait time not 
configurable
 Key: HBASE-21287
 URL: https://issues.apache.org/jira/browse/HBASE-21287
 Project: HBase
  Issue Type: Task
  Components: test
Reporter: Mike Drob
Assignee: Mike Drob


We can configure how long the local cluster threads will wait for master to 
come up and become active, but not how long we allow initialization to take. 
Being able to tune this would improve my test loop on some experiment I am 
running.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21246) Introduce WALIdentity interface

2018-10-10 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21246:
---
Attachment: 21246.HBASE-20952.005.patch

> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Attachments: 21246.003.patch, 21246.HBASE-20952.001.patch, 
> 21246.HBASE-20952.002.patch, 21246.HBASE-20952.004.patch, 
> 21246.HBASE-20952.005.patch
>
>
> We are introducing WALIdentity interface so that the WAL representation can 
> be decoupled from distributed filesystem.
> The interface provides getName method whose return value can represent 
> filename in distributed filesystem environment or, the name of the stream 
> when the WAL is backed by log stream.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21259) [amv2] Revived deadservers; recreated serverstatenode

2018-10-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645701#comment-16645701
 ] 

Hadoop QA commented on HBASE-21259:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  5s{color} 
| {color:red} HBASE-21259 does not apply to branch-2.1. Rebase required? Wrong 
Branch? See https://yetus.apache.org/documentation/0.8.0/precommit-patchnames 
for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HBASE-21259 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12943323/HBASE-21259.branch-2.1.004.patch
 |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14637/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> [amv2] Revived deadservers; recreated serverstatenode
> -
>
> Key: HBASE-21259
> URL: https://issues.apache.org/jira/browse/HBASE-21259
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Affects Versions: 2.1.0
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Fix For: 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-21259.branch-2.1.001.patch, 
> HBASE-21259.branch-2.1.002.patch, HBASE-21259.branch-2.1.003.patch, 
> HBASE-21259.branch-2.1.004.patch
>
>
> On startup, I see servers being revived; i.e. their serverstatenode is 
> getting marked online even though its just been processed by 
> ServerCrashProcedure. It looks like this (in a patched server that reports on 
> whenever a serverstatenode is created):
> {code}
> 2018-09-29 03:45:40,963 INFO 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Finished pid=3982597, 
> state=SUCCESS; ServerCrashProcedure 
> server=vb1442.halxg.cloudera.com,22101,1536675314426, splitWal=true, 
> meta=false in 1.0130sec
> ...
> 2018-09-29 03:45:43,733 INFO 
> org.apache.hadoop.hbase.master.assignment.RegionStates: CREATING! 
> vb1442.halxg.cloudera.com,22101,1536675314426
> java.lang.RuntimeException: WHERE AM I?
> at 
> org.apache.hadoop.hbase.master.assignment.RegionStates.getOrCreateServer(RegionStates.java:1116)
> at 
> org.apache.hadoop.hbase.master.assignment.RegionStates.addRegionToServer(RegionStates.java:1143)
> at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.markRegionAsClosing(AssignmentManager.java:1464)
> at 
> org.apache.hadoop.hbase.master.assignment.UnassignProcedure.updateTransition(UnassignProcedure.java:200)
> at 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:369)
> at 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:97)
> at 
> org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:953)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1716)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1494)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$900(ProcedureExecutor.java:75)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:2022)
> {code}
> See how we've just finished a SCP which will have removed the 
> serverstatenode... but then we come across an unassign that references the 
> server that was just processed. The unassign will attempt to update the 
> serverstatenode and therein we create one if one not present. We shouldn't be 
> creating one.
> I think I see this a lot because I am scheduling unassigns with hbck2. The 
> servers crash and then come up with SCPs doing cleanup of old server and 
> unassign procedures in the procedure executor queue to be processed still 
>  but could happen at any time on cluster should an unassign happen get 
> scheduled near an SCP.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21259) [amv2] Revived deadservers; recreated serverstatenode

2018-10-10 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645700#comment-16645700
 ] 

stack commented on HBASE-21259:
---

.004 Fix checkstyle, findbugs, and unit tests. Remove the DEBUG I'd left in 
place.

> [amv2] Revived deadservers; recreated serverstatenode
> -
>
> Key: HBASE-21259
> URL: https://issues.apache.org/jira/browse/HBASE-21259
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Affects Versions: 2.1.0
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Fix For: 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-21259.branch-2.1.001.patch, 
> HBASE-21259.branch-2.1.002.patch, HBASE-21259.branch-2.1.003.patch, 
> HBASE-21259.branch-2.1.004.patch
>
>
> On startup, I see servers being revived; i.e. their serverstatenode is 
> getting marked online even though its just been processed by 
> ServerCrashProcedure. It looks like this (in a patched server that reports on 
> whenever a serverstatenode is created):
> {code}
> 2018-09-29 03:45:40,963 INFO 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Finished pid=3982597, 
> state=SUCCESS; ServerCrashProcedure 
> server=vb1442.halxg.cloudera.com,22101,1536675314426, splitWal=true, 
> meta=false in 1.0130sec
> ...
> 2018-09-29 03:45:43,733 INFO 
> org.apache.hadoop.hbase.master.assignment.RegionStates: CREATING! 
> vb1442.halxg.cloudera.com,22101,1536675314426
> java.lang.RuntimeException: WHERE AM I?
> at 
> org.apache.hadoop.hbase.master.assignment.RegionStates.getOrCreateServer(RegionStates.java:1116)
> at 
> org.apache.hadoop.hbase.master.assignment.RegionStates.addRegionToServer(RegionStates.java:1143)
> at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.markRegionAsClosing(AssignmentManager.java:1464)
> at 
> org.apache.hadoop.hbase.master.assignment.UnassignProcedure.updateTransition(UnassignProcedure.java:200)
> at 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:369)
> at 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:97)
> at 
> org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:953)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1716)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1494)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$900(ProcedureExecutor.java:75)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:2022)
> {code}
> See how we've just finished a SCP which will have removed the 
> serverstatenode... but then we come across an unassign that references the 
> server that was just processed. The unassign will attempt to update the 
> serverstatenode and therein we create one if one not present. We shouldn't be 
> creating one.
> I think I see this a lot because I am scheduling unassigns with hbck2. The 
> servers crash and then come up with SCPs doing cleanup of old server and 
> unassign procedures in the procedure executor queue to be processed still 
>  but could happen at any time on cluster should an unassign happen get 
> scheduled near an SCP.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21259) [amv2] Revived deadservers; recreated serverstatenode

2018-10-10 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21259:
--
Attachment: HBASE-21259.branch-2.1.004.patch

> [amv2] Revived deadservers; recreated serverstatenode
> -
>
> Key: HBASE-21259
> URL: https://issues.apache.org/jira/browse/HBASE-21259
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Affects Versions: 2.1.0
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Fix For: 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-21259.branch-2.1.001.patch, 
> HBASE-21259.branch-2.1.002.patch, HBASE-21259.branch-2.1.003.patch, 
> HBASE-21259.branch-2.1.004.patch
>
>
> On startup, I see servers being revived; i.e. their serverstatenode is 
> getting marked online even though its just been processed by 
> ServerCrashProcedure. It looks like this (in a patched server that reports on 
> whenever a serverstatenode is created):
> {code}
> 2018-09-29 03:45:40,963 INFO 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Finished pid=3982597, 
> state=SUCCESS; ServerCrashProcedure 
> server=vb1442.halxg.cloudera.com,22101,1536675314426, splitWal=true, 
> meta=false in 1.0130sec
> ...
> 2018-09-29 03:45:43,733 INFO 
> org.apache.hadoop.hbase.master.assignment.RegionStates: CREATING! 
> vb1442.halxg.cloudera.com,22101,1536675314426
> java.lang.RuntimeException: WHERE AM I?
> at 
> org.apache.hadoop.hbase.master.assignment.RegionStates.getOrCreateServer(RegionStates.java:1116)
> at 
> org.apache.hadoop.hbase.master.assignment.RegionStates.addRegionToServer(RegionStates.java:1143)
> at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.markRegionAsClosing(AssignmentManager.java:1464)
> at 
> org.apache.hadoop.hbase.master.assignment.UnassignProcedure.updateTransition(UnassignProcedure.java:200)
> at 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:369)
> at 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:97)
> at 
> org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:953)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1716)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1494)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$900(ProcedureExecutor.java:75)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:2022)
> {code}
> See how we've just finished a SCP which will have removed the 
> serverstatenode... but then we come across an unassign that references the 
> server that was just processed. The unassign will attempt to update the 
> serverstatenode and therein we create one if one not present. We shouldn't be 
> creating one.
> I think I see this a lot because I am scheduling unassigns with hbck2. The 
> servers crash and then come up with SCPs doing cleanup of old server and 
> unassign procedures in the procedure executor queue to be processed still 
>  but could happen at any time on cluster should an unassign happen get 
> scheduled near an SCP.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21283) Add new shell command 'rit' for listing regions in transition

2018-10-10 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645695#comment-16645695
 ] 

Hudson commented on HBASE-21283:


Results for branch branch-2
[build #1369 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1369/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1369//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1369//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1369//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Add new shell command 'rit' for listing regions in transition
> -
>
> Key: HBASE-21283
> URL: https://issues.apache.org/jira/browse/HBASE-21283
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Minor
> Fix For: 3.0.0, 1.5.0, 2.2.0
>
> Attachments: HBASE-21283-branch-1.patch, HBASE-21283-branch-1.patch, 
> HBASE-21283-branch-1.patch, HBASE-21283.patch, HBASE-21283.patch, 
> HBASE-21283.patch
>
>
> The 'status' shell command shows regions in transition but sometimes an 
> operator may want to retrieve a simple list of regions in transition. Here's 
> a patch that adds a new 'rit' command to the TOOLS group that does just that. 
> No test, because it seems hard to mock RITs from the ruby test code, but I 
> have run TestShell and it passes, so the command is verified to meet minimum 
> requirements, like help text, and manually verified with branch-1 (shell in 
> branch-2 and up doesn't return until TransitRegionProcedure has completed so 
> by that time no RIT):
> {noformat}
> HBase Shell
> Use "help" to get list of supported commands.
> Use "exit" to quit this interactive shell.
> Version 1.5.0-SNAPSHOT, r9bb6d2fa8b760f16cd046657240ebd4ad91cb6de, Mon Oct  8 
> 21:05:50 UTC 2018
> hbase(main):001:0> help 'rit'
> List all regions in transition.
> Examples:
>   hbase> rit
> hbase(main):002:0> create ...
> 0 row(s) in 2.5150 seconds
> => Hbase::Table - IntegrationTestBigLinkedList
> hbase(main):003:0> rit
> 0 row(s) in 0.0340 seconds
> hbase(main):004:0> unassign '56f0c38c81ae453d19906ce156a2d6a1'
> 0 row(s) in 0.0540 seconds
> hbase(main):005:0> rit 
> IntegrationTestBigLinkedList,L\xCC\xCC\xCC\xCC\xCC\xCC\xCB,1539117183224.56f0c38c81ae453d19906ce156a2d6a1.
>  state=PENDING_CLOSE, ts=Tue Oct 09 20:33:34 UTC 2018 (0s ago), server=null   
>   
>   
> 
> 1 row(s) in 0.0170 seconds
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21285) Enhanced TableSnapshotInputFormat to allow a size based splitting

2018-10-10 Thread Lavinia-Stefania Sirbu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645694#comment-16645694
 ] 

Lavinia-Stefania Sirbu commented on HBASE-21285:


Sure [~yuzhih...@gmail.com], the result of my experiments are:
* Number of regions = 20 (~200 hfiles) => getRegionSizes ~25ms
* Number of regions = 500 (~9000 hfiles) => getRegionSizes ~100ms
* Number of regions = 23000 regions (~216000 hfiles) => getRegionSizes ~3.5s

> Enhanced TableSnapshotInputFormat to allow a size based splitting
> -
>
> Key: HBASE-21285
> URL: https://issues.apache.org/jira/browse/HBASE-21285
> Project: HBase
>  Issue Type: Improvement
>  Components: snapshots
>Affects Versions: 1.4.0
>Reporter: Lavinia-Stefania Sirbu
>Priority: Minor
> Attachments: HBASE-21285.branch-1.4.001.patch, 
> HBASE-21285.branch-1.4.002.patch
>
>
> Currently, all the splits generated by a snapshot are having length 0. Right 
> now, we have a configuration for the number of splits per region, but it's a 
> general one and not very helpful when the sizes for regions are really 
> different. The modification must be done in TableSnapshotInputFormatImpl 
> where the length must be computed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21285) Enhanced TableSnapshotInputFormat to allow a size based splitting

2018-10-10 Thread Lavinia-Stefania Sirbu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lavinia-Stefania Sirbu updated HBASE-21285:
---
Status: Open  (was: Patch Available)

> Enhanced TableSnapshotInputFormat to allow a size based splitting
> -
>
> Key: HBASE-21285
> URL: https://issues.apache.org/jira/browse/HBASE-21285
> Project: HBase
>  Issue Type: Improvement
>  Components: snapshots
>Affects Versions: 1.4.0
>Reporter: Lavinia-Stefania Sirbu
>Priority: Minor
> Attachments: HBASE-21285.branch-1.4.001.patch, 
> HBASE-21285.branch-1.4.002.patch
>
>
> Currently, all the splits generated by a snapshot are having length 0. Right 
> now, we have a configuration for the number of splits per region, but it's a 
> general one and not very helpful when the sizes for regions are really 
> different. The modification must be done in TableSnapshotInputFormatImpl 
> where the length must be computed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21285) Enhanced TableSnapshotInputFormat to allow a size based splitting

2018-10-10 Thread Lavinia-Stefania Sirbu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lavinia-Stefania Sirbu updated HBASE-21285:
---
Attachment: HBASE-21285.branch-1.4.002.patch
Status: Patch Available  (was: Open)

> Enhanced TableSnapshotInputFormat to allow a size based splitting
> -
>
> Key: HBASE-21285
> URL: https://issues.apache.org/jira/browse/HBASE-21285
> Project: HBase
>  Issue Type: Improvement
>  Components: snapshots
>Affects Versions: 1.4.0
>Reporter: Lavinia-Stefania Sirbu
>Priority: Minor
> Attachments: HBASE-21285.branch-1.4.001.patch, 
> HBASE-21285.branch-1.4.002.patch
>
>
> Currently, all the splits generated by a snapshot are having length 0. Right 
> now, we have a configuration for the number of splits per region, but it's a 
> general one and not very helpful when the sizes for regions are really 
> different. The modification must be done in TableSnapshotInputFormatImpl 
> where the length must be computed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21246) Introduce WALIdentity interface

2018-10-10 Thread Josh Elser (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645687#comment-16645687
 ] 

Josh Elser commented on HBASE-21246:


{code:java}
+   * For the FS based path, it will be just a filename of whole path
+   * For stream based, it will be name of the stream{code}
I dont' see these adding value. Doesn't belong in the interface (can be a 
comment on implementation if necessary).
{quote}This would be used in the (Ankit's) upcoming patch for replication.
{quote}
Introduce it then, please.
{quote}The start time of a WAL is known at the creation of the WAL. This 
attribute would not change even after edits are appended to the WAL.
 Therefore I think it belongs to the identity.
{quote}
My take is that for most cases, we would be querying the backend system to 
obtain this information (as opposed to having it encoded in the name like we do 
for HDFS). If it were always something we encode in the names, I could see 
putting it into the WALIdentity, but I think HDFS is the exception, not the 
rule.

> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Attachments: 21246.003.patch, 21246.HBASE-20952.001.patch, 
> 21246.HBASE-20952.002.patch, 21246.HBASE-20952.004.patch
>
>
> We are introducing WALIdentity interface so that the WAL representation can 
> be decoupled from distributed filesystem.
> The interface provides getName method whose return value can represent 
> filename in distributed filesystem environment or, the name of the stream 
> when the WAL is backed by log stream.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list

2018-10-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645591#comment-16645591
 ] 

Hadoop QA commented on HBASE-21266:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
2s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-1 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
14s{color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} branch-1 passed with JDK v1.8.0_181 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
41s{color} | {color:green} branch-1 passed with JDK v1.7.0_191 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
30s{color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
 4s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
37s{color} | {color:green} branch-1 passed with JDK v1.8.0_181 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
38s{color} | {color:green} branch-1 passed with JDK v1.7.0_191 {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed with JDK v1.8.0_181 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed with JDK v1.7.0_191 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
20s{color} | {color:green} hbase-server: The patch generated 0 new + 3 
unchanged - 5 fixed = 3 total (was 8) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  2m 
39s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
1m 37s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed with JDK v1.8.0_181 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed with JDK v1.7.0_191 {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}103m 49s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}130m  6s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.regionserver.TestZKLessSplitOnCluster |
|   | hadoop.hbase.regionserver.TestEndToEndSplitTransaction |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:61288f8 |
| JIRA Issue | HBASE-21266 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12943284/HBASE-21266-branch-1.patch
 |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  

[jira] [Commented] (HBASE-21246) Introduce WALIdentity interface

2018-10-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645553#comment-16645553
 ] 

Hadoop QA commented on HBASE-21246:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
12s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
27s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
54s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
12s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
17s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
4s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
33s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
46s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
12s{color} | {color:red} hbase-server: The patch generated 4 new + 11 unchanged 
- 0 fixed = 15 total (was 11) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 5 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
18s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
12m  6s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
35s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
35s{color} | {color:red} hbase-server generated 1 new + 0 unchanged - 0 fixed = 
1 total (was 0) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}132m 
29s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
24s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}176m 51s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21246 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12943281/21246.003.patch |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  
shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 2f5a06b733ef 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 
10:45:36 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/component/dev-support/hbase-personality.sh
 |
| git revision | master / 72552301ab |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14632/artifact/patchprocess/diff-checkstyle-hbase-server.txt
 |
| whitespace | 

[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list

2018-10-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645505#comment-16645505
 ] 

Hadoop QA commented on HBASE-21266:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
14s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-1 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
 4s{color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
41s{color} | {color:green} branch-1 passed with JDK v1.8.0_181 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
45s{color} | {color:green} branch-1 passed with JDK v1.7.0_191 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
41s{color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
17s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
41s{color} | {color:green} branch-1 passed with JDK v1.8.0_181 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
37s{color} | {color:green} branch-1 passed with JDK v1.7.0_191 {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed with JDK v1.8.0_181 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed with JDK v1.7.0_191 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
31s{color} | {color:green} hbase-server: The patch generated 0 new + 3 
unchanged - 5 fixed = 3 total (was 8) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
17s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
1m 47s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed with JDK v1.8.0_181 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed with JDK v1.7.0_191 {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}107m  5s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
27s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}135m 16s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.client.TestReplicasClient |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:61288f8 |
| JIRA Issue | HBASE-21266 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12943273/HBASE-21266-branch-1.patch
 |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  
shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 

[jira] [Updated] (HBASE-21247) Allow WAL Provider to be specified by configuration without explicit enum in Providers

2018-10-10 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21247:
---
  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Thanks for the review, Josh.

> Allow WAL Provider to be specified by configuration without explicit enum in 
> Providers
> --
>
> Key: HBASE-21247
> URL: https://issues.apache.org/jira/browse/HBASE-21247
> Project: HBase
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: 21247.v1.txt, 21247.v2.txt, 21247.v3.txt, 21247.v4.txt
>
>
> Currently all the WAL Providers acceptable to hbase are specified in 
> Providers enum of WALFactory.
> This restricts the ability for additional WAL Providers to be supplied - by 
> class name.
> This issue introduces additional config which allows the specification of new 
> WAL Provider through class name.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list

2018-10-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645489#comment-16645489
 ] 

Hadoop QA commented on HBASE-21266:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-1 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
39s{color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
34s{color} | {color:green} branch-1 passed with JDK v1.8.0_181 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
38s{color} | {color:green} branch-1 passed with JDK v1.7.0_191 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
17s{color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  2m 
37s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green} branch-1 passed with JDK v1.8.0_181 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
35s{color} | {color:green} branch-1 passed with JDK v1.7.0_191 {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed with JDK v1.8.0_181 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed with JDK v1.7.0_191 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
17s{color} | {color:green} hbase-server: The patch generated 0 new + 3 
unchanged - 5 fixed = 3 total (was 8) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  2m 
45s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
1m 41s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed with JDK v1.8.0_181 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed with JDK v1.7.0_191 {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}111m 
17s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
31s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}130m  0s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:61288f8 |
| JIRA Issue | HBASE-21266 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12943273/HBASE-21266-branch-1.patch
 |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  
shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 1bd0a354d67a 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 
10:45:36 UTC 2018 x86_64 

[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list

2018-10-10 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645461#comment-16645461
 ] 

Andrew Purtell commented on HBASE-21266:


With latest patch this issue cannot be reproduced and the AM is stable in ITBLL 
testing 500M rows with serverKilling chaos policy, which completes 
successfully. Verified with added debug logging in DeadServer, periodic hbck 
invocation (cluster always returned to a 0 inconsistencies detected state), and 
periodic balancer invocation, and the unit test suite. 

We no longer rely on an integer counter and boolean to track the processing 
status of dead servers. Instead DeadServer uses a Set from which expected state 
checks are derived, logging is improved, and there is a new runtime visible 
assert for incorrect API usage (which doesn't assert in any testing). 

> Not running balancer because processing dead regionservers, but empty dead rs 
> list
> --
>
> Key: HBASE-21266
> URL: https://issues.apache.org/jira/browse/HBASE-21266
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.4.8
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Major
> Fix For: 1.5.0, 1.4.9
>
> Attachments: HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch, 
> HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch, 
> HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch, 
> HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch
>
>
> Found during ITBLL testing. AM in master gets into a state where manual 
> attempts from the shell to run the balancer always return false and this is 
> printed in the master log:
> 2018-10-03 19:17:14,892 DEBUG 
> [RpcServer.default.FPBQ.Fifo.handler=21,queue=0,port=8100] master.HMaster: 
> Not running balancer because processing dead regionserver(s): 
> Note the empty list. 
> This errant state did not recover without intervention by way of master 
> restart, but the test environment was chaotic so needs investigation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21246) Introduce WALIdentity interface

2018-10-10 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645443#comment-16645443
 ] 

Ted Yu commented on HBASE-21246:


I ran TestMasterFailoverWithProcedures with patch locally which passed.

> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Attachments: 21246.003.patch, 21246.HBASE-20952.001.patch, 
> 21246.HBASE-20952.002.patch, 21246.HBASE-20952.004.patch
>
>
> We are introducing WALIdentity interface so that the WAL representation can 
> be decoupled from distributed filesystem.
> The interface provides getName method whose return value can represent 
> filename in distributed filesystem environment or, the name of the stream 
> when the WAL is backed by log stream.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21246) Introduce WALIdentity interface

2018-10-10 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21246:
---
Attachment: 21246.HBASE-20952.004.patch

> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Attachments: 21246.003.patch, 21246.HBASE-20952.001.patch, 
> 21246.HBASE-20952.002.patch, 21246.HBASE-20952.004.patch
>
>
> We are introducing WALIdentity interface so that the WAL representation can 
> be decoupled from distributed filesystem.
> The interface provides getName method whose return value can represent 
> filename in distributed filesystem environment or, the name of the stream 
> when the WAL is backed by log stream.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21246) Introduce WALIdentity interface

2018-10-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645400#comment-16645400
 ] 

Hadoop QA commented on HBASE-21246:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
12s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} HBASE-20952 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
54s{color} | {color:green} HBASE-20952 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
52s{color} | {color:green} HBASE-20952 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
16s{color} | {color:green} HBASE-20952 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
10s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
8s{color} | {color:green} HBASE-20952 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
41s{color} | {color:green} HBASE-20952 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
56s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
11s{color} | {color:red} hbase-server: The patch generated 5 new + 11 unchanged 
- 0 fixed = 16 total (was 11) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 5 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
16s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
10m 34s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
0s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
29s{color} | {color:red} hbase-server generated 1 new + 0 unchanged - 0 fixed = 
1 total (was 0) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}118m 22s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
28s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}163m 10s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hbase.master.procedure.TestMasterFailoverWithProcedures |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21246 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12943263/21246.HBASE-20952.002.patch
 |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  
shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 84a8073f14aa 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | HBASE-20952 / 86cb8e48ad |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
| checkstyle | 

[jira] [Commented] (HBASE-21251) Refactor RegionMover

2018-10-10 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645398#comment-16645398
 ] 

Hudson commented on HBASE-21251:


Results for branch branch-2.1
[build #444 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/444/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/444//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/444//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/444//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Refactor RegionMover
> 
>
> Key: HBASE-21251
> URL: https://issues.apache.org/jira/browse/HBASE-21251
> Project: HBase
>  Issue Type: Improvement
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1
>
> Attachments: HBASE-21251.master.001.patch, 
> HBASE-21251.master.002.patch, HBASE-21251.master.003.patch, 
> HBASE-21251.master.004.patch
>
>
> 1. Move connection and admin to RegionMover's member variables. No need 
> create connection many times.
> 2. use try-with-resource to reduce code
> 3. use ServerName instead of String
> 4. don't use Deprecated method
> 5. remove duplicate code
> ..



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list

2018-10-10 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-21266:
---
Attachment: HBASE-21266-branch-1.patch

> Not running balancer because processing dead regionservers, but empty dead rs 
> list
> --
>
> Key: HBASE-21266
> URL: https://issues.apache.org/jira/browse/HBASE-21266
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.4.8
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Major
> Fix For: 1.5.0, 1.4.9
>
> Attachments: HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch, 
> HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch, 
> HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch, 
> HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch
>
>
> Found during ITBLL testing. AM in master gets into a state where manual 
> attempts from the shell to run the balancer always return false and this is 
> printed in the master log:
> 2018-10-03 19:17:14,892 DEBUG 
> [RpcServer.default.FPBQ.Fifo.handler=21,queue=0,port=8100] master.HMaster: 
> Not running balancer because processing dead regionserver(s): 
> Note the empty list. 
> This errant state did not recover without intervention by way of master 
> restart, but the test environment was chaotic so needs investigation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list

2018-10-10 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-21266:
---
Status: Patch Available  (was: Open)

> Not running balancer because processing dead regionservers, but empty dead rs 
> list
> --
>
> Key: HBASE-21266
> URL: https://issues.apache.org/jira/browse/HBASE-21266
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.4.8
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Major
> Fix For: 1.5.0, 1.4.9
>
> Attachments: HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch, 
> HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch, 
> HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch, 
> HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch
>
>
> Found during ITBLL testing. AM in master gets into a state where manual 
> attempts from the shell to run the balancer always return false and this is 
> printed in the master log:
> 2018-10-03 19:17:14,892 DEBUG 
> [RpcServer.default.FPBQ.Fifo.handler=21,queue=0,port=8100] master.HMaster: 
> Not running balancer because processing dead regionserver(s): 
> Note the empty list. 
> This errant state did not recover without intervention by way of master 
> restart, but the test environment was chaotic so needs investigation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list

2018-10-10 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-21266:
---
Status: Open  (was: Patch Available)

> Not running balancer because processing dead regionservers, but empty dead rs 
> list
> --
>
> Key: HBASE-21266
> URL: https://issues.apache.org/jira/browse/HBASE-21266
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.4.8
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Major
> Fix For: 1.5.0, 1.4.9
>
> Attachments: HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch, 
> HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch, 
> HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch, 
> HBASE-21266-branch-1.patch
>
>
> Found during ITBLL testing. AM in master gets into a state where manual 
> attempts from the shell to run the balancer always return false and this is 
> printed in the master log:
> 2018-10-03 19:17:14,892 DEBUG 
> [RpcServer.default.FPBQ.Fifo.handler=21,queue=0,port=8100] master.HMaster: 
> Not running balancer because processing dead regionserver(s): 
> Note the empty list. 
> This errant state did not recover without intervention by way of master 
> restart, but the test environment was chaotic so needs investigation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21282) Upgrade Jetty dependencies to latest in major-line

2018-10-10 Thread Josh Elser (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645376#comment-16645376
 ] 

Josh Elser commented on HBASE-21282:


OK – so master already has switched over to the apache-jsp compiler.

HBASE-19995 comes into play due to worries about Hadoop compatibility. Let me 
poke at this. I'd certainly hope this is getting better.

> Upgrade Jetty dependencies to latest in major-line
> --
>
> Key: HBASE-21282
> URL: https://issues.apache.org/jira/browse/HBASE-21282
> Project: HBase
>  Issue Type: Task
>  Components: dependencies
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.0.3, 2.1.2
>
>
> Looks like we have dependencies on both jetty 9.2 and 9.3, but we're lagging 
> pretty far behind in both. We can upgrade both of these to the latest (august 
> 2018).
>  
> I'll also have to take a look at why we're using two separate versions (maybe 
> we didn't want to switch from jetty-jsp to apache-jsp on 9.2->9.3?). Not sure 
> if there's a good reason for this.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21246) Introduce WALIdentity interface

2018-10-10 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645355#comment-16645355
 ] 

Ted Yu commented on HBASE-21246:


bq. Why isn't this a WALProvider method?

The start time of a WAL is known at the creation of the WAL. This attribute 
would not change even after edits are appended to the WAL.
Therefore I think it belongs to the identity.

bq. Seems unused.

This would be used in the (Ankit's) upcoming patch for replication.

The other comments are addressed by patch v3.

> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Attachments: 21246.003.patch, 21246.HBASE-20952.001.patch, 
> 21246.HBASE-20952.002.patch
>
>
> We are introducing WALIdentity interface so that the WAL representation can 
> be decoupled from distributed filesystem.
> The interface provides getName method whose return value can represent 
> filename in distributed filesystem environment or, the name of the stream 
> when the WAL is backed by log stream.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21246) Introduce WALIdentity interface

2018-10-10 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21246:
---
Attachment: 21246.003.patch

> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Attachments: 21246.003.patch, 21246.HBASE-20952.001.patch, 
> 21246.HBASE-20952.002.patch
>
>
> We are introducing WALIdentity interface so that the WAL representation can 
> be decoupled from distributed filesystem.
> The interface provides getName method whose return value can represent 
> filename in distributed filesystem environment or, the name of the stream 
> when the WAL is backed by log stream.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21246) Introduce WALIdentity interface

2018-10-10 Thread Josh Elser (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645333#comment-16645333
 ] 

Josh Elser commented on HBASE-21246:


{code:java}
+  static WALIdentity UNKNOWN = new WALIdentity() {
+
+@Override
+public long getWalStartTime() {
+  return 0;
+}
+
+@Override
+public String getName() {
+  return "UNKNOWN";
+}
+
+@Override
+public int compareTo(WALIdentity o) {
+  if (o == UNKNOWN) {
+return 0;
+  }
+  return -1;
+}
+
+@Override
+public boolean equals(Object o) {
+  if (!(o instanceof WALIdentity)) {
+return false;
+  }
+  if (compareTo((WALIdentity)o) == 0) {
+return true;
+  }
+  return false;
+}
+
+@Override
+public int hashCode() {
+  return 0;
+}
+  };{code}
Seems unused. Remove it?
{code:java}
+  /**
+   * Creates WALIdentity for WAL path/name
+   * @param wal
+   * @return WALIdentity instance for the WAL
+   */{code}
Same here as the {{getName()}} comment above. Make it clear that this name 
should be uniquely identifying a WAL in this WALProvider.

> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Attachments: 21246.HBASE-20952.001.patch, 21246.HBASE-20952.002.patch
>
>
> We are introducing WALIdentity interface so that the WAL representation can 
> be decoupled from distributed filesystem.
> The interface provides getName method whose return value can represent 
> filename in distributed filesystem environment or, the name of the stream 
> when the WAL is backed by log stream.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21246) Introduce WALIdentity interface

2018-10-10 Thread Josh Elser (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645331#comment-16645331
 ] 

Josh Elser commented on HBASE-21246:


{code:java}
+  /**
+   * Starting time of the wal which help in sorting against the others
+   * @return start time of the wal
+   */
+  long getWalStartTime();{code}
Why isn't this a {{WALProvider}} method?
{code:java}
+  /**
+   * For the FS based path, it will be just a filename of whole path
+   * For stream based, it will be name of the stream
+   * @return name of the wal
+   */
+  String getName();{code}
This should be stating what this method needs to do (so custom WAL providers 
know how to implement it). Like WALIdentity is uniquely identifying a WAL 
stored in this WALProvider, this name can be thought of as a human-readable, 
serialized form of the WALIdentity. Making sure that developers know that this 
needs to be unique/stable is important.

> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Attachments: 21246.HBASE-20952.001.patch, 21246.HBASE-20952.002.patch
>
>
> We are introducing WALIdentity interface so that the WAL representation can 
> be decoupled from distributed filesystem.
> The interface provides getName method whose return value can represent 
> filename in distributed filesystem environment or, the name of the stream 
> when the WAL is backed by log stream.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21281) Update bouncycastle dependency.

2018-10-10 Thread Josh Elser (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645327#comment-16645327
 ] 

Josh Elser commented on HBASE-21281:


Seems to work fine. These test failures seem to be unrelated to this change 
(and the ones that actually use security work)

> Update bouncycastle dependency.
> ---
>
> Key: HBASE-21281
> URL: https://issues.apache.org/jira/browse/HBASE-21281
> Project: HBase
>  Issue Type: Task
>  Components: dependencies, test
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.0.3, 2.1.2
>
> Attachments: HBASE-21281.001.branch-2.0.patch
>
>
> Looks like we still depend on bcprov-jdk16 for some x509 certificate 
> generation in our tests. Bouncycastle has moved beyond this in 1.47, changing 
> the artifact names.
> [http://www.bouncycastle.org/wiki/display/JA1/Porting+from+earlier+BC+releases+to+1.47+and+later]
> There are some API changes too, but it looks like we don't use any of these.
> It seems like we also have vestiges in the POMs from when we were depending 
> on a specific BC version that came in from Hadoop. We now have a 
> KeyStoreTestUtil class in HBase, which makes me think we can also clean up 
> some dependencies.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21284) Forward port HBASE-21000 to branch-2

2018-10-10 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645308#comment-16645308
 ] 

stack commented on HBASE-21284:
---

Sounds great [~liuml07]. Maybe best if it goes into branch-2 so it will show up 
in hbase-2.2.x? What do you think? Thanks.

> Forward port HBASE-21000 to branch-2
> 
>
> Key: HBASE-21284
> URL: https://issues.apache.org/jira/browse/HBASE-21284
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Priority: Major
>
> See parent for details.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21283) Add new shell command 'rit' for listing regions in transition

2018-10-10 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645302#comment-16645302
 ] 

Andrew Purtell commented on HBASE-21283:


Manually confirmed again before push that the command works (branch-1) or at 
least doesn't fail due to API differences between branch-1 and later (branch-2, 
master)

> Add new shell command 'rit' for listing regions in transition
> -
>
> Key: HBASE-21283
> URL: https://issues.apache.org/jira/browse/HBASE-21283
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Minor
> Fix For: 3.0.0, 1.5.0, 2.2.0
>
> Attachments: HBASE-21283-branch-1.patch, HBASE-21283-branch-1.patch, 
> HBASE-21283-branch-1.patch, HBASE-21283.patch, HBASE-21283.patch, 
> HBASE-21283.patch
>
>
> The 'status' shell command shows regions in transition but sometimes an 
> operator may want to retrieve a simple list of regions in transition. Here's 
> a patch that adds a new 'rit' command to the TOOLS group that does just that. 
> No test, because it seems hard to mock RITs from the ruby test code, but I 
> have run TestShell and it passes, so the command is verified to meet minimum 
> requirements, like help text, and manually verified with branch-1 (shell in 
> branch-2 and up doesn't return until TransitRegionProcedure has completed so 
> by that time no RIT):
> {noformat}
> HBase Shell
> Use "help" to get list of supported commands.
> Use "exit" to quit this interactive shell.
> Version 1.5.0-SNAPSHOT, r9bb6d2fa8b760f16cd046657240ebd4ad91cb6de, Mon Oct  8 
> 21:05:50 UTC 2018
> hbase(main):001:0> help 'rit'
> List all regions in transition.
> Examples:
>   hbase> rit
> hbase(main):002:0> create ...
> 0 row(s) in 2.5150 seconds
> => Hbase::Table - IntegrationTestBigLinkedList
> hbase(main):003:0> rit
> 0 row(s) in 0.0340 seconds
> hbase(main):004:0> unassign '56f0c38c81ae453d19906ce156a2d6a1'
> 0 row(s) in 0.0540 seconds
> hbase(main):005:0> rit 
> IntegrationTestBigLinkedList,L\xCC\xCC\xCC\xCC\xCC\xCC\xCB,1539117183224.56f0c38c81ae453d19906ce156a2d6a1.
>  state=PENDING_CLOSE, ts=Tue Oct 09 20:33:34 UTC 2018 (0s ago), server=null   
>   
>   
> 
> 1 row(s) in 0.0170 seconds
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21283) Add new shell command 'rit' for listing regions in transition

2018-10-10 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-21283:
---
  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Pushed to branch-1, branch-2, and master. Thanks for the reviews 
[~yuzhih...@gmail.com] [~mdrob]

> Add new shell command 'rit' for listing regions in transition
> -
>
> Key: HBASE-21283
> URL: https://issues.apache.org/jira/browse/HBASE-21283
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Minor
> Fix For: 3.0.0, 1.5.0, 2.2.0
>
> Attachments: HBASE-21283-branch-1.patch, HBASE-21283-branch-1.patch, 
> HBASE-21283-branch-1.patch, HBASE-21283.patch, HBASE-21283.patch, 
> HBASE-21283.patch
>
>
> The 'status' shell command shows regions in transition but sometimes an 
> operator may want to retrieve a simple list of regions in transition. Here's 
> a patch that adds a new 'rit' command to the TOOLS group that does just that. 
> No test, because it seems hard to mock RITs from the ruby test code, but I 
> have run TestShell and it passes, so the command is verified to meet minimum 
> requirements, like help text, and manually verified with branch-1 (shell in 
> branch-2 and up doesn't return until TransitRegionProcedure has completed so 
> by that time no RIT):
> {noformat}
> HBase Shell
> Use "help" to get list of supported commands.
> Use "exit" to quit this interactive shell.
> Version 1.5.0-SNAPSHOT, r9bb6d2fa8b760f16cd046657240ebd4ad91cb6de, Mon Oct  8 
> 21:05:50 UTC 2018
> hbase(main):001:0> help 'rit'
> List all regions in transition.
> Examples:
>   hbase> rit
> hbase(main):002:0> create ...
> 0 row(s) in 2.5150 seconds
> => Hbase::Table - IntegrationTestBigLinkedList
> hbase(main):003:0> rit
> 0 row(s) in 0.0340 seconds
> hbase(main):004:0> unassign '56f0c38c81ae453d19906ce156a2d6a1'
> 0 row(s) in 0.0540 seconds
> hbase(main):005:0> rit 
> IntegrationTestBigLinkedList,L\xCC\xCC\xCC\xCC\xCC\xCC\xCB,1539117183224.56f0c38c81ae453d19906ce156a2d6a1.
>  state=PENDING_CLOSE, ts=Tue Oct 09 20:33:34 UTC 2018 (0s ago), server=null   
>   
>   
> 
> 1 row(s) in 0.0170 seconds
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21283) Add new shell command 'rit' for listing regions in transition

2018-10-10 Thread Mike Drob (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645286#comment-16645286
 ] 

Mike Drob commented on HBASE-21283:
---

+1

> Add new shell command 'rit' for listing regions in transition
> -
>
> Key: HBASE-21283
> URL: https://issues.apache.org/jira/browse/HBASE-21283
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Minor
> Fix For: 3.0.0, 1.5.0, 2.2.0
>
> Attachments: HBASE-21283-branch-1.patch, HBASE-21283-branch-1.patch, 
> HBASE-21283-branch-1.patch, HBASE-21283.patch, HBASE-21283.patch, 
> HBASE-21283.patch
>
>
> The 'status' shell command shows regions in transition but sometimes an 
> operator may want to retrieve a simple list of regions in transition. Here's 
> a patch that adds a new 'rit' command to the TOOLS group that does just that. 
> No test, because it seems hard to mock RITs from the ruby test code, but I 
> have run TestShell and it passes, so the command is verified to meet minimum 
> requirements, like help text, and manually verified with branch-1 (shell in 
> branch-2 and up doesn't return until TransitRegionProcedure has completed so 
> by that time no RIT):
> {noformat}
> HBase Shell
> Use "help" to get list of supported commands.
> Use "exit" to quit this interactive shell.
> Version 1.5.0-SNAPSHOT, r9bb6d2fa8b760f16cd046657240ebd4ad91cb6de, Mon Oct  8 
> 21:05:50 UTC 2018
> hbase(main):001:0> help 'rit'
> List all regions in transition.
> Examples:
>   hbase> rit
> hbase(main):002:0> create ...
> 0 row(s) in 2.5150 seconds
> => Hbase::Table - IntegrationTestBigLinkedList
> hbase(main):003:0> rit
> 0 row(s) in 0.0340 seconds
> hbase(main):004:0> unassign '56f0c38c81ae453d19906ce156a2d6a1'
> 0 row(s) in 0.0540 seconds
> hbase(main):005:0> rit 
> IntegrationTestBigLinkedList,L\xCC\xCC\xCC\xCC\xCC\xCC\xCB,1539117183224.56f0c38c81ae453d19906ce156a2d6a1.
>  state=PENDING_CLOSE, ts=Tue Oct 09 20:33:34 UTC 2018 (0s ago), server=null   
>   
>   
> 
> 1 row(s) in 0.0170 seconds
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HBASE-21098) Improve Snapshot Performance with Temporary Snapshot Directory when rootDir on S3

2018-10-10 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell reassigned HBASE-21098:
--

Assignee: Tyler Mi

> Improve Snapshot Performance with Temporary Snapshot Directory when rootDir 
> on S3
> -
>
> Key: HBASE-21098
> URL: https://issues.apache.org/jira/browse/HBASE-21098
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0, 1.4.8, 2.1.1
>Reporter: Tyler Mi
>Assignee: Tyler Mi
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-21098.branch-1.001.patch, 
> HBASE-21098.branch-1.002.patch, HBASE-21098.master.001.patch, 
> HBASE-21098.master.002.patch, HBASE-21098.master.003.patch, 
> HBASE-21098.master.004.patch, HBASE-21098.master.005.patch, 
> HBASE-21098.master.006.patch, HBASE-21098.master.007.patch, 
> HBASE-21098.master.008.patch, HBASE-21098.master.009.patch, 
> HBASE-21098.master.010.patch, HBASE-21098.master.011.patch, 
> HBASE-21098.master.012.patch, HBASE-21098.master.013.patch
>
>
> When using Apache HBase, the snapshot feature can be used to make a point in 
> time recovery. To do this, HBase creates a manifest of all the files in all 
> of the Regions so that those files can be referenced again when a user 
> restores a snapshot. With HBase's S3 storage mode, developers can store their 
> data off-cluster on Amazon S3. However, utilizing S3 as a file system is 
> inefficient in some operations, namely renames. Most Hadoop ecosystem 
> applications use an atomic rename as a method of committing data. However, 
> with S3, a rename is a separate copy and then a delete of every file which is 
> no longer atomic and, in fact, quite costly. In addition, puts and deletes on 
> S3 have latency issues that traditional filesystems do not encounter when 
> manipulating the region snapshots to consolidate into a single manifest. When 
> HBase on S3 users have a significant amount of regions, puts, deletes, and 
> renames (the final commit stage of the snapshot) become the bottleneck 
> causing snapshots to take many minutes or even hours to complete.
> The purpose of this patch is to increase the overall performance of snapshots 
> while utilizing HBase on S3 through the use of a temporary directory for the 
> snapshots that exists on a traditional filesystem like HDFS to circumvent the 
> bottlenecks.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21098) Improve Snapshot Performance with Temporary Snapshot Directory when rootDir on S3

2018-10-10 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645283#comment-16645283
 ] 

Andrew Purtell commented on HBASE-21098:


For future reference [~zyork] you can add jira users as contributors via 
https://issues.apache.org/jira/plugins/servlet/project-config/HBASE/roles . I 
went ahead and added Tyler and assigned this issue. 

> Improve Snapshot Performance with Temporary Snapshot Directory when rootDir 
> on S3
> -
>
> Key: HBASE-21098
> URL: https://issues.apache.org/jira/browse/HBASE-21098
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0, 1.4.8, 2.1.1
>Reporter: Tyler Mi
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-21098.branch-1.001.patch, 
> HBASE-21098.branch-1.002.patch, HBASE-21098.master.001.patch, 
> HBASE-21098.master.002.patch, HBASE-21098.master.003.patch, 
> HBASE-21098.master.004.patch, HBASE-21098.master.005.patch, 
> HBASE-21098.master.006.patch, HBASE-21098.master.007.patch, 
> HBASE-21098.master.008.patch, HBASE-21098.master.009.patch, 
> HBASE-21098.master.010.patch, HBASE-21098.master.011.patch, 
> HBASE-21098.master.012.patch, HBASE-21098.master.013.patch
>
>
> When using Apache HBase, the snapshot feature can be used to make a point in 
> time recovery. To do this, HBase creates a manifest of all the files in all 
> of the Regions so that those files can be referenced again when a user 
> restores a snapshot. With HBase's S3 storage mode, developers can store their 
> data off-cluster on Amazon S3. However, utilizing S3 as a file system is 
> inefficient in some operations, namely renames. Most Hadoop ecosystem 
> applications use an atomic rename as a method of committing data. However, 
> with S3, a rename is a separate copy and then a delete of every file which is 
> no longer atomic and, in fact, quite costly. In addition, puts and deletes on 
> S3 have latency issues that traditional filesystems do not encounter when 
> manipulating the region snapshots to consolidate into a single manifest. When 
> HBase on S3 users have a significant amount of regions, puts, deletes, and 
> renames (the final commit stage of the snapshot) become the bottleneck 
> causing snapshots to take many minutes or even hours to complete.
> The purpose of this patch is to increase the overall performance of snapshots 
> while utilizing HBase on S3 through the use of a temporary directory for the 
> snapshots that exists on a traditional filesystem like HDFS to circumvent the 
> bottlenecks.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21098) Improve Snapshot Performance with Temporary Snapshot Directory when rootDir on S3

2018-10-10 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645279#comment-16645279
 ] 

Andrew Purtell commented on HBASE-21098:


[~zyork] I hit that too. Mention it on builds@? At any rate you can ignore it, 
or use 'xmllint' or similar tool to check locally for yourself that the file is 
well formed. (I did the latter.)

> Improve Snapshot Performance with Temporary Snapshot Directory when rootDir 
> on S3
> -
>
> Key: HBASE-21098
> URL: https://issues.apache.org/jira/browse/HBASE-21098
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0, 1.4.8, 2.1.1
>Reporter: Tyler Mi
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-21098.branch-1.001.patch, 
> HBASE-21098.branch-1.002.patch, HBASE-21098.master.001.patch, 
> HBASE-21098.master.002.patch, HBASE-21098.master.003.patch, 
> HBASE-21098.master.004.patch, HBASE-21098.master.005.patch, 
> HBASE-21098.master.006.patch, HBASE-21098.master.007.patch, 
> HBASE-21098.master.008.patch, HBASE-21098.master.009.patch, 
> HBASE-21098.master.010.patch, HBASE-21098.master.011.patch, 
> HBASE-21098.master.012.patch, HBASE-21098.master.013.patch
>
>
> When using Apache HBase, the snapshot feature can be used to make a point in 
> time recovery. To do this, HBase creates a manifest of all the files in all 
> of the Regions so that those files can be referenced again when a user 
> restores a snapshot. With HBase's S3 storage mode, developers can store their 
> data off-cluster on Amazon S3. However, utilizing S3 as a file system is 
> inefficient in some operations, namely renames. Most Hadoop ecosystem 
> applications use an atomic rename as a method of committing data. However, 
> with S3, a rename is a separate copy and then a delete of every file which is 
> no longer atomic and, in fact, quite costly. In addition, puts and deletes on 
> S3 have latency issues that traditional filesystems do not encounter when 
> manipulating the region snapshots to consolidate into a single manifest. When 
> HBase on S3 users have a significant amount of regions, puts, deletes, and 
> renames (the final commit stage of the snapshot) become the bottleneck 
> causing snapshots to take many minutes or even hours to complete.
> The purpose of this patch is to increase the overall performance of snapshots 
> while utilizing HBase on S3 through the use of a temporary directory for the 
> snapshots that exists on a traditional filesystem like HDFS to circumvent the 
> bottlenecks.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21284) Forward port HBASE-21000 to branch-2

2018-10-10 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645273#comment-16645273
 ] 

Andrew Purtell commented on HBASE-21284:


Go ahead [~liuml07] but first ask [~stack] if he even wants it. 

> Forward port HBASE-21000 to branch-2
> 
>
> Key: HBASE-21284
> URL: https://issues.apache.org/jira/browse/HBASE-21284
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Priority: Major
>
> See parent for details.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21283) Add new shell command 'rit' for listing regions in transition

2018-10-10 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645258#comment-16645258
 ] 

Andrew Purtell commented on HBASE-21283:


Great the rubocop results for new file rit.rb are clean now, going to commit

> Add new shell command 'rit' for listing regions in transition
> -
>
> Key: HBASE-21283
> URL: https://issues.apache.org/jira/browse/HBASE-21283
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Minor
> Fix For: 3.0.0, 1.5.0, 2.2.0
>
> Attachments: HBASE-21283-branch-1.patch, HBASE-21283-branch-1.patch, 
> HBASE-21283-branch-1.patch, HBASE-21283.patch, HBASE-21283.patch, 
> HBASE-21283.patch
>
>
> The 'status' shell command shows regions in transition but sometimes an 
> operator may want to retrieve a simple list of regions in transition. Here's 
> a patch that adds a new 'rit' command to the TOOLS group that does just that. 
> No test, because it seems hard to mock RITs from the ruby test code, but I 
> have run TestShell and it passes, so the command is verified to meet minimum 
> requirements, like help text, and manually verified with branch-1 (shell in 
> branch-2 and up doesn't return until TransitRegionProcedure has completed so 
> by that time no RIT):
> {noformat}
> HBase Shell
> Use "help" to get list of supported commands.
> Use "exit" to quit this interactive shell.
> Version 1.5.0-SNAPSHOT, r9bb6d2fa8b760f16cd046657240ebd4ad91cb6de, Mon Oct  8 
> 21:05:50 UTC 2018
> hbase(main):001:0> help 'rit'
> List all regions in transition.
> Examples:
>   hbase> rit
> hbase(main):002:0> create ...
> 0 row(s) in 2.5150 seconds
> => Hbase::Table - IntegrationTestBigLinkedList
> hbase(main):003:0> rit
> 0 row(s) in 0.0340 seconds
> hbase(main):004:0> unassign '56f0c38c81ae453d19906ce156a2d6a1'
> 0 row(s) in 0.0540 seconds
> hbase(main):005:0> rit 
> IntegrationTestBigLinkedList,L\xCC\xCC\xCC\xCC\xCC\xCC\xCB,1539117183224.56f0c38c81ae453d19906ce156a2d6a1.
>  state=PENDING_CLOSE, ts=Tue Oct 09 20:33:34 UTC 2018 (0s ago), server=null   
>   
>   
> 
> 1 row(s) in 0.0170 seconds
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list

2018-10-10 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645255#comment-16645255
 ] 

Andrew Purtell commented on HBASE-21266:


Update patch to address checkstyle ImportOrder warning

> Not running balancer because processing dead regionservers, but empty dead rs 
> list
> --
>
> Key: HBASE-21266
> URL: https://issues.apache.org/jira/browse/HBASE-21266
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.4.8
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Major
> Fix For: 1.5.0, 1.4.9
>
> Attachments: HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch, 
> HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch, 
> HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch, 
> HBASE-21266-branch-1.patch
>
>
> Found during ITBLL testing. AM in master gets into a state where manual 
> attempts from the shell to run the balancer always return false and this is 
> printed in the master log:
> 2018-10-03 19:17:14,892 DEBUG 
> [RpcServer.default.FPBQ.Fifo.handler=21,queue=0,port=8100] master.HMaster: 
> Not running balancer because processing dead regionserver(s): 
> Note the empty list. 
> This errant state did not recover without intervention by way of master 
> restart, but the test environment was chaotic so needs investigation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list

2018-10-10 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-21266:
---
Attachment: HBASE-21266-branch-1.patch

> Not running balancer because processing dead regionservers, but empty dead rs 
> list
> --
>
> Key: HBASE-21266
> URL: https://issues.apache.org/jira/browse/HBASE-21266
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.4.8
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Major
> Fix For: 1.5.0, 1.4.9
>
> Attachments: HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch, 
> HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch, 
> HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch, 
> HBASE-21266-branch-1.patch
>
>
> Found during ITBLL testing. AM in master gets into a state where manual 
> attempts from the shell to run the balancer always return false and this is 
> printed in the master log:
> 2018-10-03 19:17:14,892 DEBUG 
> [RpcServer.default.FPBQ.Fifo.handler=21,queue=0,port=8100] master.HMaster: 
> Not running balancer because processing dead regionserver(s): 
> Note the empty list. 
> This errant state did not recover without intervention by way of master 
> restart, but the test environment was chaotic so needs investigation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list

2018-10-10 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645253#comment-16645253
 ] 

Andrew Purtell commented on HBASE-21266:


All unit tests pass for me locally. The remaining failures in precommit do not 
seem related to this change but I'll come back and look at them again. 

Now moving to ITBLL testing of the latest.

> Not running balancer because processing dead regionservers, but empty dead rs 
> list
> --
>
> Key: HBASE-21266
> URL: https://issues.apache.org/jira/browse/HBASE-21266
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.4.8
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Major
> Fix For: 1.5.0, 1.4.9
>
> Attachments: HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch, 
> HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch, 
> HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch
>
>
> Found during ITBLL testing. AM in master gets into a state where manual 
> attempts from the shell to run the balancer always return false and this is 
> printed in the master log:
> 2018-10-03 19:17:14,892 DEBUG 
> [RpcServer.default.FPBQ.Fifo.handler=21,queue=0,port=8100] master.HMaster: 
> Not running balancer because processing dead regionserver(s): 
> Note the empty list. 
> This errant state did not recover without intervention by way of master 
> restart, but the test environment was chaotic so needs investigation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21246) Introduce WALIdentity interface

2018-10-10 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21246:
---
Attachment: 21246.HBASE-20952.002.patch

> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Attachments: 21246.HBASE-20952.001.patch, 21246.HBASE-20952.002.patch
>
>
> We are introducing WALIdentity interface so that the WAL representation can 
> be decoupled from distributed filesystem.
> The interface provides getName method whose return value can represent 
> filename in distributed filesystem environment or, the name of the stream 
> when the WAL is backed by log stream.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-21246) Introduce WALIdentity interface

2018-10-10 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638994#comment-16638994
 ] 

Ted Yu edited comment on HBASE-21246 at 10/10/18 3:56 PM:
--

The first question is addressed by patch v2 which drops the method.

The abstraction of WAL identity should allow all backends (hdfs, Ratis, Kafka) 
to express the information the underlying backend carries for each WAL.
e.g. for Kafka backend, we can use topic to represent region, the partitions 
under the topic would correspond to the WALs for the region.
Therefore a String alone doesn't have as much expressibility as an interface.

That is why we chose to use an interface to represent WAL identity.


was (Author: yuzhih...@gmail.com):
Now that design doc is back to review mode, please allow some more time till I 
address the above comment.

> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Attachments: 21246.HBASE-20952.001.patch, 21246.HBASE-20952.002.patch
>
>
> We are introducing WALIdentity interface so that the WAL representation can 
> be decoupled from distributed filesystem.
> The interface provides getName method whose return value can represent 
> filename in distributed filesystem environment or, the name of the stream 
> when the WAL is backed by log stream.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21286) Parallelize computeHDFSBlocksDistribution when getting splits of a HBaseSnapshot

2018-10-10 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645162#comment-16645162
 ] 

Ted Yu commented on HBASE-21286:


Where is the ExecutorService shutdown ?
{code}
432 ExecutorService executor = Executors.newFixedThreadPool(nrThreads);
{code}
If you can share some data on performance improvement, that would be nice.

> Parallelize computeHDFSBlocksDistribution when getting splits of a 
> HBaseSnapshot
> 
>
> Key: HBASE-21286
> URL: https://issues.apache.org/jira/browse/HBASE-21286
> Project: HBase
>  Issue Type: Improvement
>  Components: snapshots
>Affects Versions: 1.4.0
>Reporter: Lavinia-Stefania Sirbu
>Priority: Minor
> Attachments: HBASE-21286.branch-1.4.001.patch
>
>
> Even if this step is called computeHDFSBlocksDistribution, this is executed 
> no matter the file system of the snapshot. For example, we have observed an 
> important slowness when we have a snapshot in s3 (~26k regions, 5column 
> families, 2 files per column family) the getsplits time is ~40min due to the 
> calls in s3 for listing the files to get the best locations.
> Parallelizing this operation can reduce the overall setup time. The thread 
> pool should be configurable and a good choice could be 
> "hbase.snapshot.thread.pool.max" that is also used in RestoreSnapshotHelper.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21285) Enhanced TableSnapshotInputFormat to allow a size based splitting

2018-10-10 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645138#comment-16645138
 ] 

Ted Yu commented on HBASE-21285:


In TableSnapshotInputFormatImpl#getRegionSizes, snapshot visitor may encounter 
IOExceptiion.
{code}
+Map regionSizes = getRegionSizes(conf, fs, snapshotDir);
{code}
Please add protection against potential IOExceptiion.

As I mentioned above, if the additional computation takes non-trivial amount of 
time, we should consider introducing new config for the improvement.

> Enhanced TableSnapshotInputFormat to allow a size based splitting
> -
>
> Key: HBASE-21285
> URL: https://issues.apache.org/jira/browse/HBASE-21285
> Project: HBase
>  Issue Type: Improvement
>  Components: snapshots
>Affects Versions: 1.4.0
>Reporter: Lavinia-Stefania Sirbu
>Priority: Minor
> Attachments: HBASE-21285.branch-1.4.001.patch
>
>
> Currently, all the splits generated by a snapshot are having length 0. Right 
> now, we have a configuration for the number of splits per region, but it's a 
> general one and not very helpful when the sizes for regions are really 
> different. The modification must be done in TableSnapshotInputFormatImpl 
> where the length must be computed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21247) Allow WAL Provider to be specified by configuration without explicit enum in Providers

2018-10-10 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645099#comment-16645099
 ] 

Ted Yu commented on HBASE-21247:


If there is no further review comment, I plan to integrate this.

> Allow WAL Provider to be specified by configuration without explicit enum in 
> Providers
> --
>
> Key: HBASE-21247
> URL: https://issues.apache.org/jira/browse/HBASE-21247
> Project: HBase
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: 21247.v1.txt, 21247.v2.txt, 21247.v3.txt, 21247.v4.txt
>
>
> Currently all the WAL Providers acceptable to hbase are specified in 
> Providers enum of WALFactory.
> This restricts the ability for additional WAL Providers to be supplied - by 
> class name.
> This issue introduces additional config which allows the specification of new 
> WAL Provider through class name.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21285) Enhanced TableSnapshotInputFormat to allow a size based splitting

2018-10-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645028#comment-16645028
 ] 

Hadoop QA commented on HBASE-21285:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} branch-1.4 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
51s{color} | {color:green} branch-1.4 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
49s{color} | {color:green} branch-1.4 passed with JDK v1.8.0_181 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
47s{color} | {color:green} branch-1.4 passed with JDK v1.7.0_191 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
35s{color} | {color:green} branch-1.4 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
12s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
49s{color} | {color:green} branch-1.4 passed with JDK v1.8.0_181 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
43s{color} | {color:green} branch-1.4 passed with JDK v1.7.0_191 {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed with JDK v1.8.0_181 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed with JDK v1.7.0_191 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
35s{color} | {color:red} hbase-server: The patch generated 8 new + 14 unchanged 
- 0 fixed = 22 total (was 14) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
15s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
8m 42s{color} | {color:green} Patch does not cause any errors with Hadoop 2.4.1 
2.5.2 2.6.5 2.7.4. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed with JDK v1.8.0_181 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed with JDK v1.7.0_191 {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}104m 
22s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}139m 33s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:2cf636a |
| JIRA Issue | HBASE-21285 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12943226/HBASE-21285.branch-1.4.001.patch
 |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  
shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 6063d5bd3212 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 

[jira] [Commented] (HBASE-21286) Parallelize computeHDFSBlocksDistribution when getting splits of a HBaseSnapshot

2018-10-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645021#comment-16645021
 ] 

Hadoop QA commented on HBASE-21286:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:orange}-0{color} | {color:orange} test4tests {color} | {color:orange}  
0m  0s{color} | {color:orange} The patch doesn't appear to include any new or 
modified tests. Please justify why no new tests are needed for this patch. Also 
please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} branch-1.4 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
42s{color} | {color:green} branch-1.4 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
34s{color} | {color:green} branch-1.4 passed with JDK v1.8.0_181 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
37s{color} | {color:green} branch-1.4 passed with JDK v1.7.0_191 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
21s{color} | {color:green} branch-1.4 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  2m 
39s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
33s{color} | {color:green} branch-1.4 passed with JDK v1.8.0_181 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
34s{color} | {color:green} branch-1.4 passed with JDK v1.7.0_191 {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed with JDK v1.8.0_181 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed with JDK v1.7.0_191 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
18s{color} | {color:red} hbase-server: The patch generated 8 new + 8 unchanged 
- 0 fixed = 16 total (was 8) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 1s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  2m 
43s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
8m 28s{color} | {color:green} Patch does not cause any errors with Hadoop 2.4.1 
2.5.2 2.6.5 2.7.4. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed with JDK v1.8.0_181 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed with JDK v1.7.0_191 {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}103m 
50s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
24s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}135m 30s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:2cf636a |
| JIRA Issue | HBASE-21286 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12943228/HBASE-21286.branch-1.4.001.patch
 |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  

[jira] [Commented] (HBASE-21285) Enhanced TableSnapshotInputFormat to allow a size based splitting

2018-10-10 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16644985#comment-16644985
 ] 

Ted Yu commented on HBASE-21285:


Have you measured the duration for the {{getRegionSizes}} method on the 
snapshot(s) ?

Just wondering how much time would be spent on the extra computation.

Thanks

> Enhanced TableSnapshotInputFormat to allow a size based splitting
> -
>
> Key: HBASE-21285
> URL: https://issues.apache.org/jira/browse/HBASE-21285
> Project: HBase
>  Issue Type: Improvement
>  Components: snapshots
>Affects Versions: 1.4.0
>Reporter: Lavinia-Stefania Sirbu
>Priority: Minor
> Attachments: HBASE-21285.branch-1.4.001.patch
>
>
> Currently, all the splits generated by a snapshot are having length 0. Right 
> now, we have a configuration for the number of splits per region, but it's a 
> general one and not very helpful when the sizes for regions are really 
> different. The modification must be done in TableSnapshotInputFormatImpl 
> where the length must be computed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21277) Prevent to add same table to two sync replication peer's config

2018-10-10 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16644936#comment-16644936
 ] 

Hudson commented on HBASE-21277:


Results for branch master
[build #536 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/536/]: (x) 
*{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/536//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/536//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/536//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Prevent to add same table to two sync replication peer's config
> ---
>
> Key: HBASE-21277
> URL: https://issues.apache.org/jira/browse/HBASE-21277
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-21277.master.001.patch, 
> HBASE-21277.master.002.patch, HBASE-21277.master.002.patch
>
>
> If a table in two sync replication peer's config, it need write wal to three 
> places: local dir and two remote dir. It is not allowed. Need to add check 
> when add sync replication peer or modify sync replication peer's config.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21251) Refactor RegionMover

2018-10-10 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16644912#comment-16644912
 ] 

Hudson commented on HBASE-21251:


Results for branch master
[build #535 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/535/]: (x) 
*{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/535//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/535//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/535//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Refactor RegionMover
> 
>
> Key: HBASE-21251
> URL: https://issues.apache.org/jira/browse/HBASE-21251
> Project: HBase
>  Issue Type: Improvement
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1
>
> Attachments: HBASE-21251.master.001.patch, 
> HBASE-21251.master.002.patch, HBASE-21251.master.003.patch, 
> HBASE-21251.master.004.patch
>
>
> 1. Move connection and admin to RegionMover's member variables. No need 
> create connection many times.
> 2. use try-with-resource to reduce code
> 3. use ServerName instead of String
> 4. don't use Deprecated method
> 5. remove duplicate code
> ..



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21280) Add anchors for each heading in UI

2018-10-10 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16644911#comment-16644911
 ] 

Hudson commented on HBASE-21280:


Results for branch master
[build #535 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/535/]: (x) 
*{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/535//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/535//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/535//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Add anchors for each heading in UI
> --
>
> Key: HBASE-21280
> URL: https://issues.apache.org/jira/browse/HBASE-21280
> Project: HBase
>  Issue Type: Bug
>  Components: UI, Usability
>Reporter: stack
>Assignee: stack
>Priority: Trivial
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-21280.branch-2.1.001.patch, 
> HBASE-21280.branch-2.1.002.patch
>
>
> On larger clusters, its annoying having to scroll down on each refresh. 
> Anchors would help pin page to a section in UI (until our UI gets redone...)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21256) Improve IntegrationTestBigLinkedList for testing huge data

2018-10-10 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16644879#comment-16644879
 ] 

Duo Zhang commented on HBASE-21256:
---

Will commit tomorrow if no objections.

> Improve IntegrationTestBigLinkedList for testing huge data
> --
>
> Key: HBASE-21256
> URL: https://issues.apache.org/jira/browse/HBASE-21256
> Project: HBase
>  Issue Type: Improvement
>  Components: integration tests
>Affects Versions: 3.0.0
>Reporter: Zephyr Guo
>Assignee: Zephyr Guo
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-21256-v1.patch, HBASE-21256-v10.patch, 
> HBASE-21256-v11.patch, HBASE-21256-v12.patch, HBASE-21256-v2.patch, 
> HBASE-21256-v3.patch, HBASE-21256-v4.patch, ITBLL-1.png, ITBLL-2.png
>
>
> Recently, I use ITBLL to test some features in our company. I have 
> encountered the following problems:
>   
>  1. Generator is too slow at the generating stage, the root cause is 
> SecureRandom. There is a global lock in SecureRandom( See the following 
> picture). I use Random instead of SecureRandom, and it could speed up this 
> stage(500% up with 20 mapper).  SecureRandom was brought by HBASE-13382, but 
> speaking of generating random bytes, in my opnion,
>  it is the same with Random.
> !ITBLL-1.png!
> 2. VerifyReducer have a cpu cost of 14% on format method. This is cause by 
> create keyString variable. However, keyString may never be used if test 
> result is correct.(and that's in most cases). Just delay creating keyString 
> can yield huge performance boost in verifing stage.
> !ITBLL-2.png!
> 3.Arguments check is needed, because there's constraint between arguments. If 
> we broken this constraint, we can not get a correct circular list.  
>   
>  4.Let big family value size could be configured.
>  
> 5.Avoid starting RS at backup master



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21286) Parallelize computeHDFSBlocksDistribution when getting splits of a HBaseSnapshot

2018-10-10 Thread Lavinia-Stefania Sirbu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lavinia-Stefania Sirbu updated HBASE-21286:
---
Attachment: HBASE-21286.branch-1.4.001.patch
Status: Patch Available  (was: Open)

> Parallelize computeHDFSBlocksDistribution when getting splits of a 
> HBaseSnapshot
> 
>
> Key: HBASE-21286
> URL: https://issues.apache.org/jira/browse/HBASE-21286
> Project: HBase
>  Issue Type: Improvement
>  Components: snapshots
>Affects Versions: 1.4.0
>Reporter: Lavinia-Stefania Sirbu
>Priority: Minor
> Attachments: HBASE-21286.branch-1.4.001.patch
>
>
> Even if this step is called computeHDFSBlocksDistribution, this is executed 
> no matter the file system of the snapshot. For example, we have observed an 
> important slowness when we have a snapshot in s3 (~26k regions, 5column 
> families, 2 files per column family) the getsplits time is ~40min due to the 
> calls in s3 for listing the files to get the best locations.
> Parallelizing this operation can reduce the overall setup time. The thread 
> pool should be configurable and a good choice could be 
> "hbase.snapshot.thread.pool.max" that is also used in RestoreSnapshotHelper.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


  1   2   >