[jira] [Commented] (HBASE-20202) [AMv2] Don't move region if its a split parent or offlined
[ https://issues.apache.org/jira/browse/HBASE-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16478340#comment-16478340 ]

Sergey Soldatov commented on HBASE-20202:

bq. On the new master, it will notice this and re-run the step that was doing the close step... That is what should happen sir.

Correct. It happens almost that way, but my case is a bit more complicated. A table was modified (the chaos monkey changed the bloom filter), and while the ModifyTableProcedure was executing, the master was killed during the execution of an UnassignProcedure, when one of the regions was closing. During startup, the new master had the following list of unfinished procedures:

UnassignProcedure pid=766, ppid=754
MoveRegionProcedure pid=754, ppid=749
ModifyTableProcedure pid=749

Once the master came online, the UnassignProcedure started:
{noformat}
2018-05-15 21:48:23,787 INFO [PEWorker-8] assignment.RegionStateStore: pid=766 updating hbase:meta row=92e0d39ee7e6d19566c393bae58ab5c0, regionState=CLOSING
2018-05-15 21:48:23,820 INFO [PEWorker-8] assignment.RegionTransitionProcedure: Dispatch pid=766, ppid=754, state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure table=IntegrationTestBigLinkedList, region=92e0d39ee7e6d19566c393bae58ab5c0, server=ctr-e138-15181
{noformat}
I'm not sure what exactly triggered it (retry logic in the client? I don't see any chaos monkey logs for that action), but suddenly the master tries to execute another ModifyTableProcedure. It started from the state MODIFY_TABLE_REOPEN_ALL_REGIONS, which tries to reopen all regions using MoveRegionProcedure.
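For readers following the thread, the reopen step described above can be pictured with a minimal, self-contained Java sketch. It is not HBase code: `ReopenSketch`, `MovePlan`, `RegionState`, `RegionNotOpenException`, and `createReopenPlans` are hypothetical stand-ins, and the only assumption carried over from the discussion is that a move is created per region and that creating it is refused when the region is not OPEN.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative model only: every type here is a hypothetical stand-in, not an HBase class.
class ReopenSketch {

  enum RegionState { OPEN, CLOSING, CLOSED }

  /** Stand-in for the exception raised by the "region must be open" precondition. */
  static class RegionNotOpenException extends Exception {
    RegionNotOpenException(String region, RegionState state) {
      super(region + " is not OPEN (state=" + state + ")");
    }
  }

  /** Stand-in for a move whose constructor refuses regions that are not OPEN. */
  static class MovePlan {
    final String region;

    MovePlan(String region, Map<String, RegionState> states) throws RegionNotOpenException {
      RegionState state = states.get(region);
      if (state != RegionState.OPEN) {
        // Fail fast: the move is never created for a region that is not online.
        throw new RegionNotOpenException(region, state);
      }
      this.region = region;
    }
  }

  /** Models a reopen-all-regions step: one move per region, aborting on the first failure. */
  static List<MovePlan> createReopenPlans(Map<String, RegionState> states)
      throws RegionNotOpenException {
    List<MovePlan> plans = new ArrayList<>();
    for (String region : states.keySet()) {
      plans.add(new MovePlan(region, states));
    }
    return plans;
  }

  public static void main(String[] args) {
    Map<String, RegionState> states = new LinkedHashMap<>();
    states.put("regionA", RegionState.OPEN);
    // A recovered unassign has already marked this region CLOSING.
    states.put("92e0d39ee7e6d19566c393bae58ab5c0", RegionState.CLOSING);
    try {
      createReopenPlans(states);
      System.out.println("all regions scheduled for reopen");
    } catch (RegionNotOpenException e) {
      System.out.println("reopen step failed: " + e.getMessage());
    }
  }
}
```

In this model a single region that is mid-close is enough to make the whole reopen step fail, and retrying cannot succeed until something else brings that region back to OPEN.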
And because our region is not online, it fails on that checkOnline check:
{noformat}
2018-05-15 21:48:29,684 INFO [RpcServer.default.FPBQ.Fifo.handler=27,queue=0,port=2] bzip2.Bzip2Factory: Successfully loaded & initialized native-bzip2 library system-native
2018-05-15 21:48:29,685 INFO [RpcServer.default.FPBQ.Fifo.handler=27,queue=0,port=2] compress.CodecPool: Got brand-new compressor [.bz2]
2018-05-15 21:48:29,744 INFO [RpcServer.default.FPBQ.Fifo.handler=27,queue=0,port=2] compress.CodecPool: Got brand-new compressor [.lz4]
2018-05-15 21:48:29,788 INFO [RpcServer.default.FPBQ.Fifo.handler=27,queue=0,port=2] master.HMaster: Client=hbase//172.27.86.70 modify IntegrationTestBigLinkedList
2018-05-15 21:48:29,902 DEBUG [RpcServer.default.FPBQ.Fifo.handler=27,queue=0,port=2] procedure2.ProcedureExecutor: Stored pid=790, state=RUNNABLE:MODIFY_TABLE_PREPARE; ModifyTableProcedure table=IntegrationTestBigLinkedList
2018-05-15 21:48:29,969 DEBUG [PEWorker-4] util.FSTableDescriptors: Wrote into hdfs://mycluster/apps/hbase/data/data/default/IntegrationTestBigLinkedList/.tabledesc/.tableinfo.06
2018-05-15 21:48:29,972 DEBUG [PEWorker-4] util.FSTableDescriptors: Deleted hdfs://mycluster/apps/hbase/data/data/default/IntegrationTestBigLinkedList/.tabledesc/.tableinfo.05
2018-05-15 21:48:29,972 INFO [PEWorker-4] util.FSTableDescriptors: Updated tableinfo=hdfs://mycluster/apps/hbase/data/data/default/IntegrationTestBigLinkedList/.tabledesc/.tableinfo.06
2018-05-15 21:48:30,004 WARN [PEWorker-4] procedure.ModifyTableProcedure: Retriable error trying to modify table=IntegrationTestBigLinkedList (in state=MODIFY_TABLE_REOPEN_ALL_REGIONS)
org.apache.hadoop.hbase.client.DoNotRetryRegionException: 92e0d39ee7e6d19566c393bae58ab5c0 is not OPEN
    at org.apache.hadoop.hbase.master.procedure.AbstractStateMachineTableProcedure.checkOnline(AbstractStateMachineTableProcedure.java:193)
    at org.apache.hadoop.hbase.master.assignment.MoveRegionProcedure.<init>(MoveRegionProcedure.java:67)
    at org.apache.hadoop.hbase.master.assignment.AssignmentManager.createMoveRegionProcedure(AssignmentManager.java:767)
    at org.apache.hadoop.hbase.master.assignment.AssignmentManager.createReopenProcedures(AssignmentManager.java:705)
    at org.apache.hadoop.hbase.master.procedure.ModifyTableProcedure.executeFromState(ModifyTableProcedure.java:128)
    at org.apache.hadoop.hbase.master.procedure.ModifyTableProcedure.executeFromState(ModifyTableProcedure.java:50)
    at org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:184)
    at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:850)
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1472)
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1240)
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75)
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1760)
2018-05-15 21:48:30,027 WARN [PEWorker-4] procedure.ModifyTableProcedure: Retriable error trying to mod
[jira] [Commented] (HBASE-20202) [AMv2] Don't move region if its a split parent or offlined
[ https://issues.apache.org/jira/browse/HBASE-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16477995#comment-16477995 ]

stack commented on HBASE-20202:

Thanks for asking [~sergey.soldatov]

bq. What will happen if master crashed during UnassignProcedure execution and region went to close state?

It's a basic tenet of AMv2 that a proc completes; no pre-emption by another (that'll be the province of an hbck2). To answer your question explicitly: if the master crashes during the UnassignProcedure, then the unassign is not finished. The new master will notice this and re-run the step that was doing the close. That is what should happen, sir.

bq. Recently we had a case when new master was trying to recover from the crash and got stuck because the region was in CLOSED state and this check prevented to create the procedure.

Yeah. If the UP is stuck... the Move won't work. Fail fast is better, no?

> [AMv2] Don't move region if its a split parent or offlined
> --
>
> Key: HBASE-20202
> URL: https://issues.apache.org/jira/browse/HBASE-20202
> Project: HBase
> Issue Type: Sub-task
> Components: amv2
> Affects Versions: 2.0.0-beta-2
> Reporter: stack
> Assignee: stack
> Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HBASE-20202.branch-2.001.patch,
> HBASE-20202.branch-2.002.patch, HBASE-20202.branch-2.003.patch,
> HBASE-20202.branch-2.003.patch
>
>
> Found this one running ITBLLs. We'd just finished splitting a region
> 91655de06786f786b0ee9c51280e1ee6 and then a move for it comes in. The move
> fails in an interesting way. The location has been removed from the
> regionnode kept by the Master. HBASE-20178 adds macro checks on context. Need
> to add a few checks to the likes of MoveRegionProcedure so we don't try to
> move an offlined/split parent.
> {code}
> 2018-03-14 10:21:45,678 INFO [PEWorker-2] procedure2.ProcedureExecutor:
> Finished pid=3177, state=SUCCESS; SplitTableRegionProcedure
> table=IntegrationTestBigLinkedList, parent=91655de06786f786b0ee9c51280e1ee6,
> daughterA=b67bf6b79eaa83de788b0519f782ce8e,
> daughterB=99cf6ddb38cad08e3aa7635b6cac2e7b in 10.0210sec
> 2018-03-14 10:21:45,679 INFO [PEWorker-15]
> procedure.MasterProcedureScheduler: pid=3194, ppid=3193,
> state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure
> table=IntegrationTestBigLinkedList, region=af198ca64b196fb3d2f5b3e815b2dad0,
> server=ve0530.halxg.cloudera.com,16020,1521007509855,
> IntegrationTestBigLinkedList,\xAA\xAA\xAA\xAA\xAA\xAA\xAA\xA0,1521047891276.af198ca64b196fb3d2f5b3e815b2dad0.
> 2018-03-14 10:21:45,680 INFO [PEWorker-5]
> procedure.MasterProcedureScheduler: pid=3187,
> state=RUNNABLE:MOVE_REGION_UNASSIGN; MoveRegionProcedure
> hri=IntegrationTestBigLinkedList,\x0C0\xC3\x0C0\xC3\x0C0,1521045713137.91655de06786f786b0ee9c51280e1ee6.,
> source=ve0530.halxg.cloudera.com,16020,1521007509855,
> destination=ve0528.halxg.cloudera.com,16020,1521047890874,
> IntegrationTestBigLinkedList,\x0C0\xC3\x0C0\xC3\x0C0,1521045713137.91655de06786f786b0ee9c51280e1ee6.
> 2018-03-14 10:21:45,680 INFO [PEWorker-15] assignment.RegionStateStore:
> pid=3194 updating hbase:meta
> row=IntegrationTestBigLinkedList,\xAA\xAA\xAA\xAA\xAA\xAA\xAA\xA0,1521047891276.af198ca64b196fb3d2f5b3e815b2dad0.,
> regionState=CLOSING
> 2018-03-14 10:21:45,680 INFO [PEWorker-5] procedure2.ProcedureExecutor:
> Initialized subprocedures=[{pid=3195, ppid=3187,
> state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure
> table=IntegrationTestBigLinkedList, region=91655de06786f786b0ee9c51280e1ee6,
> server=ve0530.halxg.cloudera.com,16020,1521007509855}]
> 2018-03-14 10:21:45,683 INFO [PEWorker-15]
> assignment.RegionTransitionProcedure: Dispatch pid=3194, ppid=3193,
> state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure
> table=IntegrationTestBigLinkedList, region=af198ca64b196fb3d2f5b3e815b2dad0,
> server=ve0530.halxg.cloudera.com,16020,1521007509855; rit=CLOSING,
> location=ve0530.halxg.cloudera.com,16020,1521007509855
> 2018-03-14 10:21:45,752 INFO [PEWorker-15]
> procedure.MasterProcedureScheduler: pid=3195, ppid=3187,
> state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure
> table=IntegrationTestBigLinkedList, region=91655de06786f786b0ee9c51280e1ee6,
> server=ve0530.halxg.cloudera.com,16020,1521007509855,
> IntegrationTestBigLinkedList,\x0C0\xC3\x0C0\xC3\x0C0,1521045713137.91655de06786f786b0ee9c51280e1ee6.
> 2018-03-14 10:21:45,753 ERROR [PEWorker-15] procedure2.ProcedureExecutor:
> CODE-BUG: Uncaught runtime exception: pid=3195, ppid=3187,
> state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure
> table=IntegrationTestBigLinkedList, region=91655de06786f786b0ee9c51280e1ee6,
> server=ve0530.h
[jira] [Commented] (HBASE-20202) [AMv2] Don't move region if its a split parent or offlined
[ https://issues.apache.org/jira/browse/HBASE-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16477967#comment-16477967 ]

Sergey Soldatov commented on HBASE-20202:

[~stack] In your fix you added a check to the MoveRegionProcedure constructor that the region is open:
{noformat}
public MoveRegionProcedure(final MasterProcedureEnv env, final RegionPlan plan)
    throws HBaseIOException {
  super(env, plan.getRegionInfo());
  this.plan = plan;
  preflightChecks(env, true);
  checkOnline(env, plan.getRegionInfo());
}
{noformat}
What will happen if master crashed during UnassignProcedure execution and region went to close state? Recently we had a case when new master was trying to recover from the crash and got stuck because the region was in CLOSED state and this check prevented to create the procedure.

FYI [~elserj]
[jira] [Commented] (HBASE-20202) [AMv2] Don't move region if its a split parent or offlined
[ https://issues.apache.org/jira/browse/HBASE-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16416670#comment-16416670 ]

Hudson commented on HBASE-20202:

Results for branch HBASE-19064
[build #77 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-19064/77/]: (x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-19064/77//General_Nightly_Build_Report/]

(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-19064/77//JDK8_Nightly_Build_Report_(Hadoop2)/]

(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-19064/77//JDK8_Nightly_Build_Report_(Hadoop3)/]

(/) {color:green}+1 source release artifact{color}
-- See build output for details.
[jira] [Commented] (HBASE-20202) [AMv2] Don't move region if its a split parent or offlined
[ https://issues.apache.org/jira/browse/HBASE-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16403764#comment-16403764 ]

Hadoop QA commented on HBASE-20202:

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} branch-2 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 1s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 27s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 57s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 29s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 45s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s{color} | {color:green} branch-2 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 3m 30s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 12m 58s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}101m 23s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 21s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}134m 17s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:9f2f2db |
| JIRA Issue | HBASE-20202 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12915022/HBASE-20202.branch-2.003.patch |
| Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 0f1e21d9038e 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | branch-2 / 03e7b78260 |
| maven | version: Apache Maven 3.5.3 (3383c37e1f9e9b3bc3df5050c29c8aff9f295297; 2018-02-24T19:49:05Z) |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC3 |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/12007/testReport/ |
| Max. process+thread count | 5068 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Bu
[jira] [Commented] (HBASE-20202) [AMv2] Don't move region if its a split parent or offlined
[ https://issues.apache.org/jira/browse/HBASE-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16403703#comment-16403703 ]

stack commented on HBASE-20202:

Retry. Test passes locally.

> [AMv2] Don't move region if its a split parent or offlined
> --
>
> Key: HBASE-20202
> URL: https://issues.apache.org/jira/browse/HBASE-20202
> Project: HBase
> Issue Type: Sub-task
> Components: amv2
> Affects Versions: 2.0.0-beta-2
> Reporter: stack
> Assignee: stack
> Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HBASE-20202.branch-2.001.patch,
> HBASE-20202.branch-2.002.patch, HBASE-20202.branch-2.003.patch,
> HBASE-20202.branch-2.003.patch
>
>
> Found this one running ITBLLs. We'd just finished splitting a region
> 91655de06786f786b0ee9c51280e1ee6 and then a move for it comes in. The move
> fails in an interesting way. The location has been removed from the
> regionnode kept by the Master. HBASE-20178 adds macro checks on context. Need
> to add a few checks to the likes of MoveRegionProcedure so we don't try to
> move an offlined/split parent.
> {code}
> 2018-03-14 10:21:45,678 INFO [PEWorker-2] procedure2.ProcedureExecutor:
> Finished pid=3177, state=SUCCESS; SplitTableRegionProcedure
> table=IntegrationTestBigLinkedList, parent=91655de06786f786b0ee9c51280e1ee6,
> daughterA=b67bf6b79eaa83de788b0519f782ce8e,
> daughterB=99cf6ddb38cad08e3aa7635b6cac2e7b in 10.0210sec
> 2018-03-14 10:21:45,679 INFO [PEWorker-15]
> procedure.MasterProcedureScheduler: pid=3194, ppid=3193,
> state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure
> table=IntegrationTestBigLinkedList, region=af198ca64b196fb3d2f5b3e815b2dad0,
> server=ve0530.halxg.cloudera.com,16020,1521007509855,
> IntegrationTestBigLinkedList,\xAA\xAA\xAA\xAA\xAA\xAA\xAA\xA0,1521047891276.af198ca64b196fb3d2f5b3e815b2dad0.
> 2018-03-14 10:21:45,680 INFO [PEWorker-5]
> procedure.MasterProcedureScheduler: pid=3187,
> state=RUNNABLE:MOVE_REGION_UNASSIGN; MoveRegionProcedure
> hri=IntegrationTestBigLinkedList,\x0C0\xC3\x0C0\xC3\x0C0,1521045713137.91655de06786f786b0ee9c51280e1ee6.,
> source=ve0530.halxg.cloudera.com,16020,1521007509855,
> destination=ve0528.halxg.cloudera.com,16020,1521047890874,
> IntegrationTestBigLinkedList,\x0C0\xC3\x0C0\xC3\x0C0,1521045713137.91655de06786f786b0ee9c51280e1ee6.
> 2018-03-14 10:21:45,680 INFO [PEWorker-15] assignment.RegionStateStore:
> pid=3194 updating hbase:meta
> row=IntegrationTestBigLinkedList,\xAA\xAA\xAA\xAA\xAA\xAA\xAA\xA0,1521047891276.af198ca64b196fb3d2f5b3e815b2dad0.,
> regionState=CLOSING
> 2018-03-14 10:21:45,680 INFO [PEWorker-5] procedure2.ProcedureExecutor:
> Initialized subprocedures=[{pid=3195, ppid=3187,
> state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure
> table=IntegrationTestBigLinkedList, region=91655de06786f786b0ee9c51280e1ee6,
> server=ve0530.halxg.cloudera.com,16020,1521007509855}]
> 2018-03-14 10:21:45,683 INFO [PEWorker-15]
> assignment.RegionTransitionProcedure: Dispatch pid=3194, ppid=3193,
> state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure
> table=IntegrationTestBigLinkedList, region=af198ca64b196fb3d2f5b3e815b2dad0,
> server=ve0530.halxg.cloudera.com,16020,1521007509855; rit=CLOSING,
> location=ve0530.halxg.cloudera.com,16020,1521007509855
> 2018-03-14 10:21:45,752 INFO [PEWorker-15]
> procedure.MasterProcedureScheduler: pid=3195, ppid=3187,
> state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure
> table=IntegrationTestBigLinkedList, region=91655de06786f786b0ee9c51280e1ee6,
> server=ve0530.halxg.cloudera.com,16020,1521007509855,
> IntegrationTestBigLinkedList,\x0C0\xC3\x0C0\xC3\x0C0,1521045713137.91655de06786f786b0ee9c51280e1ee6.
> 2018-03-14 10:21:45,753 ERROR [PEWorker-15] procedure2.ProcedureExecutor:
> CODE-BUG: Uncaught runtime exception: pid=3195, ppid=3187,
> state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure
> table=IntegrationTestBigLinkedList, region=91655de06786f786b0ee9c51280e1ee6,
> server=ve0530.halxg.cloudera.com,16020,1521007509855
> java.lang.NullPointerException
> at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:936)
> at org.apache.hadoop.hbase.master.assignment.RegionStates.getOrCreateServer(RegionStates.java:934)
> at org.apache.hadoop.hbase.master.assignment.RegionStates.addRegionToServer(RegionStates.java:962)
> at org.apache.hadoop.hbase.master.assignment.AssignmentManager.
[jira] [Commented] (HBASE-20202) [AMv2] Don't move region if its a split parent or offlined
[ https://issues.apache.org/jira/browse/HBASE-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16403698#comment-16403698 ]

Hadoop QA commented on HBASE-20202:

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} branch-2 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 1s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 28s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 57s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 29s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 50s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s{color} | {color:green} branch-2 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 3m 35s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 13m 7s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}103m 33s{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 22s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}136m 47s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.replication.TestReplicationDroppedTables |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:9f2f2db |
| JIRA Issue | HBASE-20202 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12915017/HBASE-20202.branch-2.003.patch |
| Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 352f14884291 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | branch-2 / 03e7b78260 |
| maven | version: Apache Maven 3.5.3 (3383c37e1f9e9b3bc3df5050c29c8aff9f295297; 2018-02-24T19:49:05Z) |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC3 |
| unit | https://builds.apache.org/job/PreCommit-HBASE-Build/12006/artifact/patchprocess/patch-unit-hbase-server.txt |
| Test Results | https://builds.apache.org
[jira] [Commented] (HBASE-20202) [AMv2] Don't move region if its a split parent or offlined
[ https://issues.apache.org/jira/browse/HBASE-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16403546#comment-16403546 ]

stack commented on HBASE-20202:
---

.003 is an addendum. The prepare step is not being called at all. Fix.

TODO: Add test failing the prepare and play with setting lock for life of procedure.

> [AMv2] Don't move region if its a split parent or offlined
> --
>
>                 Key: HBASE-20202
>                 URL: https://issues.apache.org/jira/browse/HBASE-20202
>             Project: HBase
>          Issue Type: Sub-task
>          Components: amv2
>    Affects Versions: 2.0.0-beta-2
>            Reporter: stack
>            Assignee: stack
>            Priority: Critical
>             Fix For: 2.0.0
>
>         Attachments: HBASE-20202.branch-2.001.patch, HBASE-20202.branch-2.002.patch, HBASE-20202.branch-2.003.patch
>
>
> Found this one running ITBLLs. We'd just finished splitting a region
> 91655de06786f786b0ee9c51280e1ee6 and then a move for it comes in. The move
> fails in an interesting way. The location has been removed from the
> regionnode kept by the Master. HBASE-20178 adds macro checks on context. Need
> to add a few checks to the likes of MoveRegionProcedure so we don't try to
> move an offlined/split parent.
> {code}
> 2018-03-14 10:21:45,678 INFO [PEWorker-2] procedure2.ProcedureExecutor: Finished pid=3177, state=SUCCESS; SplitTableRegionProcedure table=IntegrationTestBigLinkedList, parent=91655de06786f786b0ee9c51280e1ee6, daughterA=b67bf6b79eaa83de788b0519f782ce8e, daughterB=99cf6ddb38cad08e3aa7635b6cac2e7b in 10.0210sec
> 2018-03-14 10:21:45,679 INFO [PEWorker-15] procedure.MasterProcedureScheduler: pid=3194, ppid=3193, state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure table=IntegrationTestBigLinkedList, region=af198ca64b196fb3d2f5b3e815b2dad0, server=ve0530.halxg.cloudera.com,16020,1521007509855, IntegrationTestBigLinkedList,\xAA\xAA\xAA\xAA\xAA\xAA\xAA\xA0,1521047891276.af198ca64b196fb3d2f5b3e815b2dad0.
> 2018-03-14 10:21:45,680 INFO [PEWorker-5] procedure.MasterProcedureScheduler: pid=3187, state=RUNNABLE:MOVE_REGION_UNASSIGN; MoveRegionProcedure hri=IntegrationTestBigLinkedList,\x0C0\xC3\x0C0\xC3\x0C0,1521045713137.91655de06786f786b0ee9c51280e1ee6., source=ve0530.halxg.cloudera.com,16020,1521007509855, destination=ve0528.halxg.cloudera.com,16020,1521047890874, IntegrationTestBigLinkedList,\x0C0\xC3\x0C0\xC3\x0C0,1521045713137.91655de06786f786b0ee9c51280e1ee6.
> 2018-03-14 10:21:45,680 INFO [PEWorker-15] assignment.RegionStateStore: pid=3194 updating hbase:meta row=IntegrationTestBigLinkedList,\xAA\xAA\xAA\xAA\xAA\xAA\xAA\xA0,1521047891276.af198ca64b196fb3d2f5b3e815b2dad0., regionState=CLOSING
> 2018-03-14 10:21:45,680 INFO [PEWorker-5] procedure2.ProcedureExecutor: Initialized subprocedures=[{pid=3195, ppid=3187, state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure table=IntegrationTestBigLinkedList, region=91655de06786f786b0ee9c51280e1ee6, server=ve0530.halxg.cloudera.com,16020,1521007509855}]
> 2018-03-14 10:21:45,683 INFO [PEWorker-15] assignment.RegionTransitionProcedure: Dispatch pid=3194, ppid=3193, state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure table=IntegrationTestBigLinkedList, region=af198ca64b196fb3d2f5b3e815b2dad0, server=ve0530.halxg.cloudera.com,16020,1521007509855; rit=CLOSING, location=ve0530.halxg.cloudera.com,16020,1521007509855
> 2018-03-14 10:21:45,752 INFO [PEWorker-15] procedure.MasterProcedureScheduler: pid=3195, ppid=3187, state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure table=IntegrationTestBigLinkedList, region=91655de06786f786b0ee9c51280e1ee6, server=ve0530.halxg.cloudera.com,16020,1521007509855, IntegrationTestBigLinkedList,\x0C0\xC3\x0C0\xC3\x0C0,1521045713137.91655de06786f786b0ee9c51280e1ee6.
> 2018-03-14 10:21:45,753 ERROR [PEWorker-15] procedure2.ProcedureExecutor: CODE-BUG: Uncaught runtime exception: pid=3195, ppid=3187, state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure table=IntegrationTestBigLinkedList, region=91655de06786f786b0ee9c51280e1ee6, server=ve0530.halxg.cloudera.com,16020,1521007509855
> java.lang.NullPointerException
>         at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:936)
>         at org.apache.hadoop.hbase.master.assignment.RegionStates.getOrCreateServer(RegionStates.java:934)
>         at org.apache.hadoop.hbase.master.assignment.RegionStates.addRegionToServer
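The issue description above asks for checks so that a move is refused for an offlined or split-parent region before any unassign is scheduled. A minimal sketch of that kind of fail-fast guard follows. It is not the patch: `RegionView` and `refuseMoveReason` are hypothetical stand-ins invented for illustration and are not HBase types or methods.

```java
import java.util.Optional;

public class MovePrecheckSketch {

    /** Hypothetical stand-in for the Master's view of a region; not an HBase class. */
    static final class RegionView {
        final String encodedName;
        final boolean offline;
        final boolean splitParent;
        final Optional<String> location; // empty when the Master has dropped the location

        RegionView(String encodedName, boolean offline, boolean splitParent, String location) {
            this.encodedName = encodedName;
            this.offline = offline;
            this.splitParent = splitParent;
            this.location = Optional.ofNullable(location);
        }
    }

    /** Returns the reason a move must be refused, or empty if the move may proceed. */
    static Optional<String> refuseMoveReason(RegionView r) {
        if (r.splitParent) {
            return Optional.of(r.encodedName + " is a split parent");
        }
        if (r.offline) {
            return Optional.of(r.encodedName + " is offline");
        }
        if (!r.location.isPresent()) {
            // The case from the log above: no location left to unassign from.
            return Optional.of(r.encodedName + " has no location");
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        RegionView parent = new RegionView("91655de06786f786b0ee9c51280e1ee6", true, true, null);
        RegionView live = new RegionView("af198ca64b196fb3d2f5b3e815b2dad0", false, false,
            "ve0530.halxg.cloudera.com,16020,1521007509855");
        System.out.println(refuseMoveReason(parent).orElse("ok to move"));
        System.out.println(refuseMoveReason(live).orElse("ok to move"));
    }
}
```

Running the check up front means the caller sees a clear refusal rather than the NullPointerException that surfaces later in the unassign sub-procedure.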
[jira] [Commented] (HBASE-20202) [AMv2] Don't move region if its a split parent or offlined
[ https://issues.apache.org/jira/browse/HBASE-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16403483#comment-16403483 ]

stack commented on HBASE-20202:
---

Thanks for taking a look [~Apache9]

bq. And since we do not set holdLock to true for MoveRegionProcedure

This is wrong. Let me fix.

I added prepare because that's easy to reason about come rollback time. In general our steps are too macro (e.g. HBASE-20103). Filed new issue.
[jira] [Commented] (HBASE-20202) [AMv2] Don't move region if its a split parent or offlined
[ https://issues.apache.org/jira/browse/HBASE-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16403429#comment-16403429 ]

Hudson commented on HBASE-20202:
---

Results for branch master [build #264 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/264/]: (x) *-1 overall*

details (if available):

(/) +1 general checks
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/master/264//General_Nightly_Build_Report/]

(x) -1 jdk8 hadoop2 checks
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/master/264//JDK8_Nightly_Build_Report_(Hadoop2)/]

(x) -1 jdk8 hadoop3 checks
-- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/master/264//JDK8_Nightly_Build_Report_(Hadoop3)/]

(/) +1 source release artifact
-- See build output for details.
[jira] [Commented] (HBASE-20202) [AMv2] Don't move region if its a split parent or offlined
[ https://issues.apache.org/jira/browse/HBASE-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16403421#comment-16403421 ]

Duo Zhang commented on HBASE-20202:
---

Sorry, a bit late, but is it necessary to have an extra prepare state? Isn't checking it in the MOVE_REGION_UNASSIGN state enough?

And since we do not set holdLock to true for MoveRegionProcedure, could there be holes between the execution of different states where other procedures may grab the region lock? We schedule a sub procedure, and the parent procedure will be suspended and release the region lock.

Thanks.
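The holdLock concern raised in the comment above can be illustrated with a toy, single-threaded model. This is only a sketch under stated assumptions: `RegionLock` and `intruderGotLock` are hypothetical names made up for the example, and the real procedure scheduler is considerably more involved than a single owner field.

```java
public class HoldLockSketch {

    /** Toy region lock with at most one owner at a time; not an HBase class. */
    static final class RegionLock {
        private String owner;

        boolean tryAcquire(String who) {
            if (owner == null || owner.equals(who)) {
                owner = who;
                return true;
            }
            return false;
        }

        void release(String who) {
            if (who.equals(owner)) {
                owner = null;
            }
        }
    }

    /**
     * Runs a two-state "move" and reports whether another procedure managed to
     * take the region lock in the gap between the two states.
     */
    static boolean intruderGotLock(boolean holdLock) {
        RegionLock lock = new RegionLock();

        lock.tryAcquire("move");            // state 1 runs under the lock
        if (!holdLock) {
            lock.release("move");           // parent suspends while the child runs
        }

        boolean intruded = lock.tryAcquire("other"); // a competing procedure shows up
        if (intruded) {
            lock.release("other");
        }

        lock.tryAcquire("move");            // state 2 resumes
        lock.release("move");
        return intruded;
    }

    public static void main(String[] args) {
        System.out.println("holdLock=false -> intruder acquired lock: " + intruderGotLock(false));
        System.out.println("holdLock=true  -> intruder acquired lock: " + intruderGotLock(true));
    }
}
```

In this model the gap only exists when the lock is released between states, which is the hole the comment asks about; keeping the lock for the life of the procedure closes it at the cost of blocking other work on the region for longer.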
[jira] [Commented] (HBASE-20202) [AMv2] Don't move region if its a split parent or offlined
[ https://issues.apache.org/jira/browse/HBASE-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16403096#comment-16403096 ]

Hudson commented on HBASE-20202:
---

Results for branch branch-2 [build #493 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/493/]: (x) *-1 overall*

details (if available):

(/) +1 general checks
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/493//General_Nightly_Build_Report/]

(/) +1 jdk8 hadoop2 checks
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/493//JDK8_Nightly_Build_Report_(Hadoop2)/]

(/) +1 jdk8 hadoop3 checks
-- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/493//JDK8_Nightly_Build_Report_(Hadoop3)/]

(/) +1 source release artifact
-- See build output for details.
[jira] [Commented] (HBASE-20202) [AMv2] Don't move region if its a split parent or offlined
[ https://issues.apache.org/jira/browse/HBASE-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16402541#comment-16402541 ]

Hudson commented on HBASE-20202:
---

Results for branch branch-2.0 [build #47 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/47/]: (x) *-1 overall*

details (if available):

(/) +1 general checks
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/47//General_Nightly_Build_Report/]

(/) +1 jdk8 hadoop2 checks
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/47//JDK8_Nightly_Build_Report_(Hadoop2)/]

(x) -1 jdk8 hadoop3 checks
-- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/47//JDK8_Nightly_Build_Report_(Hadoop3)/]

(/) +1 source release artifact
-- See build output for details.
[jira] [Commented] (HBASE-20202) [AMv2] Don't move region if its a split parent or offlined
[ https://issues.apache.org/jira/browse/HBASE-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16401558#comment-16401558 ]

Hadoop QA commented on HBASE-20202:
---

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 32s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
|| || || || branch-2 Compile Tests ||
| 0 | mvndep | 0m 18s | Maven dependency ordering for branch |
| +1 | mvninstall | 3m 30s | branch-2 passed |
| +1 | compile | 12m 2s | branch-2 passed |
| +1 | checkstyle | 1m 52s | branch-2 passed |
| +1 | shadedjars | 5m 54s | branch has no errors when building our shaded downstream artifacts. |
| +1 | findbugs | 4m 47s | branch-2 passed |
| +1 | javadoc | 0m 58s | branch-2 passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 12s | Maven dependency ordering for patch |
| +1 | mvninstall | 3m 24s | the patch passed |
| +1 | compile | 12m 9s | the patch passed |
| +1 | cc | 12m 9s | the patch passed |
| +1 | javac | 12m 9s | the patch passed |
| -1 | checkstyle | 1m 13s | hbase-server: The patch generated 1 new + 209 unchanged - 0 fixed = 210 total (was 209) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 4m 8s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 15m 19s | Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. |
| +1 | hbaseprotoc | 1m 17s | the patch passed |
| +1 | findbugs | 5m 42s | the patch passed |
| +1 | javadoc | 1m 5s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 0m 30s | hbase-protocol-shaded in the patch passed. |
| +1 | unit | 3m 4s | hbase-client in the patch passed. |
| +1 | unit | 123m 36s | hbase-server in the patch passed. |
| +1 | asflicense | 1m 16s | The patch does not generate ASF License warnings. |
| | | 198m 24s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:9f2f2db |
| JIRA Issue | HBASE-20202 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12914815/HBASE-20202.branch-2.002.patch |
| Optional Tests | asflicense javac javadoc unit f
[jira] [Commented] (HBASE-20202) [AMv2] Don't move region if its a split parent or offlined
[ https://issues.apache.org/jira/browse/HBASE-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16401448#comment-16401448 ]

stack commented on HBASE-20202:
---

.002 Fix tests that were not expecting the new fast-fail

> [AMv2] Don't move region if its a split parent or offlined
> --
>
> Key: HBASE-20202
> URL: https://issues.apache.org/jira/browse/HBASE-20202
> Project: HBase
> Issue Type: Sub-task
> Components: amv2
> Affects Versions: 2.0.0-beta-2
> Reporter: stack
> Assignee: stack
> Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HBASE-20202.branch-2.001.patch,
> HBASE-20202.branch-2.002.patch
>
>
> Found this one running ITBLLs. We'd just finished splitting a region
> 91655de06786f786b0ee9c51280e1ee6 and then a move for it comes in. The move
> fails in an interesting way. The location has been removed from the
> regionnode kept by the Master. HBASE-20178 adds macro checks on context. Need
> to add a few checks to the likes of MoveRegionProcedure so we don't try to
> move an offlined/split parent.
> {code}
> 2018-03-14 10:21:45,678 INFO [PEWorker-2] procedure2.ProcedureExecutor:
> Finished pid=3177, state=SUCCESS; SplitTableRegionProcedure
> table=IntegrationTestBigLinkedList, parent=91655de06786f786b0ee9c51280e1ee6,
> daughterA=b67bf6b79eaa83de788b0519f782ce8e,
> daughterB=99cf6ddb38cad08e3aa7635b6cac2e7b in 10.0210sec
> 2018-03-14 10:21:45,679 INFO [PEWorker-15]
> procedure.MasterProcedureScheduler: pid=3194, ppid=3193,
> state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure
> table=IntegrationTestBigLinkedList, region=af198ca64b196fb3d2f5b3e815b2dad0,
> server=ve0530.halxg.cloudera.com,16020,1521007509855,
> IntegrationTestBigLinkedList,\xAA\xAA\xAA\xAA\xAA\xAA\xAA\xA0,1521047891276.af198ca64b196fb3d2f5b3e815b2dad0.
> 2018-03-14 10:21:45,680 INFO [PEWorker-5]
> procedure.MasterProcedureScheduler: pid=3187,
> state=RUNNABLE:MOVE_REGION_UNASSIGN; MoveRegionProcedure
> hri=IntegrationTestBigLinkedList,\x0C0\xC3\x0C0\xC3\x0C0,1521045713137.91655de06786f786b0ee9c51280e1ee6.,
> source=ve0530.halxg.cloudera.com,16020,1521007509855,
> destination=ve0528.halxg.cloudera.com,16020,1521047890874,
> IntegrationTestBigLinkedList,\x0C0\xC3\x0C0\xC3\x0C0,1521045713137.91655de06786f786b0ee9c51280e1ee6.
> 2018-03-14 10:21:45,680 INFO [PEWorker-15] assignment.RegionStateStore:
> pid=3194 updating hbase:meta
> row=IntegrationTestBigLinkedList,\xAA\xAA\xAA\xAA\xAA\xAA\xAA\xA0,1521047891276.af198ca64b196fb3d2f5b3e815b2dad0.,
> regionState=CLOSING
> 2018-03-14 10:21:45,680 INFO [PEWorker-5] procedure2.ProcedureExecutor:
> Initialized subprocedures=[{pid=3195, ppid=3187,
> state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure
> table=IntegrationTestBigLinkedList, region=91655de06786f786b0ee9c51280e1ee6,
> server=ve0530.halxg.cloudera.com,16020,1521007509855}]
> 2018-03-14 10:21:45,683 INFO [PEWorker-15]
> assignment.RegionTransitionProcedure: Dispatch pid=3194, ppid=3193,
> state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure
> table=IntegrationTestBigLinkedList, region=af198ca64b196fb3d2f5b3e815b2dad0,
> server=ve0530.halxg.cloudera.com,16020,1521007509855; rit=CLOSING,
> location=ve0530.halxg.cloudera.com,16020,1521007509855
> 2018-03-14 10:21:45,752 INFO [PEWorker-15]
> procedure.MasterProcedureScheduler: pid=3195, ppid=3187,
> state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure
> table=IntegrationTestBigLinkedList, region=91655de06786f786b0ee9c51280e1ee6,
> server=ve0530.halxg.cloudera.com,16020,1521007509855,
> IntegrationTestBigLinkedList,\x0C0\xC3\x0C0\xC3\x0C0,1521045713137.91655de06786f786b0ee9c51280e1ee6.
> 2018-03-14 10:21:45,753 ERROR [PEWorker-15] procedure2.ProcedureExecutor:
> CODE-BUG: Uncaught runtime exception: pid=3195, ppid=3187,
> state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure
> table=IntegrationTestBigLinkedList, region=91655de06786f786b0ee9c51280e1ee6,
> server=ve0530.halxg.cloudera.com,16020,1521007509855
> java.lang.NullPointerException
>     at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:936)
>     at org.apache.hadoop.hbase.master.assignment.RegionStates.getOrCreateServer(RegionStates.java:934)
>     at org.apache.hadoop.hbase.master.assignment.RegionStates.addRegionToServer(RegionStates.java:962)
>     at org.apache.hadoop.hbase.master.assignment.AssignmentManager.markRegionAsClosing(AssignmentManager
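
The top frame of the trace is `ConcurrentHashMap.get`, which throws `NullPointerException` when handed a null key. The following is a minimal sketch of that failure and of the kind of up-front check the description asks for, assuming the region's location was already null when the lookup ran (the description says the location had been removed). Every class and method name below is hypothetical; this is not HBase code, and `IllegalStateException` merely stands in for whatever exception the real check throws.

```java
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative only: none of these names are HBase classes or methods. */
final class NullLocationSketch {

  /** Hypothetical per-server bookkeeping, keyed by server name. */
  private final ConcurrentHashMap<String, Integer> regionCountByServer =
      new ConcurrentHashMap<>();

  /** Unguarded lookup: ConcurrentHashMap.get rejects a null key with an NPE. */
  Integer unguardedLookup(String serverName) {
    return regionCountByServer.get(serverName);
  }

  /** Guarded lookup: fail fast with a descriptive error before touching the map. */
  Integer guardedLookup(String regionName, String serverName) {
    if (serverName == null) {
      // Assumption: a region with no location is offlined or a split parent.
      throw new IllegalStateException(regionName + " has no location; refusing to move it");
    }
    return regionCountByServer.get(serverName);
  }

  /** Runs the action and reports the simple name of any runtime exception it throws. */
  static String describeFailure(Runnable action) {
    try {
      action.run();
      return "no failure";
    } catch (RuntimeException e) {
      return e.getClass().getSimpleName();
    }
  }

  public static void main(String[] args) {
    NullLocationSketch sketch = new NullLocationSketch();
    // prints "NullPointerException"
    System.out.println(describeFailure(() -> sketch.unguardedLookup(null)));
    // prints "IllegalStateException"
    System.out.println(describeFailure(
        () -> sketch.guardedLookup("91655de06786f786b0ee9c51280e1ee6", null)));
  }
}
```

The guarded variant fails with a message that names the region, instead of surfacing as an uncaught runtime exception deep inside the map lookup.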
[jira] [Commented] (HBASE-20202) [AMv2] Don't move region if its a split parent or offlined
[ https://issues.apache.org/jira/browse/HBASE-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16401309#comment-16401309 ]

Hadoop QA commented on HBASE-20202:
---

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 24s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || branch-2 Compile Tests ||
| 0 | mvndep | 1m 4s | Maven dependency ordering for branch |
| +1 | mvninstall | 5m 11s | branch-2 passed |
| +1 | compile | 9m 57s | branch-2 passed |
| +1 | checkstyle | 1m 38s | branch-2 passed |
| +1 | shadedjars | 5m 11s | branch has no errors when building our shaded downstream artifacts. |
| +1 | findbugs | 4m 30s | branch-2 passed |
| +1 | javadoc | 1m 1s | branch-2 passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 14s | Maven dependency ordering for patch |
| +1 | mvninstall | 3m 4s | the patch passed |
| +1 | compile | 9m 59s | the patch passed |
| +1 | cc | 9m 59s | the patch passed |
| +1 | javac | 9m 59s | the patch passed |
| -1 | checkstyle | 1m 0s | hbase-server: The patch generated 4 new + 183 unchanged - 0 fixed = 187 total (was 183) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 3m 32s | patch has no errors when building our shaded downstream artifacts. |
| -1 | hadoopcheck | 4m 7s | The patch causes 10 errors with Hadoop v2.6.5. |
| -1 | hadoopcheck | 4m 40s | The patch causes 10 errors with Hadoop v2.7.4. |
| -1 | hadoopcheck | 5m 15s | The patch causes 10 errors with Hadoop v3.0.0. |
| +1 | hbaseprotoc | 1m 9s | the patch passed |
| +1 | findbugs | 4m 55s | the patch passed |
| +1 | javadoc | 0m 56s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 0m 25s | hbase-protocol-shaded in the patch passed. |
| +1 | unit | 3m 8s | hbase-client in the patch passed. |
| -1 | unit | 160m 5s | hbase-server in the patch failed. |
| +1 | asflicense | 0m 58s | The patch does not generate ASF License warnings. |
| | | 219m 55s | |
\\
\\
|| Reason || Tests ||
[jira] [Commented] (HBASE-20202) [AMv2] Don't move region if its a split parent or offlined
[ https://issues.apache.org/jira/browse/HBASE-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16401073#comment-16401073 ]

stack commented on HBASE-20202:
---

HBASE-20178 added fast-fail to the construction of Procedures: they fail on construction if the table is disabled or if the server or cluster is going down. This issue follows on. It adds region-level checks to the constructors of region Procedures. No point doing a move if a region is offlined (e.g. the log sample from above).

Also added some recheck of context to the prepare steps of procedures. Move didn't have a prepare step. When prepare runs, the region is locked and owned by this Procedure. At the prepare step for Move, we were not checking that the Region was online.

> [AMv2] Don't move region if its a split parent or offlined
> --
>
> Key: HBASE-20202
> URL: https://issues.apache.org/jira/browse/HBASE-20202
> Project: HBase
> Issue Type: Sub-task
> Components: amv2
> Affects Versions: 2.0.0-beta-2
> Reporter: stack
> Assignee: stack
> Priority: Major
>
> Found this one running ITBLLs. We'd just finished splitting a region
> 91655de06786f786b0ee9c51280e1ee6 and then a move for it comes in. The move
> fails in an interesting way. The location has been removed from the
> regionnode kept by the Master. HBASE-20178 adds macro checks on context. Need
> to add a few checks to the likes of MoveRegionProcedure so we don't try to
> move an offlined/split parent.
> {code}
> 2018-03-14 10:21:45,678 INFO [PEWorker-2] procedure2.ProcedureExecutor:
> Finished pid=3177, state=SUCCESS; SplitTableRegionProcedure
> table=IntegrationTestBigLinkedList, parent=91655de06786f786b0ee9c51280e1ee6,
> daughterA=b67bf6b79eaa83de788b0519f782ce8e,
> daughterB=99cf6ddb38cad08e3aa7635b6cac2e7b in 10.0210sec
> 2018-03-14 10:21:45,679 INFO [PEWorker-15]
> procedure.MasterProcedureScheduler: pid=3194, ppid=3193,
> state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure
> table=IntegrationTestBigLinkedList, region=af198ca64b196fb3d2f5b3e815b2dad0,
> server=ve0530.halxg.cloudera.com,16020,1521007509855,
> IntegrationTestBigLinkedList,\xAA\xAA\xAA\xAA\xAA\xAA\xAA\xA0,1521047891276.af198ca64b196fb3d2f5b3e815b2dad0.
> 2018-03-14 10:21:45,680 INFO [PEWorker-5]
> procedure.MasterProcedureScheduler: pid=3187,
> state=RUNNABLE:MOVE_REGION_UNASSIGN; MoveRegionProcedure
> hri=IntegrationTestBigLinkedList,\x0C0\xC3\x0C0\xC3\x0C0,1521045713137.91655de06786f786b0ee9c51280e1ee6.,
> source=ve0530.halxg.cloudera.com,16020,1521007509855,
> destination=ve0528.halxg.cloudera.com,16020,1521047890874,
> IntegrationTestBigLinkedList,\x0C0\xC3\x0C0\xC3\x0C0,1521045713137.91655de06786f786b0ee9c51280e1ee6.
> 2018-03-14 10:21:45,680 INFO [PEWorker-15] assignment.RegionStateStore:
> pid=3194 updating hbase:meta
> row=IntegrationTestBigLinkedList,\xAA\xAA\xAA\xAA\xAA\xAA\xAA\xA0,1521047891276.af198ca64b196fb3d2f5b3e815b2dad0.,
> regionState=CLOSING
> 2018-03-14 10:21:45,680 INFO [PEWorker-5] procedure2.ProcedureExecutor:
> Initialized subprocedures=[{pid=3195, ppid=3187,
> state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure
> table=IntegrationTestBigLinkedList, region=91655de06786f786b0ee9c51280e1ee6,
> server=ve0530.halxg.cloudera.com,16020,1521007509855}]
> 2018-03-14 10:21:45,683 INFO [PEWorker-15]
> assignment.RegionTransitionProcedure: Dispatch pid=3194, ppid=3193,
> state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure
> table=IntegrationTestBigLinkedList, region=af198ca64b196fb3d2f5b3e815b2dad0,
> server=ve0530.halxg.cloudera.com,16020,1521007509855; rit=CLOSING,
> location=ve0530.halxg.cloudera.com,16020,1521007509855
> 2018-03-14 10:21:45,752 INFO [PEWorker-15]
> procedure.MasterProcedureScheduler: pid=3195, ppid=3187,
> state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure
> table=IntegrationTestBigLinkedList, region=91655de06786f786b0ee9c51280e1ee6,
> server=ve0530.halxg.cloudera.com,16020,1521007509855,
> IntegrationTestBigLinkedList,\x0C0\xC3\x0C0\xC3\x0C0,1521045713137.91655de06786f786b0ee9c51280e1ee6.
> 2018-03-14 10:21:45,753 ERROR [PEWorker-15] procedure2.ProcedureExecutor:
> CODE-BUG: Uncaught runtime exception: pid=3195, ppid=3187,
> state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure
> table=IntegrationTestBigLinkedList, region=91655de06786f786b0ee9c51280e1ee6,
> server=ve0530.halxg.cloudera.com,16020,1521007509855
> java.lang.NullPointerException
>
>
>
>