[jira] [Commented] (HBASE-20015) TestMergeTableRegionsProcedure and TestRegionMergeTransactionOnCluster flakey
[ https://issues.apache.org/jira/browse/HBASE-20015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16368718#comment-16368718 ]

stack commented on HBASE-20015:
-------------------------------

Hmm. These have dropped off the top list in the flakies dashboard but are still in the bottom set, as though the patch were not in place... Looking.

> TestMergeTableRegionsProcedure and TestRegionMergeTransactionOnCluster flakey
> -----------------------------------------------------------------------------
>
> Key: HBASE-20015
> URL: https://issues.apache.org/jira/browse/HBASE-20015
> Project: HBase
> Issue Type: Sub-task
> Components: flakey
> Reporter: stack
> Assignee: stack
> Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-20015.branch-2.001.patch
>
>
> MergeTableRegionsProcedure seems incomplete. The ProcedureExecutor framework can
> run in a test mode in which it kills the Procedure before it can persist
> state, and it does this repeatedly to shake out areas where Procedures may not
> be preserving all needed state at each procedural step. The kill causes
> the Procedure to 'fail', and the rollback procedure then runs.
> MergeTableRegionsProcedure is not able to roll back the last few steps of a merge.
> It throws an UnsupportedOperationException (the hope was that the missing steps
> would be filled in, but they are hard to complete in that they are themselves
> stepped).
>
> It turns out that Split has a mechanism whereby it will not fail the
> Procedure if it gets to a stage from which it cannot roll back. Instead, it will
> retry, and keep retrying, until it eventually succeeds. Merge has this
> facility half-implemented. Merge tests are therefore flakey.
> They do stuff like this:
> {code}
> 2018-02-17 04:04:02,999 WARN [PEWorker-1] assignment.MergeTableRegionsProcedure(311): Failed rollback attempt step MERGE_TABLE_REGIONS_UPDATE_META for merging the regions [485dd0c2a5d14601d61fed791f793158, 8af34a614f064c162ab1d05eac7fca4c] in table testRollbackAndDoubleExecution
> java.lang.UnsupportedOperationException: pid=44, state=FAILED:MERGE_TABLE_REGIONS_UPDATE_META, exception=org.apache.hadoop.hbase.procedure2.ProcedureAbortedException via MergeTableRegionsProcedure:org.apache.hadoop.hbase.procedure2.ProcedureAbortedException: abort requested; MergeTableRegionsProcedure table=testRollbackAndDoubleExecution, regions=[485dd0c2a5d14601d61fed791f793158, 8af34a614f064c162ab1d05eac7fca4c], forcibly=false unhandled state=MERGE_TABLE_REGIONS_UPDATE_META
>   at org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.rollbackState(MergeTableRegionsProcedure.java:291)
>   at org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.rollbackState(MergeTableRegionsProcedure.java:78)
>   at org.apache.hadoop.hbase.procedure2.StateMachineProcedure.rollback(StateMachineProcedure.java:199)
>   at org.apache.hadoop.hbase.procedure2.Procedure.doRollback(Procedure.java:859)
>   at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1356)
>   at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1312)
>   at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1181)
>   at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:78)
>   at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1734)
> 2018-02-17 04:04:03,007 ERROR [PEWorker-1] helpers.MarkerIgnoringBase(159): CODE-BUG: Uncaught runtime exception for pid=44, state=FAILED:MERGE_TABLE_REGIONS_UPDATE_META, exception=org.apache.hadoop.hbase.procedure2.ProcedureAbortedException via MergeTableRegionsProcedure:org.apache.hadoop.hbase.procedure2.ProcedureAbortedException: abort requested; MergeTableRegionsProcedure table=testRollbackAndDoubleExecution, regions=[485dd0c2a5d14601d61fed791f793158, 8af34a614f064c162ab1d05eac7fca4c], forcibly=false
> java.lang.UnsupportedOperationException: pid=44, state=FAILED:MERGE_TABLE_REGIONS_UPDATE_META, exception=org.apache.hadoop.hbase.procedure2.ProcedureAbortedException via MergeTableRegionsProcedure:org.apache.hadoop.hbase.procedure2.ProcedureAbortedException: abort requested; MergeTableRegionsProcedure table=testRollbackAndDoubleExecution, regions=[485dd0c2a5d14601d61fed791f793158, 8af34a614f064c162ab1d05eac7fca4c], forcibly=false unhandled state=MERGE_TABLE_REGIONS_UPDATE_META
>   at org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.rollbackState(MergeTableRegionsProcedure.java:291)
>   at org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.rol
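
The retry-instead-of-rollback behaviour described in the issue can be illustrated with a small, self-contained Java sketch. This is not HBase code: the class `MergeRollbackSketch`, the `State` enum, the `run` method, and the choice of `UPDATE_META` as the point of no return are all assumptions made for illustration, loosely echoing the state named in the log above. The sketch only shows the decision taken when an abort arrives: roll back if the current state still supports it, otherwise retry the step and carry on to completion rather than throwing.

```java
import java.util.ArrayList;
import java.util.List;

public class MergeRollbackSketch {
    // Hypothetical merge steps; the real procedure has its own state enum.
    enum State { PREPARE, CLOSE_REGIONS, CREATE_MERGED_REGION, UPDATE_META, OPEN_MERGED_REGION }

    // Assumption for this sketch: from UPDATE_META onward there is no way back.
    static boolean isRollbackSupported(State state) {
        return state.compareTo(State.UPDATE_META) < 0;
    }

    /**
     * Walks the states in order. When the abort arrives at abortAt, the sketch
     * either undoes the completed steps (rollback supported) or retries the
     * step and keeps going (past the point of no return).
     */
    static List<String> run(State abortAt) {
        List<String> trace = new ArrayList<>();
        State[] states = State.values();
        for (int i = 0; i < states.length; i++) {
            State s = states[i];
            if (s == abortAt) {
                if (isRollbackSupported(s)) {
                    // Undo already-completed steps in reverse order.
                    for (int j = i - 1; j >= 0; j--) {
                        trace.add("undo " + states[j]);
                    }
                    trace.add("ROLLED_BACK");
                    return trace;
                }
                // No rollback possible: record a retry instead of failing.
                trace.add("retry " + s);
            }
            trace.add("do " + s);
        }
        trace.add("SUCCESS");
        return trace;
    }

    public static void main(String[] args) {
        System.out.println(run(State.CLOSE_REGIONS));
        System.out.println(run(State.UPDATE_META));
    }
}
```

In this sketch an abort at `CLOSE_REGIONS` ends in `ROLLED_BACK`, while an abort at `UPDATE_META` is answered with a retry and the run ends in `SUCCESS`. A half-implemented version of the check, where the executor still attempts rollback from the later states, would correspond to the `unhandled state` exception in the log.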
[jira] [Commented] (HBASE-20015) TestMergeTableRegionsProcedure and TestRegionMergeTransactionOnCluster flakey
[ https://issues.apache.org/jira/browse/HBASE-20015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16368395#comment-16368395 ]

Hudson commented on HBASE-20015:
--------------------------------

FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #4603 (See [https://builds.apache.org/job/HBase-Trunk_matrix/4603/])
HBASE-20015 TestMergeTableRegionsProcedure and (stack: rev f3ff55a2b4bb7a8b4980fdbb5b1f7a8d033631f3)
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/SplitTableRegionProcedure.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/MergeTableRegionsProcedure.java
[jira] [Commented] (HBASE-20015) TestMergeTableRegionsProcedure and TestRegionMergeTransactionOnCluster flakey
[ https://issues.apache.org/jira/browse/HBASE-20015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16368335#comment-16368335 ]

stack commented on HBASE-20015:
-------------------------------

Pushed to master and branch-2 after fixing checkstyle. Leaving open to see if this makes a difference in our test runs.
[jira] [Commented] (HBASE-20015) TestMergeTableRegionsProcedure and TestRegionMergeTransactionOnCluster flakey
[ https://issues.apache.org/jira/browse/HBASE-20015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16368160#comment-16368160 ]

Hadoop QA commented on HBASE-20015:
-----------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 14s | Docker mode activated. |
|| || || || Prechecks ||
| 0 | findbugs | 0m 0s | Findbugs executables are not available. |
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || branch-2 Compile Tests ||
| +1 | mvninstall | 3m 16s | branch-2 passed |
| +1 | compile | 0m 41s | branch-2 passed |
| +1 | checkstyle | 1m 5s | branch-2 passed |
| +1 | shadedjars | 5m 2s | branch has no errors when building our shaded downstream artifacts. |
| +1 | javadoc | 0m 27s | branch-2 passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 3m 23s | the patch passed |
| +1 | compile | 0m 44s | the patch passed |
| +1 | javac | 0m 44s | the patch passed |
| -1 | checkstyle | 1m 9s | hbase-server: The patch generated 1 new + 148 unchanged - 0 fixed = 149 total (was 148) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 4m 8s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 14m 57s | Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. |
| +1 | javadoc | 0m 29s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 103m 29s | hbase-server in the patch passed. |
| +1 | asflicense | 0m 21s | The patch does not generate ASF License warnings. |
| | | 134m 23s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:9f2f2db |
| JIRA Issue | HBASE-20015 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12911002/HBASE-20015.branch-2.001.patch |
| Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 7722d4c1beb3 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | branch-2 / 8be0696320 |
| maven | version: Apache Maven 3.5.2 (138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z) |
| Default Java | 1.8.0_151 |
| checkstyle | https://builds.apache.org/job/PreCommit-HBASE-Build/11557/artifact/patchprocess/diff-checkstyle-hbase-server.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/11557/testReport/ |
| Max. process+thread count | 5423 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console