[
https://issues.apache.org/jira/browse/HBASE-20973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16660801#comment-16660801
]
Hudson commented on HBASE-20973:
--------------------------------
Results for branch master
[build #564 on
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/564/]: (x)
*{color:red}-1 overall{color}*
----
details (if available):
(/) {color:green}+1 general checks{color}
-- For more information [see general
report|https://builds.apache.org/job/HBase%20Nightly/job/master/564//General_Nightly_Build_Report/]
(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2)
report|https://builds.apache.org/job/HBase%20Nightly/job/master/564//JDK8_Nightly_Build_Report_(Hadoop2)/]
(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3)
report|https://builds.apache.org/job/HBase%20Nightly/job/master/564//JDK8_Nightly_Build_Report_(Hadoop3)/]
(/) {color:green}+1 source release artifact{color}
-- See build output for details.
(/) {color:green}+1 client integration test{color}
> ArrayIndexOutOfBoundsException when rolling back procedure
> ----------------------------------------------------------
>
> Key: HBASE-20973
> URL: https://issues.apache.org/jira/browse/HBASE-20973
> Project: HBase
> Issue Type: Sub-task
> Components: amv2
> Affects Versions: 2.1.0, 2.0.1
> Reporter: Allan Yang
> Assignee: Allan Yang
> Priority: Critical
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-20973.branch-2.0.001.patch,
> HBASE-20973.branch-2.0.002.patch
>
>
> Find this one while investigating HBASE-20921. After the root
> procedure(ModifyTableProcedure in this case) rolled back, a
> ArrayIndexOutOfBoundsException was thrown
> {code}
> 2018-07-18 01:39:10,241 ERROR [PEWorker-8] procedure2.ProcedureExecutor(159):
> CODE-BUG: Uncaught runtime exception for pid=5973,
> state=FAILED:MODIFY_TABLE_REOPEN_ALL_REGIONS, exception=java.lang.NullPo
> interException via CODE-BUG: Uncaught runtime exception: pid=5974, ppid=5973,
> state=RUNNABLE:REOPEN_TABLE_REGIONS_CONFIRM_REOPENED;
> ReopenTableRegionsProcedure table=IntegrationTestBigLinkedList:java.l
> ang.NullPointerException; ModifyTableProcedure
> table=IntegrationTestBigLinkedList
> java.lang.UnsupportedOperationException: unhandled
> state=MODIFY_TABLE_REOPEN_ALL_REGIONS
> at
> org.apache.hadoop.hbase.master.procedure.ModifyTableProcedure.rollbackState(ModifyTableProcedure.java:147)
> at
> org.apache.hadoop.hbase.master.procedure.ModifyTableProcedure.rollbackState(ModifyTableProcedure.java:50)
> at
> org.apache.hadoop.hbase.procedure2.StateMachineProcedure.rollback(StateMachineProcedure.java:203)
> at
> org.apache.hadoop.hbase.procedure2.Procedure.doRollback(Procedure.java:864)
> at
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1353)
> at
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1309)
> at
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1178)
> at
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75)
> at
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1741)
> 2018-07-18 01:39:10,243 WARN [PEWorker-8]
> procedure2.ProcedureExecutor(1756): Worker terminating UNNATURALLY null
> java.lang.ArrayIndexOutOfBoundsException: 1
> at
> org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker$BitSetNode.updateState(ProcedureStoreTracker.java:405)
> at
> org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker$BitSetNode.delete(ProcedureStoreTracker.java:178)
> at
> org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker.delete(ProcedureStoreTracker.java:513)
> at
> org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker.delete(ProcedureStoreTracker.java:505)
> at
> org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.updateStoreTracker(WALProcedureStore.java:741)
> at
> org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.pushData(WALProcedureStore.java:691)
> at
> org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.delete(WALProcedureStore.java:603)
> at
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1387)
> at
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1309)
> at
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1178)
> at
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75)
> at
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1741)
> {code}
> This is a very serious condition, After this exception thrown, the exclusive
> lock held by ModifyTableProcedure was never released. All the procedure
> against this table were blocked. Until the master restarted, and since the
> lock info for the procedure won't be restored, the other procedures can go
> again, it is quite embarrassing that a bug save us...(this bug will be fixed
> in HBASE-20846)
> I tried to reproduce this one using the test case in HBASE-20921 but I just
> can't reproduce it.
> A easy way to resolve this is add a try catch, making sure no matter what
> happens, the table's exclusive lock can always be relased.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)