[jira] [Commented] (HBASE-16874) Potential NPE from ProcedureExecutor#stop()

Matteo Bertozzi (JIRA) Tue, 18 Oct 2016 14:43:29 -0700

    [ https://issues.apache.org/jira/browse/HBASE-16874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15586746#comment-15586746 ]


Matteo Bertozzi commented on HBASE-16874:
-----------------------------------------

It looks like this test does not set the number of executor threads to 1.
When we inject failures with setToggleKillBeforeStoreUpdate(), we assume that
there is only one executor. Here multiple executors are running, each toggling
the flag, so one of them can kill the executor while the test is restarting it
on the other side.
{noformat}
2016-10-18 18:55:30,857 INFO  [ProcedureExecutorWorker-5] procedure.ServerCrashProcedure(204): Start processing crashed priapus.apache.org,38407,1476816438880
2016-10-18 18:55:30,857 WARN  [ProcedureExecutorWorker-5] procedure2.ProcedureExecutor$Testing(92): Toggle Kill before store update to: true
Exception in thread "ProcedureExecutorWorker-5" java.lang.RuntimeException: the store must be running before inserting data
        at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.pushData(WALProcedureStore.java:542)
{noformat}
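
To make the single-executor assumption concrete, here is a toy model of the toggle hook. It is not HBase code: the class and method names below are hypothetical, and the behaviour (each check returns the current kill flag and then flips it) is an assumption about how the testing hook works. Workers are replayed from a fixed schedule rather than run as real threads, so the output is deterministic.

```java
import java.util.ArrayList;
import java.util.List;

public class ToggleKillSketch {
    /** Hypothetical stand-in for the testing hook, not the real HBase class. */
    static final class ToggleKill {
        private boolean kill = false;

        /** Assumed behaviour: report the current flag, then flip it. */
        synchronized boolean shouldKillBeforeStoreUpdate() {
            boolean current = kill;
            kill = !kill;
            return current;
        }
    }

    /**
     * Replays a schedule of worker ids, one store update per entry, and
     * returns the ids of the workers that were told to kill the executor.
     */
    static List<Integer> killedWorkers(int[] schedule) {
        ToggleKill hook = new ToggleKill();
        List<Integer> killed = new ArrayList<>();
        for (int worker : schedule) {
            if (hook.shouldKillBeforeStoreUpdate()) {
                killed.add(worker);
            }
        }
        return killed;
    }

    public static void main(String[] args) {
        // One worker: the kill always lands on that worker's second update.
        System.out.println(killedWorkers(new int[] {0, 0, 0, 0})); // [0, 0]
        // Two workers: which one is killed depends on the interleaving.
        System.out.println(killedWorkers(new int[] {0, 1, 0, 1})); // [1, 1]
        System.out.println(killedWorkers(new int[] {0, 1, 1, 0})); // [1, 0]
    }
}
```

With a single worker the kill point is predictable, which is what the failure-injection tests rely on. With several workers sharing one flag, the kill can land on any worker at any time, including while the test is already restarting the executor.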

> Potential NPE from ProcedureExecutor#stop()
> -------------------------------------------
>
>                 Key: HBASE-16874
>                 URL: https://issues.apache.org/jira/browse/HBASE-16874
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Ted Yu
>            Assignee: Matteo Bertozzi
>            Priority: Minor
>         Attachments: 16874.v1.txt, HBASE-16874-v0.patch
>
>
> When examining the failed test:
> https://builds.apache.org/job/HBase-TRUNK_matrix/lastCompletedBuild/jdk=JDK%201.8%20(latest),label=yahoo-not-h2/testReport/org.apache.hadoop.hbase.master.procedure/TestMasterFailoverWithProcedures/org_apache_hadoop_hbase_master_procedure_TestMasterFailoverWithProcedures/
> I noticed the following:
> {code}
> 2016-10-18 18:47:39,313 INFO  [Time-limited test] procedure.TestMasterFailoverWithProcedures(306): Restart 2 exec state: TRUNCATE_TABLE_CLEAR_FS_LAYOUT
> Exception in thread "ProcedureExecutorWorker-1" java.lang.NullPointerException
>       at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.stop(ProcedureExecutor.java:533)
>       at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1197)
>       at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:959)
>       at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$700(ProcedureExecutor.java:73)
>       at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1405)
> {code}
> This seems to be the result of a race between the stop() and join() methods.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
