[jira] [Issue Comment Edited] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region to be assigned before log splitting is completed, causing data loss

Zhihong Yu (Issue Comment Edited) (JIRA) Fri, 20 Jan 2012 11:09:06 -0800

    [ 
https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13190002#comment-13190002
 ]


Zhihong Yu edited comment on HBASE-5179 at 1/20/12 7:08 PM:
------------------------------------------------------------

Patch v17 for 0.90 passed unit tests.
Got a strange complaint about TestLruBlockCache. In 
org.apache.hadoop.hbase.io.hfile.TestLruBlockCache.txt:
{code}
testBackgroundEvictionThread(org.apache.hadoop.hbase.io.hfile.TestLruBlockCache)
  Time elapsed: 3.157 sec  <<< FAILURE!
junit.framework.AssertionFailedError: null
  at junit.framework.Assert.fail(Assert.java:47)
  at junit.framework.Assert.assertTrue(Assert.java:20)
  at junit.framework.Assert.assertTrue(Assert.java:27)
  at 
org.apache.hadoop.hbase.io.hfile.TestLruBlockCache.testBackgroundEvictionThread(TestLruBlockCache.java:58)
{code}
Looks like the following assertion didn't give eviction enough time to run:
{code}
      Thread.sleep(1000);
      assertTrue(n++ < 2);
{code}
Running TestLruBlockCache alone passed.
                
      was (Author: zhi...@ebaysf.com):
    Patch v17 for 0.90 passed unit tests.
Got a strange complaint about TestLruBlockCache. But in 
org.apache.hadoop.hbase.io.hfile.TestLruBlockCache.txt:
{code}
-------------------------------------------------------------------------------
Test set: org.apache.hadoop.hbase.io.hfile.TestLruBlockCache
-------------------------------------------------------------------------------
Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.152 sec
{code}
                  
> Concurrent processing of processFaileOver and ServerShutdownHandler may cause 
> region to be assigned before log splitting is completed, causing data loss
> --------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-5179
>                 URL: https://issues.apache.org/jira/browse/HBASE-5179
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.90.2
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>            Priority: Critical
>             Fix For: 0.92.0, 0.94.0, 0.90.6
>
>         Attachments: 5179-90.txt, 5179-90v10.patch, 5179-90v11.patch, 
> 5179-90v12.patch, 5179-90v13.txt, 5179-90v14.patch, 5179-90v15.patch, 
> 5179-90v16.patch, 5179-90v17.txt, 5179-90v2.patch, 5179-90v3.patch, 
> 5179-90v4.patch, 5179-90v5.patch, 5179-90v6.patch, 5179-90v7.patch, 
> 5179-90v8.patch, 5179-90v9.patch, 5179-v11-92.txt, 5179-v11.txt, 5179-v2.txt, 
> 5179-v3.txt, 5179-v4.txt, Errorlog, hbase-5179.patch, hbase-5179v10.patch, 
> hbase-5179v12.patch, hbase-5179v5.patch, hbase-5179v6.patch, 
> hbase-5179v7.patch, hbase-5179v8.patch, hbase-5179v9.patch
>
>
> If master's processing its failover and ServerShutdownHandler's processing 
> happen concurrently, it may appear following  case.
> 1.master completed splitLogAfterStartup()
> 2.RegionserverA restarts, and ServerShutdownHandler is processing.
> 3.master starts to rebuildUserRegions, and RegionserverA is considered as 
> dead server.
> 4.master starts to assign regions of RegionserverA because it is a dead 
> server by step3.
> However, when doing step4(assigning region), ServerShutdownHandler may be 
> doing split log, Therefore, it may cause data loss.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Issue Comment Edited] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region to be assigned before log splitting is completed, causing data loss

Reply via email to