[jira] [Commented] (HBASE-10464) Race condition during RS shutdown that could cause data loss

stack (JIRA) Thu, 13 Feb 2014 12:48:04 -0800

    [ https://issues.apache.org/jira/browse/HBASE-10464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13900713#comment-13900713 ]


stack commented on HBASE-10464:
-------------------------------

I took a look and we don't have this issue.  We spawn a thread to open regions.
The open handler already checks whether the hosting regionserver is stopping
or stopped before we add the region to the online regions list.  We should be
good.
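
For readers who want the shape of that check, here is a minimal sketch in
Java. It is not the actual HBase open handler: the class
RegionServerSketch and the methods finishOpen and beginShutdown are
hypothetical names made up for illustration. The sketch assumes that the
"stopping" flag and the online regions list are guarded by one lock, so a
region is either registered before shutdown takes its snapshot (and is
therefore closed by shutdown) or is refused.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a region server; not HBase's real classes.
class RegionServerSketch {
    private final Object lock = new Object();
    private final Set<String> onlineRegions = new HashSet<>();
    private boolean stopping = false;

    // Last step of a region open, run on the opener thread.
    // Returns false when the server is already shutting down, so the
    // open fails instead of leaving a region online that nobody closes.
    boolean finishOpen(String regionName) {
        synchronized (lock) {
            if (stopping) {
                return false;
            }
            onlineRegions.add(regionName);
            return true;
        }
    }

    // First step of shutdown. Setting the flag and snapshotting the
    // online regions happen under the same lock as finishOpen, so no
    // region can slip in after the snapshot is taken.
    List<String> beginShutdown() {
        synchronized (lock) {
            stopping = true;
            List<String> toClose = new ArrayList<>(onlineRegions);
            onlineRegions.clear();
            return toClose; // caller flushes and closes each of these
        }
    }

    public static void main(String[] args) {
        RegionServerSketch rs = new RegionServerSketch();
        System.out.println(rs.finishOpen("region-A")); // opened before shutdown
        System.out.println(rs.beginShutdown());        // regions shutdown must close
        System.out.println(rs.finishOpen("region-B")); // open attempted after shutdown began
    }
}
```

The single lock is a design choice of this sketch. A plain check of the
flag followed by an unguarded add would still leave a small window between
the check and the add.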

> Race condition during RS shutdown that could cause data loss
> ------------------------------------------------------------
>
>                 Key: HBASE-10464
>                 URL: https://issues.apache.org/jira/browse/HBASE-10464
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.89-fb
>            Reporter: Yunfan Zhong
>            Priority: Critical
>             Fix For: 0.89-fb
>
>         Attachments: D1120497.diff
>
>
> Bug scenario (T* are timestamps, with T1 < T2 < ... < Tn):
> 1. Master assigns a region to the RS at T1.
> 2. The RS works on opening the region from T1 to T3.
> 3. While the region is being opened, the RS starts to shut down at T2, and
> the DFS client is closed at T5.
> 4. Regions owned by the RS are closed as a step of RS shutdown, except that
> the newly opened region stays online from T3 to T5 and holds some mutations
> in memory after its last flush at T4, if a flush happened at all.
> 5. Since the master thinks the RS had a clean shutdown, there is no log
> splitting. The HLog is moved to the old logs directory as usual.
> 6. Mutations held in memory between T4 and T5 (or between T3 and T5 if no
> flush happened at T4) are never flushed. They exist only in the WAL, and
> only if the WAL is turned on.
> The fix is to prevent region opening from succeeding when the RS is
> shutting down.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
