[ 
https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13156991#comment-13156991
 ] 

chunhui shen commented on HBASE-4862:
-------------------------------------

@Ted @Todd

I'm sorry my explanation is not clear.
I think I should descibe the detailed case first.

In the whole following process , client's putting data to region C.
1.Sucessfully move region C from server A to server B,
At the moment,there is log entry about region C in both server A's log file and 
server B's log file

2.kill server A and server B,

3.restart server B,
Now, mastet start serverShutdownHanlder for server B, and assign the region C 
to server D

4,Before region C is opend on the server D,restart server A
Now,mastet start serverShutdownHanlder for server A, and split server A's log 
file.
Because there is log entry about region C in server A's log file (why? see 1), 
split hlog thread would create a file F in the region C's recovered.edits 
directory.

5.In region C opening process, it will execute replayRecoveredEdits(),and then 
delete file F.

6.Therefore,in the 4, it throws IO Exception that file F not exists, and cause 
stopping parse the current  server A's hlog file, however, other data in this 
server A's hlog file lossed

The posted region server log is server B's log, and it is doing 
replayRecoveredEditsIfAny(). Although it prints failed delete of  file 
recovered.edits/0000000013156791680, but  in fact this file has been deleted, 
and master throws file not exist exception :
2011-11-16 11:50:13,037 FATAL 
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: WriterThread-1 Got while 
writing log entry to log 
org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: 
org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on 
/hbase-common/writetest1/3591e9867a4c125493dc82168854ea0c/recovered.edits/0000000013156791680
 File does not exist.
 
I'm not sure whether you are clear now, waiting for your question.

Thanks!


                
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
>                 Key: HBASE-4862
>                 URL: https://issues.apache.org/jira/browse/HBASE-4862
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.90.2
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.92.0, 0.94.0, 0.90.5
>
>         Attachments: 4862.patch
>
>
> Case Description:
> 1.Split hlog thread creat writer for the file region A/recoverd.edits/123456 
> and is appending log entry
> 2.Regionserver is opening region A now, and in the process 
> replayRecoveredEditsIfAny() ,it will delete the file region 
> A/recoverd.edits/123456 
> 3.Split hlog thread catches the io exception, and stop parse this log file 
> and if skipError = true , add it to the corrupt logs....However, data in 
> other regions in this log file will loss 
> 4.Or if skipError = false, it will check filesystem.Of course, the file 
> system is ok , and it only prints a error log, continue assigning regions. 
> Therefore, data in other log files will also loss!!
> The case may happen in the following:
> 1.Move region from server A to server B
> 2.kill server A and Server B
> 3.restart server A and Server B
> We could prevent this exception throuth forbiding deleting  recover.edits 
> file 
> which is appending by split hlog thread

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to