[ 
https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157671#comment-13157671
 ] 

Jonathan Hsieh commented on HBASE-4862:
---------------------------------------

@chenhui

I have a question and a few nits. 

What happens if a .temp file gets left behind without being renamed?

You might want to mention that hlog files in progress (those with the .temp 
suffix) are excluded here.
{code}
+        // After creating writer, simulate partial region's
+        // replayRecoveredEditsIfAny() which gets SplitEditFiles of this
+        // region,and delete them.
{code}

Also, you probably want to update the javadoc of getSplitEditFilesSorted.

The comment should probably say "most likely" instead of "mostly":
{code}
+    try{
+      logSplitter.splitLog();
+    } catch (IOException e) {
+      LOG.info(e);
+      Assert.fail("Throws IOException when spliting "
+          + "log, it is mostly because writing file does not "
+          + "exist which is caused by concurrent replayRecoveredEditsIfAny()");
+    }
+    if (fs.exists(corruptDir)) {
+      if (fs.listStatus(corruptDir).length > 0) {
+        Assert.fail("There are some corrupt logs, "
+            + "it is mostly caused by concurrent replayRecoveredEditsIfAny()");
+      }
+    }
+  }
{code}

                
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
>                 Key: HBASE-4862
>                 URL: https://issues.apache.org/jira/browse/HBASE-4862
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.90.2
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>            Priority: Critical
>             Fix For: 0.92.0, 0.94.0, 0.90.5
>
>         Attachments: 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, 
> hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for 
> trunk.diff, hbase-4862v2for0.90.diff, hbase-4862v2fortrunk.diff, 
> hbase-4862v3for0.90.diff, hbase-4862v3fortrunk.diff, 
> hbase-4862v5for0.90.diff, hbase-4862v5fortrunk.diff
>
>
> Case Description:
> 1. The split hlog thread creates a writer for the file region 
> A/recovered.edits/123456 and is appending log entries.
> 2. The regionserver is opening region A, and during 
> replayRecoveredEditsIfAny() it deletes the file region 
> A/recovered.edits/123456.
> 3. The split hlog thread catches the IOException and stops parsing this log 
> file. If skipError = true, the file is added to the corrupt logs. However, 
> data for other regions in this log file will be lost.
> 4. If skipError = false, it checks the filesystem. The filesystem is of 
> course fine, so it only prints an error log and continues assigning regions. 
> Therefore, data in other log files will also be lost!
> The case may happen as follows:
> 1. Move a region from server A to server B.
> 2. Kill server A and server B.
> 3. Restart server A and server B.
> We could prevent this exception by forbidding deletion of a recovered.edits 
> file that is still being appended to by the split hlog thread.
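
For illustration only, here is a minimal sketch of the idea discussed in this 
thread: the writer appends to a file carrying the .temp suffix and renames it 
once it is complete, while the reader skips anything still carrying that 
suffix. It uses java.nio.file rather than the Hadoop FileSystem API, and the 
class and method names (TempSuffixSketch, writeThenPublish, 
finishedEditFiles) are hypothetical, not taken from the patch.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch; not the code from the HBASE-4862 patch.
class TempSuffixSketch {
    // Suffix marking a file that is still being written (assumed from the review).
    static final String TEMP_SUFFIX = ".temp";

    // Writer side: write to "<name>.temp", then rename once writing is done.
    static Path writeThenPublish(Path dir, String name, List<String> edits)
            throws IOException {
        Path tmp = dir.resolve(name + TEMP_SUFFIX);
        Files.write(tmp, edits);
        // The final name becomes visible only after the content is complete.
        return Files.move(tmp, dir.resolve(name), StandardCopyOption.ATOMIC_MOVE);
    }

    // Reader side: return finished files only, sorted by path, so that
    // in-progress .temp files are never replayed or deleted.
    static List<Path> finishedEditFiles(Path dir) throws IOException {
        try (Stream<Path> files = Files.list(dir)) {
            return files
                    .filter(p -> !p.getFileName().toString().endsWith(TEMP_SUFFIX))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }
}
```

Note that this sketch does not address the question raised above about a 
.temp file that is left behind and never renamed; such a file would simply be 
ignored by finishedEditFiles.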


        
