[ 
https://issues.apache.org/jira/browse/HBASE-2727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12889296#action_12889296
 ] 

HBase Review Board commented on HBASE-2727:
-------------------------------------------

Message from: [email protected]

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.hbase.org/r/329/
-----------------------------------------------------------

Review request for hbase and Ryan Rawson.


Summary
-------

See the notes made over in HBASE-2727.


This addresses bug hbase-2727.
    http://issues.apache.org/jira/browse/hbase-2727


Diffs
-----

  src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 40205c4 
  src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java 7044891 
  src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLogMethods.java PRE-CREATION 
  src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLogSplit.java 3fff2fa 
  src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java 9053d39 

Diff: http://review.hbase.org/r/329/diff


Testing
-------

All related tests seem to pass.  A few are failing for me but seem unrelated.  
Digging in while this review goes on.


Thanks,

stack




> Splits writing one file only is untenable; need dir of recovered edits 
> ordered by sequenceid.
> ---------------------------------------------------------------------------------------------
>
>                 Key: HBASE-2727
>                 URL: https://issues.apache.org/jira/browse/HBASE-2727
>             Project: HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: stack
>            Priority: Blocker
>             Fix For: 0.90.0
>
>         Attachments: 2727-v2.txt, 2727-v4.txt, 2727-v6.txt
>
>
> This issue comes of tlipcon doing a bit of human unit testing.  His 
> speculation is:
> Let a region X deploy to server A.  Server A opens the region, then closes it.
> Let region X now deploy to server B.  Server B now crashes.
> Both server A and server B now have edits for region X in their WALs.
> The processing of server crashes is currently sequential. 
> If server A crashes before server B, server A will write out a file of 
> recovered edits for region X, but region X was not deployed on server A, so 
> the file will just sit there unused.  The processing of server B's crash 
> will overwrite the recovered edits file written by the split of server A's 
> WAL.  This is ok.
> But if somehow server B's processing is done before server A's, then 
> interesting issues will likely arise; in the main, there is a danger that 
> server B's recovered edits could be overwritten.
> Another issue comes up in the review of hbase-1025.  During the replay of 
> edits on region deploy, if the hosting regionserver crashes before we have 
> processed all of the recovered edits, we could lose some (the recovery of the 
> regionserver that is replaying the edits could overwrite the log of edits 
> only partially replayed).
> From discussion on IRC, what's needed is a directory of edits to replay, 
> ordered by sequenceid.  On recovery, we play the oldest through to the 
> newest, removing each set of edits only on successful replay.
> Making blocker on 0.21 since this is a correctness issue.
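
The replay scheme described above (a directory of recovered edits, played
oldest to newest, each file removed only after it replays successfully) can
be sketched roughly as follows. This is a minimal illustration and not the
patch under review: the class RecoveredEditsReplayer, the EditsApplier
callback, and the convention that each file is named by its decimal
sequenceid are all assumptions made for the example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Stream;

/** Hypothetical sketch; not the HBase implementation. */
final class RecoveredEditsReplayer {

  /** Hypothetical callback that applies one file of edits to the region. */
  interface EditsApplier {
    void apply(Path editsFile) throws IOException;
  }

  /** Assumed naming scheme: the file name is the decimal sequenceid. */
  static long sequenceIdOf(Path file) {
    return Long.parseLong(file.getFileName().toString());
  }

  /** Lists the edit files in dir, oldest (lowest sequenceid) first. */
  static List<Path> sortedEdits(Path dir) throws IOException {
    try (Stream<Path> files = Files.list(dir)) {
      List<Path> sorted = new ArrayList<>();
      files.forEach(sorted::add);
      // Compare numerically so that "9" sorts before "10".
      sorted.sort(Comparator.comparingLong(RecoveredEditsReplayer::sequenceIdOf));
      return sorted;
    }
  }

  /**
   * Replays each file in order. A file is deleted only after apply()
   * returns; if apply() throws, that file and all later ones stay on
   * disk so a later recovery can retry them.
   * Returns the number of files replayed.
   */
  static int replay(Path dir, EditsApplier applier) throws IOException {
    int replayed = 0;
    for (Path file : sortedEdits(dir)) {
      applier.apply(file);
      Files.delete(file);
      replayed++;
    }
    return replayed;
  }
}
```

Because each split writes its own file instead of overwriting a single
one, a crash partway through replay leaves the unplayed files in place,
which is the property the issue asks for.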

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.