[ 
https://issues.apache.org/jira/browse/HBASE-2727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-2727:
-------------------------

    Attachment: 2727-v6.txt

Added tests to demonstrate the new facility: we can now have more than one 
recovered edits file, and we'll replay the edits in the right order in the 
face of multiple edit files.

Regarding the scenarios from the description above:

For 1, it's highly unlikely, but suppose we somehow process server shutdown B 
before server shutdown A, and the assignment of the region in question does 
not happen until AFTER both server shutdowns have been processed.  The WAL 
splits will no longer overwrite each other, since the files are named 
differently (each is named for the first sequenceid in the file), and on 
replay of the edits at region deploy we'll replay them from oldest to newest.
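
To illustrate the ordering, here is a minimal sketch, not the code in the 
patch.  It assumes each recovered edits file name is just the zero-padded first 
sequenceid in the file; the padding width and the names RecoveredEditsOrder, 
fileNameFor and replayOrder are made up for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class RecoveredEditsOrder {
    // Assumption for this sketch: a file is named by the zero-padded
    // first sequenceid it contains (19 digits covers any positive long).
    static String fileNameFor(long firstSequenceId) {
        return String.format("%019d", firstSequenceId);
    }

    // Returns the file names sorted oldest (lowest sequenceid) to newest.
    static List<String> replayOrder(List<String> fileNames) {
        List<String> sorted = new ArrayList<>(fileNames);
        sorted.sort(Comparator.comparingLong((String s) -> Long.parseLong(s)));
        return sorted;
    }

    public static void main(String[] args) {
        // Split of server B happened to be written first, split of A second.
        List<String> names = List.of(fileNameFor(300), fileNameFor(42));
        // Replay order still puts the file starting at sequenceid 42 first.
        System.out.println(replayOrder(names));
    }
}
```

Because the two splits produce differently named files, neither overwrites 
the other, and sorting by name recovers the oldest-to-newest order regardless 
of which shutdown was processed first.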

For 2, if we crash during replay of split edits, they'll be in place next time 
the region is deployed; we do not remove split replay edits until AFTER we've 
played them all and a flush has completed.
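
A rough sketch of that ordering of steps follows.  It is an in-memory model 
only: the "directory" is a sorted map keyed by first sequenceid, and 
ReplayThenCleanup, replayAll, applyEdit and flush are hypothetical names, not 
HRegion methods.

```java
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.function.Consumer;

final class ReplayThenCleanup {
    // Replays every pending file oldest first, then flushes, and only then
    // removes the files. If applyEdit or flush throws (our stand-in for a
    // crash), 'pending' is left untouched so the next deploy can replay again.
    static int replayAll(SortedMap<Long, List<String>> pending,
                         Consumer<String> applyEdit, Runnable flush) {
        int applied = 0;
        for (List<String> edits : pending.values()) {
            for (String edit : edits) {
                applyEdit.accept(edit);
                applied++;
            }
        }
        flush.run();      // edits must be durable before we drop the files
        pending.clear();  // removal is the very last step
        return applied;
    }

    public static void main(String[] args) {
        SortedMap<Long, List<String>> pending = new TreeMap<>();
        pending.put(20L, List.of("c"));
        pending.put(5L, List.of("a", "b"));
        int n = replayAll(pending, e -> System.out.println("replay " + e), () -> {});
        System.out.println(n + " edits replayed, files left: " + pending.size());
    }
}
```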

Here is the commit message, which gives an overview of the changes:

The replay of recovered edits has been changed again.  We no longer replay by 
calling Region#get and Region#delete and no longer add the replays to the RS 
WAL.  Instead we just add them to the memstore, as we used to, keeping account 
of the region memstore size.  If the memstore grows too large, we'll flush -- 
but NOT by using the general flush mechanism.  Instead we'll flush inline using 
sequenceids that make sense in the current replay context -- the sequenceids 
are from the region's storefiles and from its split recovered edits, NOT those 
of the hosting regionserver/hlog.  We need to do this to avoid the case where 
the hosting regionserver/hlog sequenceid is somehow in excess of ours.  We 
don't want to add a storefile that has an inflated sequenceid in case we crash 
between the flush and the completion of the replay of edits (we'll miss edits 
on the second replay if we have a storefile with an excessive sequenceid).
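
As a toy model of why the flush sequenceid has to come from the replay context, 
consider the sketch below.  It is not the HRegion code; InlineFlushReplay, 
Edit, storeMaxSeqId and flushThreshold are invented for the example, and it 
assumes that on replay any edit whose sequenceid is at or below the largest 
storefile sequenceid is skipped as already persisted.

```java
import java.util.ArrayList;
import java.util.List;

final class InlineFlushReplay {
    static final class Edit {
        final long seqId;
        final int size;
        Edit(long seqId, int size) { this.seqId = seqId; this.size = size; }
    }

    long storeMaxSeqId;              // largest sequenceid in the region's storefiles
    long memstoreSize = 0;           // running size of replayed, unflushed edits
    long lastReplayedSeqId = -1;
    final long flushThreshold;
    final List<Long> flushSeqIds = new ArrayList<>();

    InlineFlushReplay(long storeMaxSeqId, long flushThreshold) {
        this.storeMaxSeqId = storeMaxSeqId;
        this.flushThreshold = flushThreshold;
    }

    void replay(List<Edit> edits) {
        for (Edit e : edits) {
            if (e.seqId <= storeMaxSeqId) continue;  // assumed: already in a storefile
            memstoreSize += e.size;
            lastReplayedSeqId = e.seqId;
            if (memstoreSize >= flushThreshold) flushInline();
        }
        if (memstoreSize > 0) flushInline();         // final flush once all are played
    }

    // Stamps the new storefile with the sequenceid of the last edit actually
    // replayed, never with a (possibly larger) id from the hosting server's log.
    private void flushInline() {
        storeMaxSeqId = lastReplayedSeqId;
        flushSeqIds.add(storeMaxSeqId);
        memstoreSize = 0;
    }

    public static void main(String[] args) {
        List<Edit> edits = new ArrayList<>();
        for (long id = 1; id <= 6; id++) edits.add(new Edit(id, 10));
        InlineFlushReplay region = new InlineFlushReplay(2, 25);
        region.replay(edits);
        System.out.println("flushed at sequenceids " + region.flushSeqIds);
    }
}
```

Had the first inline flush been stamped with some larger sequenceid from the 
hosting server, a crash right after it would leave a storefile claiming to 
cover edits that were never replayed, and the skip check above would drop them 
on the next attempt.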

Did other cleanup in HRegion.  We don't need to monitor minimal sequenceids in 
families.  That was silly.

Added ability to replay one or more recovered edits files.

HLog now writes splits into a subdir of the region named recovered.edits, 
rather than to a single file named recovered.edits.
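
For illustration only, the resulting layout could be sketched as below.  This 
uses java.nio.file paths rather than the Hadoop filesystem API that HBase 
actually writes through, and the file-name format, the example table and 
region names, and RecoveredEditsLayout/splitOutputPath are all assumptions 
made for the sketch.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class RecoveredEditsLayout {
    static final String RECOVERED_EDITS_DIR = "recovered.edits";

    // <regiondir>/recovered.edits/<first sequenceid in the file>
    // (zero-padded name is an assumption of this sketch)
    static Path splitOutputPath(Path regionDir, long firstSequenceId) {
        return regionDir.resolve(RECOVERED_EDITS_DIR)
                        .resolve(String.format("%019d", firstSequenceId));
    }

    public static void main(String[] args) {
        Path regionDir = Paths.get("hbase", "exampleTable", "exampleRegion");
        // Two WAL splits for the same region land in distinct files.
        System.out.println(splitOutputPath(regionDir, 17L));
        System.out.println(splitOutputPath(regionDir, 230L));
    }
}
```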

Added tests too.

> Splits writing one file only is untenable; need dir of recovered edits 
> ordered by sequenceid.
> ---------------------------------------------------------------------------------------------
>
>                 Key: HBASE-2727
>                 URL: https://issues.apache.org/jira/browse/HBASE-2727
>             Project: HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: stack
>            Priority: Blocker
>             Fix For: 0.90.0
>
>         Attachments: 2727-v2.txt, 2727-v4.txt, 2727-v6.txt
>
>
> This issue comes of tlipcon doing a bit of human unit testing.  His 
> speculation is:
> Let a region X deploy to server A.  Server A opens the region, then closes it.
> Let region X now deploy to server B.  Server B now crashes.
> Both server A and server B now have edits for region X in their WALs.
> The processing of server crashes is currently sequential. 
> If server A crashes before server B, server A will write out a file of 
> recovered edits for region X but region X was not deployed on server A so, 
> the file will just sit there unused.  The processing of server B crash will 
> overwrite the recovered edits file written by the split of server A wal.  
> This is ok.
> But if somehow, server B processing is done before server A's, then 
> interesting issues will likely arise; in the main, there is danger that the 
> server B's recovered edits could be overwritten.
> Another issue comes up in the review of hbase-1025.  During the replay of 
> edits on region deploy, if the hosting regionserver crashes before we have 
> processed all of the recovered edits, we could lose some (the recovery of the 
> regionserver that is replaying the edits could overwrite the log of edits 
> only partially replayed).
> Discussing up on IRC, what's needed is a directory of edits to replay ordered 
> by sequenceid.  On recovery, we play the oldest through to the newest, 
> removing the edits only on successful replay.
> Making blocker on 0.21 since this is a correctness issue.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
