[jira] [Commented] (HBASE-20727) Persist FlushedSequenceId to speed up WAL split after cluster restart

Sean Busbey (JIRA) Thu, 14 Jun 2018 20:26:15 -0700


    [ 
https://issues.apache.org/jira/browse/HBASE-20727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16513282#comment-16513282
 ]


Sean Busbey commented on HBASE-20727:
-------------------------------------

should this be stored in the WAL dir instead of in the root dir?

what happens if the master goes down while writing the file? looks like we'll 
get an IOException in {{loadLastFlushedSequenceIds}} and act as though the file 
doesn't exist?

what happens if the master dies slowly while writing the file and still has it 
open when the new master takes over as active? It looks like we will get an 
IOException in {{loadLastFlushedSequenceIds}}, again when the chore tries in 
{{persistRegionLastFlushedSequenceIds}}, and eventually just write a new one 
once the chore comes around and the lease has expired?

How about docs about this addition in the WAL recovery section of the ref guide?

{code}
+  public static final String PERSIST_FLUSHEDSEQUENCEID =
+      "hbase.master.persist.flushedsequenceid.enabled";
{code}

What are the conditions where we'd turn this off? Can we document that as well?

> Persist FlushedSequenceId to speed up WAL split after cluster restart
> ---------------------------------------------------------------------
>
>                 Key: HBASE-20727
>                 URL: https://issues.apache.org/jira/browse/HBASE-20727
>             Project: HBase
>          Issue Type: New Feature
>    Affects Versions: 2.0.0
>            Reporter: Allan Yang
>            Assignee: Allan Yang
>            Priority: Major
>             Fix For: 3.0.0
>
>         Attachments: HBASE-20727.002.patch, HBASE-20727.003.patch, 
> HBASE-20727.patch
>
>
> We use flushedSequenceIdByRegion and storeFlushedSequenceIdsByRegion in 
> ServerManager to record the latest flushed seqids of regions and stores. So 
> during log split, we can use seqids stored in those maps to filter out the 
> edits which do not need to be replayed. But, those maps are not persisted. 
> After cluster restart or master restart, info of flushed seqids are all lost. 
> Here I offer a way to persist those info to HDFS, even if master restart, we 
> can still use those info to filter WAL edits and then to speed up replay.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HBASE-20727) Persist FlushedSequenceId to speed up WAL split after cluster restart

Reply via email to