[ 
https://issues.apache.org/jira/browse/HBASE-3198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean-Daniel Cryans updated HBASE-3198:
--------------------------------------

    Summary: Log rolling archives files prematurely  (was: HLog periodic roll 
doesn't seem to care about MemStores)

Some digging revealed interesting things. The number we print in "whose 
highest sequenceid is x" is always 1 smaller than the real one, because in the 
code we do:

{code}
this.outputfiles.put(Long.valueOf(this.logSeqNum.get() - 1), oldFile);
{code}

This may have been right at some point in the past, but now that rolling is 
asynchronous from appending, the current logSeqNum is in fact the last one in 
the log, so subtracting 1 is wrong. Then there's this:

{code}
TreeSet<Long> sequenceNumbers =
  new TreeSet<Long>(this.outputfiles.headMap(
    (Long.valueOf(oldestOutstandingSeqNum.longValue() + 1L))).keySet());
{code}

Here we are getting the log files that we can delete because we know that 
their highest sequence number is smaller than that of the oldest outstanding 
edit. I don't know why we're doing a +1L, since you don't really want to delete 
log files that do contain that edit. It may be a "fix" for my previous finding, 
but it's still broken: as I showed when creating this jira, rolling does remove 
logs with unflushed edits.
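
To make the two off-by-one effects concrete, here is a minimal standalone sketch. It is not HBase code: the class LogRollSketch and the helpers rolled() and removable() are hypothetical names, and it assumes, following the reasoning above, that the rolled file really does contain edit 270055 (the value of logSeqNum at roll time). The numbers come from the log output quoted in the description. The only library behaviour it relies on is that SortedMap.headMap(toKey) returns the entries whose keys are strictly less than toKey.

```java
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.TreeSet;

public class LogRollSketch {
    /** Registers one rolled file under the given key, mirroring outputfiles.put(...). */
    static SortedMap<Long, String> rolled(long key) {
        SortedMap<Long, String> outputfiles = new TreeMap<Long, String>();
        outputfiles.put(Long.valueOf(key), "old-hlog");
        return outputfiles;
    }

    /** Keys of the files that a headMap(bound) lookup would select for archiving. */
    static TreeSet<Long> removable(SortedMap<Long, String> outputfiles, long bound) {
        // headMap(bound) only returns entries whose key is strictly less than bound.
        return new TreeSet<Long>(outputfiles.headMap(Long.valueOf(bound)).keySet());
    }

    public static void main(String[] args) {
        long logSeqNum = 270055L;          // assumed: last sequence id in the rolled file
        long oldestOutstanding = 270055L;  // oldest unflushed edit, from the log output

        // Current code: key is logSeqNum - 1 and bound is oldest + 1, file is selected.
        System.out.println(removable(rolled(logSeqNum - 1L), oldestOutstanding + 1L));
        // Only the key fixed: the + 1 on the bound still selects the file.
        System.out.println(removable(rolled(logSeqNum), oldestOutstanding + 1L));
        // Only the bound fixed: the - 1 on the key still selects the file.
        System.out.println(removable(rolled(logSeqNum - 1L), oldestOutstanding));
        // Both fixed: the file holding the unflushed edit is kept.
        System.out.println(removable(rolled(logSeqNum), oldestOutstanding));
    }
}
```

Under these assumptions only the last combination keeps the file, which suggests that the -1 and the +1L both need to go. This is an illustration of the arithmetic, not a proposed patch.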

I'm changing the title of this jira to give it a broader scope, as any log 
rolling is at risk of lowering data durability.

> Log rolling archives files prematurely
> --------------------------------------
>
>                 Key: HBASE-3198
>                 URL: https://issues.apache.org/jira/browse/HBASE-3198
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>            Priority: Blocker
>             Fix For: 0.90.0
>
>
> From the mailing list, Erdem Agaoglu found a case where an HLog gets rolled 
> by the periodic log roller and then gets archived even though the region 
> (ROOT) still has edits in the MemStore. I did an experiment on a local empty 
> machine and it does look broken:
> {noformat}
> org.apache.hadoop.hbase.regionserver.LogRoller: Hlog roll period 6000ms 
> elapsed
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter: Using syncFs 
> -- HDFS-200
> org.apache.hadoop.hbase.regionserver.wal.HLog: Roll 
> /hbase-89-su/.logs/hbasedev,60020,1288977933643/10.10.1.177%3A60020.1288977933829,
>  entries=1,
>  filesize=295. New hlog 
> /hbase-89-su/.logs/hbasedev,60020,1288977933643/10.10.1.177%3A60020.1288977943913
> org.apache.hadoop.hbase.regionserver.wal.HLog: Found 1 hlogs to remove  out 
> of total 1; oldest outstanding sequenceid is 270055 from region -ROOT-,,0
> org.apache.hadoop.hbase.regionserver.wal.HLog: moving old hlog file 
> /hbase-89-su/.logs/hbasedev,60020,1288977933643/10.10.1.177%3A60020.1288977933829
>  whose highest sequenceid is 270054 to 
> /hbase-89-su/.oldlogs/10.10.1.177%3A60020.1288977933829
> {noformat}
> Marking as Blocker and taking a deeper look.

