[
https://issues.apache.org/jira/browse/HBASE-2337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12849842#action_12849842
]
Jean-Daniel Cryans commented on HBASE-2337:
-------------------------------------------
Minor nits: some lines you added are over the 80-character limit (like line
1273), and some statements don't have a space before the condition, like:
{code}
+ for(Path p : finishedFiles) {
{code}
Also, I don't think this comment makes sense, since you pass a default value:
{code}
+ // store corrupt logs for post-mortem analysis (empty string = discard)
+ final String corruptDir =
+ conf.get("hbase.regionserver.hlog.splitlog.corrupt.dir", ".corrupt");
{code}
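To illustrate, here is a minimal sketch of that lookup pattern. It assumes java.util.Properties as a stand-in for the Configuration object, and the CorruptDirLookup class and resolve() method are hypothetical names, not part of the patch. With a default supplied, an unset key resolves to ".corrupt"; in this stand-in the empty string only comes back if someone explicitly sets the property to "".

```java
import java.util.Properties;

// Hypothetical stand-in: java.util.Properties plays the role of the
// Configuration object used in the patch.
class CorruptDirLookup {
  static final String KEY = "hbase.regionserver.hlog.splitlog.corrupt.dir";

  // Mirrors the shape of conf.get(KEY, ".corrupt") from the patch.
  static String resolve(Properties conf) {
    return conf.getProperty(KEY, ".corrupt");
  }

  public static void main(String[] args) {
    Properties unset = new Properties();

    Properties explicitEmpty = new Properties();
    explicitEmpty.setProperty(KEY, "");

    // The unset case falls back to the default, not to "".
    System.out.println("unset -> '" + resolve(unset) + "'");
    System.out.println("explicit empty -> '" + resolve(explicitEmpty) + "'");
  }
}
```

If discarding really is meant to be supported, the comment could say that the property has to be set to an empty string explicitly.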
I'll try this patch later this afternoon.
> log recovery: splitLog deletes old logs prematurely
> ---------------------------------------------------
>
> Key: HBASE-2337
> URL: https://issues.apache.org/jira/browse/HBASE-2337
> Project: Hadoop HBase
> Issue Type: Bug
> Reporter: Kannan Muthukkaruppan
> Assignee: Nicolas Spiegelberg
> Priority: Blocker
> Fix For: 0.20.4, 0.21.0
>
> Attachments: HBASE-2337-20.4.patch
>
>
> splitLog()'s purpose is to take a bunch of commit logs of a crashed RS and
> create per-region logs. splitLog() runs in the master. There are two cases
> where splitLog() might end up deleting an old log before the new logs have
> actually been written (synced and closed). If the master crashes in between
> deletion of the old log and creation of the new log, then edits could be lost
> irrecoverably.
> More specifically here are the two issues we (Nicolas, Aravind and I) noticed:
> Issue #1: The old logs are read one at a time. An in-memory structure,
> logEntries (a map from region name to edits for the region), is populated,
> and the old logs are closed. Then the in-memory map is written out to
> per-region files. Fix: We should move the file deletion to later.
> Issue #2: There is another little case. The per-region log file is written
> under the region directory (named oldlogfile.log or the constant
> HREGION_OLDLOGFILE_NAME). Before the master creates the file, it checks to
> see if there is already a file with that name, and if so, it renames it to
> oldlogfile.log.old, and then creates file oldlogfile.log again, and copies
> over the contents of oldlogfile.log.old to oldlogfile.log. It then proceeds
> to delete "oldlogfile.log.old", even though it hasn't closed/sync'ed
> "oldlogfile.log" yet.
> --
> I think we should be able to restructure the code such that all deletion of
> old logs happens *after* the new logs have been created (i.e. written to &
> closed).
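
The ordering proposed above can be sketched roughly as follows. This is a toy version on the local filesystem using java.nio.file, not the actual splitLog() code: the SplitThenDelete class, the split() method, and the "region<TAB>edit" line format are all assumptions made for illustration.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; the real splitLog() works on HDFS with its own
// log readers and writers.
class SplitThenDelete {
  // Assumed toy format: each old-log line is "region<TAB>edit".
  static void split(List<Path> oldLogs, Path outDir) throws IOException {
    Map<String, List<String>> editsByRegion = new TreeMap<>();

    // Phase 1: read every old log into the in-memory map. Nothing is
    // deleted here, so a crash at this point loses no edits.
    for (Path log : oldLogs) {
      for (String line : Files.readAllLines(log, StandardCharsets.UTF_8)) {
        int tab = line.indexOf('\t');
        if (tab < 0) {
          continue; // skip malformed lines in this toy format
        }
        editsByRegion
            .computeIfAbsent(line.substring(0, tab), k -> new ArrayList<>())
            .add(line.substring(tab + 1));
      }
    }

    // Phase 2: write one file per region and close each writer.
    for (Map.Entry<String, List<String>> e : editsByRegion.entrySet()) {
      Path out = outDir.resolve(e.getKey() + ".log");
      try (BufferedWriter w =
          Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
        for (String edit : e.getValue()) {
          w.write(edit);
          w.newLine();
        }
      } // try-with-resources closes the writer before we move on
    }

    // Phase 3: only after all new files are written and closed do the
    // old logs get deleted.
    for (Path log : oldLogs) {
      Files.delete(log);
    }
  }
}
```

Note that closing a local file does not force an fsync, so this sketch only shows the ordering; the real code would rely on whatever sync/close guarantee the underlying filesystem provides.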