[
https://issues.apache.org/jira/browse/HBASE-698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12646344#action_12646344
]
Jim Kellerman commented on HBASE-698:
-------------------------------------
There is a very simple fix if the master comes back up and knows a region
server is dead.
However, if the master dies, region servers hang around until the master comes
back up. Thus the restarted master cannot know which HLogs to recover and which
belong to running region servers. ("Recovering" an HLog from a running region
server would produce unpredictable results, most likely leading to data
corruption.)
Relying on HDFS lease timeouts on the log files is also not an option, as the
lease timeout interval is too long for this purpose.
The master therefore cannot recover any region server's logs unless it knows
that the region server is dead. This cannot be accomplished without ZooKeeper
integration, which will monitor the region servers (and the regions they serve)
using ephemeral nodes. At that point, if the master dies and is restarted, it
will know which region servers are alive, which ones have died, and all the
regions that are currently being served. Then it will know which region server
logs to recover and which ones can be ignored (because the region server
writing them is still alive).
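The decision described above could look roughly like the following. This is
only a sketch under stated assumptions, not the actual fix: the function names
are hypothetical, the set of live servers is passed in as plain data (in the
proposed design it would come from the ephemeral entries in ZooKeeper), and the
directory layout log_<address>_<startcode>_<port> is inferred from the example
path in the issue description.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical server identity: (address, port, startcode).
ServerId = Tuple[str, int, int]


def parse_log_dir(name: str) -> ServerId:
    """Parse a log directory name, assumed to be log_<address>_<startcode>_<port>."""
    # Split from the right so an address containing "_" stays in one piece.
    head, startcode, port = name.rsplit("_", 2)
    if not head.startswith("log_"):
        raise ValueError("not an HLog directory name: " + name)
    return head[len("log_"):], int(port), int(startcode)


def split_log_dirs(
    log_dirs: Iterable[str], live_servers: Set[ServerId]
) -> Tuple[List[str], List[str]]:
    """Partition log directories into (recover, ignore).

    A directory is ignored when the server instance that writes it is still
    alive; every other directory belongs to a dead instance and is recovered.
    """
    recover: List[str] = []
    ignore: List[str] = []
    for name in log_dirs:
        if parse_log_dir(name) in live_servers:
            ignore.append(name)
        else:
            recover.append(name)
    return recover, ignore


if __name__ == "__main__":
    # Illustrative values only: one live instance, one log left by a dead one.
    live = {("10.0.0.2", 60020, 1213228101021)}
    dirs = [
        "log_10.0.0.1_1213228000000_60020",
        "log_10.0.0.2_1213228101021_60020",
    ]
    recover, ignore = split_log_dirs(dirs, live)
    print("recover:", recover)
    print("ignore:", ignore)
```

Including the startcode in the server identity matters here: a region server
restarted on the same address and port gets a new startcode, so the log left
behind by its previous incarnation is still classified as needing recovery.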
> HLog recovery is not performed after master failure
> ---------------------------------------------------
>
> Key: HBASE-698
> URL: https://issues.apache.org/jira/browse/HBASE-698
> Project: Hadoop HBase
> Issue Type: Sub-task
> Components: master, regionserver
> Affects Versions: 0.1.2
> Reporter: Clint Morgan
> Assignee: Jim Kellerman
> Priority: Blocker
> Fix For: 0.19.0
>
>
> I have a local cluster running, and it's logging to
> <hbase>/log_X.X.X.X_1213228101021_60020/
> Then I kill both master and regionserver, and restart. Looking through
> the logs I don't see anything about trying to recover from this hlog,
> it just creates a new hlog alongside the existing one (with a new
> startcode). The older hlog seems to be ignored, and the tables
> created in the initial session are all gone.