[ https://issues.apache.org/jira/browse/HBASE-15984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Busbey updated HBASE-15984:
--------------------------------
Attachment: HBASE-15984.1.patch
-01
- add TRACE level messages with file offsets to aid debugging
- check for cleanly closed files and offset when handling EOF in replication
- if we can detect that things are not right with handling a WAL file, retry
A downside to this approach is that a corrupt WAL file that was cleanly closed
would cause us to loop until an operator can intervene. But I can't think of a
way to avoid that without introducing automated data loss. To mitigate this,
a WARN is logged whenever a restart happens, and it points to the parent issue.
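
For readers who have not opened the patch, the decision described above can be
sketched roughly as follows. This is not the code in HBASE-15984.1.patch; the
class, enum, method, and parameter names are hypothetical, and it assumes the
caller already knows whether the WAL was closed cleanly and what offset the
parseable entries ought to end at. What happens for a file that was not closed
cleanly is outside this subtask, so the sketch just defers.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Illustrative only: none of these names come from the HBase code base.
final class WalEofSketch {
    private static final Logger LOG = Logger.getLogger(WalEofSketch.class.getName());

    /** Hypothetical outcomes of hitting EOF while replicating a WAL. */
    enum Action { MOVE_TO_NEXT_WAL, REPLAY_WAL, DEFER }

    /**
     * @param closedCleanly assumed to be known by the caller
     * @param currentOffset where the reader stopped
     * @param expectedEnd   assumed offset where parseable entries should end
     */
    static Action onEof(boolean closedCleanly, long currentOffset, long expectedEnd) {
        // java.util.logging has no TRACE level; FINEST stands in for it here.
        LOG.log(Level.FINEST, "EOF at offset {0}, expected end {1}, closedCleanly={2}",
            new Object[] {currentOffset, expectedEnd, closedCleanly});

        if (!closedCleanly) {
            // Not covered by this subtask: leave it to whatever handling already exists.
            return Action.DEFER;
        }
        if (currentOffset < expectedEnd) {
            // Cleanly closed, yet we stopped short: something is wrong with the reader.
            // A corrupt but cleanly closed file will keep landing here until an
            // operator intervenes, hence the WARN.
            LOG.warning("Stopped at offset " + currentOffset + " before expected end "
                + expectedEnd + " of a cleanly closed WAL; replaying. See the parent issue.");
            return Action.REPLAY_WAL;
        }
        return Action.MOVE_TO_NEXT_WAL;
    }
}
```

With these assumptions, onEof(true, 100L, 200L) asks for a replay, while
onEof(true, 200L, 200L) moves on to the next WAL.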
> Given failure to parse a given WAL that was closed cleanly, replay the WAL.
> ---------------------------------------------------------------------------
>
> Key: HBASE-15984
> URL: https://issues.apache.org/jira/browse/HBASE-15984
> Project: HBase
> Issue Type: Sub-task
> Components: Replication
> Reporter: Sean Busbey
> Assignee: Sean Busbey
> Priority: Critical
> Fix For: 2.0.0, 1.3.0, 1.0.4, 1.4.0, 1.2.2, 0.98.20, 1.1.6
>
> Attachments: HBASE-15984.1.patch
>
>
> subtask for a general workaround for "underlying reader failed / is in a bad
> state" just for the case where a WAL 1) was closed cleanly and 2) we can tell
> that our current offset ought not be the end of parseable entries.