[
https://issues.apache.org/jira/browse/HBASE-15984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sean Busbey updated HBASE-15984:
--------------------------------
Release Note:
In some particular deployments, the Replication code believes it has
reached EOF for a WAL prior to successfully parsing all bytes known to
exist in a cleanly closed file.
If an EOF is detected due to parsing or other errors while there are still
unparsed bytes before the end-of-file trailer, we now reset the WAL to the very
beginning and attempt a clean read-through. Because we will retry these
failures indefinitely, two additional changes are made to help with diagnostics:
* On each retry attempt, a log message like the following will be emitted at
the WARN level:
Processing end of WAL file '{}'. At position {}, which is too far away
from reported file length {}. Restarting WAL reading (see HBASE-15983
for details).
* Additional metrics measure the use of this recovery mechanism. They are
described in the reference guide.
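The decision described above can be sketched roughly as follows. This is only an illustration, not the code in the attached patches: the class WalEofCheck, its methods, and the use of a trailer start offset are hypothetical names and assumptions chosen for this example. The sketch assumes the reader knows where the end-of-file trailer begins for a cleanly closed WAL, restarts from offset 0 when EOF is reported before that point, and counts restarts as a stand-in for the new metrics.

```java
import java.util.OptionalLong;

/** Hypothetical helper for illustration; not the actual HBase replication code. */
final class WalEofCheck {
    // Stand-in for the recovery metrics mentioned in the release note.
    private long restartCount = 0;

    /**
     * Returns true when EOF was reported before the trailer of a cleanly
     * closed WAL, i.e. there are still unparsed bytes.
     *
     * @param position     offset at which the reader reported EOF
     * @param trailerStart offset where the end-of-file trailer begins;
     *                     empty if the file has no trailer (not closed cleanly)
     */
    static boolean shouldRestart(long position, OptionalLong trailerStart) {
        return trailerStart.isPresent() && position < trailerStart.getAsLong();
    }

    /**
     * Handles a reported EOF and returns the offset to continue reading from:
     * 0 to restart the WAL from the beginning, or the current position otherwise.
     */
    long onEof(String walName, long position, OptionalLong trailerStart, long fileLength) {
        if (shouldRestart(position, trailerStart)) {
            restartCount++;
            // Mirrors the WARN message quoted in the release note.
            System.err.printf(
                "WARN Processing end of WAL file '%s'. At position %d, which is too far away "
                    + "from reported file length %d. Restarting WAL reading (see HBASE-15983 "
                    + "for details).%n",
                walName, position, fileLength);
            return 0L;
        }
        return position;
    }

    long restartCount() {
        return restartCount;
    }
}
```

Because the retry is indefinite, a caller built on this sketch would keep invoking onEof after each read-through; the counter and the WARN line are then the only signals that a WAL is being re-read repeatedly.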
Status: Patch Available (was: In Progress)
> Given failure to parse a given WAL that was closed cleanly, replay the WAL.
> ---------------------------------------------------------------------------
>
> Key: HBASE-15984
> URL: https://issues.apache.org/jira/browse/HBASE-15984
> Project: HBase
> Issue Type: Sub-task
> Components: Replication
> Reporter: Sean Busbey
> Assignee: Sean Busbey
> Priority: Critical
> Fix For: 2.0.0, 1.0.4, 1.4.0, 1.3.1, 0.98.22, 1.1.7, 1.2.4
>
> Attachments: HBASE-15984.1.patch, HBASE-15984.2.patch
>
>
> Subtask for a general workaround for "underlying reader failed / is in a bad
> state", just for the case where a WAL 1) was closed cleanly and 2) we can tell
> that our current offset ought not to be the end of parseable entries.