[ https://issues.apache.org/jira/browse/HBASE-15984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Busbey updated HBASE-15984:
--------------------------------
    Release Note:
In some particular deployments, the Replication code believes it has reached EOF for a WAL prior to successfully parsing all bytes known to exist in a cleanly closed file.

If an EOF is detected due to parsing or other errors while there are still unparsed bytes before the end-of-file trailer, we now reset the WAL to the very beginning and attempt a clean read-through.

Because we will retry these failures indefinitely, two additional changes are made to help with diagnostics:

* On each retry attempt, a log message like the below will be emitted at the WARN level:

    Processing end of WAL file '{}'. At position {}, which is too far away from reported file length {}. Restarting WAL reading (see HBASE-15983 for details).

* Additional metrics measure the use of this recovery mechanism. They are described in the reference guide.

          Status: Patch Available  (was: In Progress)

> Given failure to parse a given WAL that was closed cleanly, replay the WAL.
> ---------------------------------------------------------------------------
>
>                 Key: HBASE-15984
>                 URL: https://issues.apache.org/jira/browse/HBASE-15984
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Replication
>            Reporter: Sean Busbey
>            Assignee: Sean Busbey
>            Priority: Critical
>             Fix For: 2.0.0, 1.0.4, 1.4.0, 1.3.1, 0.98.22, 1.1.7, 1.2.4
>
>         Attachments: HBASE-15984.1.patch, HBASE-15984.2.patch
>
>
> subtask for a general work around for "underlying reader failed / is in a bad
> state" just for the case where a WAL 1) was closed cleanly and 2) we can tell
> that our current offset ought not be the end of parseable entries.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
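
The recovery rule in the release note above can be illustrated with a small sketch. This is not the code from the attached patches: the class `WalEofCheck`, its method names, and the `trailerLength` parameter are all hypothetical, chosen only to show the decision "EOF was reported, the file was closed cleanly, and unparsed bytes remain before the trailer, so restart from the beginning". Only the WARN message text is taken from the release note.

```java
// Hypothetical illustration only; these names are NOT HBase's actual classes or methods.
public final class WalEofCheck {
    private WalEofCheck() {}

    /**
     * Returns true when an EOF reported at `position` looks premature:
     * the WAL was closed cleanly (so its length can be trusted) and there
     * are still bytes between `position` and the start of the trailer.
     * `trailerLength` is an assumed parameter for this sketch.
     */
    public static boolean shouldRestart(boolean closedCleanly, long position,
                                        long fileLength, long trailerLength) {
        if (!closedCleanly) {
            return false; // length is not reliable for a WAL that was not closed cleanly
        }
        long trailerStart = fileLength - trailerLength;
        return position < trailerStart;
    }

    /** Position to continue reading from: 0 on restart, otherwise unchanged. */
    public static long nextPosition(boolean closedCleanly, long position,
                                    long fileLength, long trailerLength) {
        return shouldRestart(closedCleanly, position, fileLength, trailerLength)
                ? 0L
                : position;
    }

    /** Builds the WARN text quoted in the release note, with placeholders filled in. */
    public static String warnMessage(String walName, long position, long fileLength) {
        return String.format(
            "Processing end of WAL file '%s'. At position %d, which is too far away from "
                + "reported file length %d. Restarting WAL reading (see HBASE-15983 for details).",
            walName, position, fileLength);
    }
}
```

Under these assumptions, an EOF at position 100 in a cleanly closed 1000-byte file with a 20-byte trailer would trigger a restart from position 0, while an EOF at position 980 would be accepted as the real end of the entries.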