[ 
https://issues.apache.org/jira/browse/HBASE-15984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-15984:
--------------------------------
    Release Note: 
In some particular deployments, the Replication code believes it has
reached EOF for a WAL prior to successfully parsing all bytes known to
exist in a cleanly closed file.

If an EOF is detected due to parsing or other errors while there are still 
unparsed bytes before the end-of-file trailer, we now reset the WAL to the very 
beginning and attempt a clean read-through. Because we will retry these 
failures indefinitely, two additional changes are made to help with diagnostics:

* On each retry attempt, a log message like the following is emitted at the 
WARN level:
    
      Processing end of WAL file '{}'. At position {}, which is too far away
      from reported file length {}. Restarting WAL reading (see HBASE-15983
      for details).

* Additional metrics measure the use of this recovery mechanism. They are 
described in the reference guide.
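
The check described above can be illustrated with a minimal sketch. This is not the code from the attached patches: the names `WalReadState`, `should_restart`, `handle_apparent_eof`, and the `trailer_length` field are hypothetical and exist only to show the decision "apparent EOF with unparsed bytes before the trailer means reset to the beginning and count the retry".

```python
import logging
from dataclasses import dataclass

LOG = logging.getLogger("wal-replay-sketch")  # hypothetical logger name


@dataclass
class WalReadState:
    """Hypothetical reader state; field names are assumptions, not HBase's."""
    position: int        # offset reached when the reader reported EOF
    file_length: int     # reported length of the cleanly closed WAL
    trailer_length: int  # size of the end-of-file trailer
    retries: int = 0     # stand-in for the new recovery metrics


def should_restart(state: WalReadState) -> bool:
    """True when unparsed bytes remain before the end-of-file trailer."""
    end_of_entries = state.file_length - state.trailer_length
    return state.position < end_of_entries


def handle_apparent_eof(state: WalReadState, path: str) -> bool:
    """Reset to the start of the WAL if the reported EOF looks premature.

    Returns True if a restart was scheduled, False if the EOF is accepted.
    """
    if not should_restart(state):
        return False
    # Mirrors the WARN message quoted in the release note.
    LOG.warning(
        "Processing end of WAL file '%s'. At position %s, which is too far "
        "away from reported file length %s. Restarting WAL reading (see "
        "HBASE-15983 for details).",
        path, state.position, state.file_length,
    )
    state.position = 0   # attempt a clean read-through from the beginning
    state.retries += 1   # record one use of the recovery mechanism
    return True
```

Because the retry is unbounded, a caller built on this sketch would keep invoking the check after each read-through, which is why the counter and the WARN message matter for diagnostics.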
          Status: Patch Available  (was: In Progress)

> Given failure to parse a given WAL that was closed cleanly, replay the WAL.
> ---------------------------------------------------------------------------
>
>                 Key: HBASE-15984
>                 URL: https://issues.apache.org/jira/browse/HBASE-15984
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Replication
>            Reporter: Sean Busbey
>            Assignee: Sean Busbey
>            Priority: Critical
>             Fix For: 2.0.0, 1.0.4, 1.4.0, 1.3.1, 0.98.22, 1.1.7, 1.2.4
>
>         Attachments: HBASE-15984.1.patch, HBASE-15984.2.patch
>
>
> Subtask for a general workaround for "underlying reader failed / is in a bad 
> state", limited to the case where a WAL 1) was closed cleanly and 2) we can tell 
> that our current offset ought not to be the end of parseable entries.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
