[jira] [Commented] (HBASE-14028) DistributedLogReplay drops edits when ITBLL 125M

stack (JIRA) Tue, 07 Jul 2015 22:17:49 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-14028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617997#comment-14617997
 ]


stack commented on HBASE-14028:
-------------------------------

I have been playing more with this. Losing data is pretty easy to do. I am 
trying to find why the end of a WAL goes missing during replay, but there is 
not enough info in the logs to debug it and it is hard to trace where replay 
is at any one time. Trying to backfill that.

> DistributedLogReplay drops edits when ITBLL 125M
> ------------------------------------------------
>
>                 Key: HBASE-14028
>                 URL: https://issues.apache.org/jira/browse/HBASE-14028
>             Project: HBase
>          Issue Type: Bug
>          Components: Recovery
>    Affects Versions: 1.2.0
>            Reporter: stack
>
> Testing DLR before the 1.2.0 RC gets cut, we are dropping edits.
> The issue seems to be around replay into a deployed region that is on a 
> server that dies before all edits have finished replaying. Logging is sparse 
> on sequenceid accounting, so I can't tell for sure how it is happening (or 
> whether our accounting by Store, as we now do it, is messing up DLR). Digging.
> I notice also that DLR does not refresh its cache of region locations on 
> error; it just keeps retrying until the whole WAL fails (8 retries, about 30 
> seconds). We could do a bit of refactoring and have the replay find the 
> region in its new location if it moved during DLR replay.
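
A rough sketch of the refactor suggested in the description: drop the cached 
region location after a failed replay attempt and look it up again before the 
next try, rather than retrying against the same dead server. This is not HBase 
code. ReplayRetrySketch, EditSink, locate and replayWithRefresh are all 
hypothetical names, the locator function stands in for whatever authoritative 
lookup the replay would really use, and backoff between attempts is left out.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch only; none of these names exist in HBase.
class ReplayRetrySketch {

    /** Stand-in for shipping a batch of WAL edits for a region to a server. */
    interface EditSink {
        void replay(String server, String region) throws Exception;
    }

    // region name -> server name, as last seen by this replayer
    private final Map<String, String> locationCache = new HashMap<>();
    // Assumed authoritative lookup used on a cache miss or forced refresh.
    private final Function<String, String> locator;

    ReplayRetrySketch(Function<String, String> locator) {
        this.locator = locator;
    }

    /** Returns the cached location, discarding it first when refresh is set. */
    String locate(String region, boolean refresh) {
        if (refresh) {
            locationCache.remove(region);
        }
        return locationCache.computeIfAbsent(region, locator);
    }

    /**
     * Tries the replay up to maxRetries times. After any failure the next
     * attempt re-resolves the region location instead of reusing the cache.
     */
    boolean replayWithRefresh(String region, EditSink sink, int maxRetries) {
        boolean refresh = false;
        for (int attempt = 0; attempt < maxRetries; attempt++) {
            String server = locate(region, refresh);
            try {
                sink.replay(server, region);
                return true;
            } catch (Exception e) {
                // The region may have moved; force a fresh lookup next time.
                refresh = true;
            }
        }
        return false;
    }
}
```

With a loop of this shape, a region that moves mid-replay costs one failed 
attempt and one extra lookup instead of burning every retry on the old server.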



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
