[
https://issues.apache.org/jira/browse/HBASE-10714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13926191#comment-13926191
]
Himanshu Vashishtha commented on HBASE-10714:
---------------------------------------------
One issue I see is that a SyncRunner may be invoking writer.sync() while log
rolling is concurrently replacing (and closing) the writer. This can happen
because there is more than one SyncRunner.
Let's say LogRolling issues a safePoint request in the replaceWriter method,
with a "marker" SyncFuture with sequenceId 150. The five SyncRunners could be
syncing their different sequenceIds (for sake of simplicity, assume SyncRunner1
is syncing sequenceId from 101-110, SyncRunner2 is doing from 111-120, and so
on), and SyncRunner5 is doing 141-150. If SyncRunner5 happens to finish before
other SyncRunner(s), it would update the highestSyncedSequence to 150, and
replaceWriter would start replacing the writer instance underneath. But, other
SyncRunners could be in the middle of their sync() call, and they would see the
writer being closed in the middle of the call.
I wonder whether we should ensure that attainSafePoint means "all" the
SyncRunners are done with their calls, rather than relying only on
highestSyncedSequence.
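
One way to picture "all SyncRunners are done" is a read/write lock around the
writer. This is only a sketch of the idea, not the HBase code and not the
attached patch. The names SketchWriter, SafePointWal and SafePointDemo are
hypothetical, and the lock is a swapped-in technique standing in for whatever
attainSafePoint would actually do. Each sync holds the read lock for the
duration of writer.sync(), and the roll takes the write lock, so the close
cannot start while any sync is still in flight.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for the WAL writer; not an HBase interface.
interface SketchWriter {
    void sync();
    void close();
}

// Hypothetical sketch: the safe point is "no sync holds the read lock".
final class SafePointWal {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private SketchWriter writer; // guarded by lock

    SafePointWal(SketchWriter initial) {
        this.writer = initial;
    }

    // Called by each SyncRunner thread. Many syncs may run concurrently.
    void sync() {
        lock.readLock().lock();
        try {
            writer.sync();
        } finally {
            lock.readLock().unlock();
        }
    }

    // Called by the log roller. Blocks until every in-flight sync returns.
    void replaceWriter(SketchWriter next) {
        lock.writeLock().lock();
        try {
            writer.close();
            writer = next;
        } finally {
            lock.writeLock().unlock();
        }
    }
}

final class SafePointDemo {
    static SketchWriter named(String name) {
        return new SketchWriter() {
            public void sync()  { System.out.println(name + ": sync"); }
            public void close() { System.out.println(name + ": close"); }
        };
    }

    public static void main(String[] args) throws InterruptedException {
        SafePointWal wal = new SafePointWal(named("writer-1"));
        Thread[] runners = new Thread[5];
        for (int i = 0; i < runners.length; i++) {
            runners[i] = new Thread(wal::sync, "SyncRunner" + (i + 1));
            runners[i].start();
        }
        wal.replaceWriter(named("writer-2")); // waits for in-flight syncs
        for (Thread t : runners) {
            t.join();
        }
    }
}
```

The order of the printed lines varies between runs, but no sync on writer-1 can
overlap its close. The cost is that new syncs are held off for the duration of
the roll, which may or may not be acceptable for the real WAL.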
> SyncFuture hangs when sequence is 0
> -----------------------------------
>
> Key: HBASE-10714
> URL: https://issues.apache.org/jira/browse/HBASE-10714
> Project: HBase
> Issue Type: Bug
> Components: wal
> Reporter: Jimmy Xiang
> Assignee: Jimmy Xiang
> Fix For: 0.99.0
>
> Attachments: hbase-10714.patch
>
>
> In SyncFuture, NOT_DONE = 0. The initial value of the ringBuffer is -1. So
> ringBuffer.next() gives 0 for the first call. If we create a SyncFuture with
> sequence = 0, even after we mark it done (i.e. doneSequence = 0), it is still
> not done since doneSequence == NOT_DONE == 0. Can we set NOT_DONE to -1, and
> the initial doneSequence to -2?
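
The sentinel collision described in the quoted issue can be reproduced with a
small stand-in class. This is a simplified sketch, not the real SyncFuture.
SentinelFuture and SentinelDemo are hypothetical names, and the sentinel is
passed to the constructor only so both values can be compared side by side. It
models the NOT_DONE part of the proposal only.

```java
// Hypothetical, simplified stand-in for SyncFuture; not the HBase class.
final class SentinelFuture {
    private final long notDone;  // sentinel meaning "not completed yet"
    private long doneSequence;   // starts out equal to the sentinel

    SentinelFuture(long notDone) {
        this.notDone = notDone;
        this.doneSequence = notDone;
    }

    // Record the sequence id at which the sync completed.
    synchronized void done(long sequence) {
        doneSequence = sequence;
    }

    // Completed only if doneSequence no longer equals the sentinel.
    synchronized boolean isDone() {
        return doneSequence != notDone;
    }
}

final class SentinelDemo {
    public static void main(String[] args) {
        // Sentinel 0 collides with the first sequence id, which is also 0.
        SentinelFuture buggy = new SentinelFuture(0L);
        buggy.done(0L);
        System.out.println("NOT_DONE = 0:  isDone = " + buggy.isDone()); // false

        // Sentinel -1 can never equal a non-negative sequence id.
        SentinelFuture fixed = new SentinelFuture(-1L);
        fixed.done(0L);
        System.out.println("NOT_DONE = -1: isDone = " + fixed.isDone()); // true
    }
}
```

With the sentinel at 0, a waiter polling isDone() on a future completed at
sequence 0 would never see it finish, which matches the hang in the issue
title.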