[
https://issues.apache.org/jira/browse/HBASE-17471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15823338#comment-15823338
]
Allan Yang commented on HBASE-17471:
------------------------------------
Thank you, Ted. I have changed the test name and the path name. Sorry, this
UT is not a 'real' UT: it runs forever and is only meant to show that this
issue can happen. In the current system, disorder in the WAL does not raise
an exception and does not fail the writes, so I had to add an assertion to
surface this scenario. A proper UT for this situation will need more thought.
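
For illustration only, the kind of check mentioned in this comment could be sketched as below. This is a minimal sketch, not the attached patch and not HBase code: the class name SeqIdOrderTracker and its methods are hypothetical, and it only assumes that each WAL append reports an encoded region name and a sequence id. It returns a flag instead of asserting, so a test could count disorder events without aborting.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper, not HBase code: tracks the highest sequence id seen per region. */
class SeqIdOrderTracker {
    private final Map<String, Long> highestSequenceIds = new ConcurrentHashMap<>();

    /**
     * Records sequenceId for the region and reports whether it arrived in order.
     * Returns false when sequenceId is not greater than the highest id already recorded.
     */
    boolean record(String encodedRegionName, long sequenceId) {
        boolean[] inOrder = {true};
        // merge() runs the remapping function atomically for the given key.
        highestSequenceIds.merge(encodedRegionName, sequenceId, (prev, cur) -> {
            inOrder[0] = prev < cur;
            return Math.max(prev, cur);
        });
        return inOrder[0];
    }

    /** Highest sequence id recorded for the region, or -1 if none was recorded. */
    long highest(String encodedRegionName) {
        return highestSequenceIds.getOrDefault(encodedRegionName, -1L);
    }
}
```

A test built on something like this could fail after a bounded number of operations as soon as record() returns false, rather than running forever.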
> Region Seqid will be out of order in WAL if using mvccPreAssign
> ---------------------------------------------------------------
>
> Key: HBASE-17471
> URL: https://issues.apache.org/jira/browse/HBASE-17471
> Project: HBase
> Issue Type: Bug
> Components: wal
> Affects Versions: 2.0.0, 1.4.0
> Reporter: Allan Yang
> Assignee: Allan Yang
> Attachments: HBASE-17471.tmp
>
>
> mvccPreAssign was introduced by HBASE-16698, which truly improved write
> performance, especially in the ASYNC_WAL scenario. But mvccPreAssign is only
> used in {{doMiniBatchMutate}}, not in the Increment/Append path. If
> Increment/Append and batch puts run against the same region in parallel, the
> seqids of that region may not increase monotonically in the WAL, because one
> write path acquires the mvcc/seqid before the append while the other acquires
> it in the append/sync consumer thread.
> The out-of-order situation can easily be reproduced by a simple UT, which is
> attached. I modified the code to assert on the disorder:
> {code}
> if (this.highestSequenceIds.containsKey(encodedRegionName)) {
>   assert highestSequenceIds.get(encodedRegionName) < sequenceid;
> }
> {code}
> If we allow disorder in WALs, then this is not an issue. But as far as I
> know, if {{highestSequenceIds}} is not set properly, some WALs may not be
> archived to oldWALs correctly.
> What I haven't figured out yet is whether disorder in the WAL can cause data
> loss when recovering from a disaster. If so, it is a big problem that needs
> to be fixed.
> I have fixed this problem in our custom 1.1.x branch. My solution is to use
> mvccPreAssign everywhere and make it non-configurable, since mvccPreAssign
> is indeed a better approach than assigning the seqid in the ring buffer
> thread while keeping handlers waiting for it.
> If anyone thinks this is doable, I will port it to branch-1 and master and
> upload it.
>
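
The interleaving described in the issue text can be illustrated with a single-threaded toy model. This is a sketch under stated assumptions, not HBase code: the class PreAssignInterleaving and all of its methods are hypothetical, and one shared counter stands in for the region's mvcc/seqid source. It replays one possible schedule in which a batch put takes its seqid before appending while an increment takes its seqid at append time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

/** Toy model, not HBase code: two write paths sharing one seqid counter and one log. */
class PreAssignInterleaving {
    private final AtomicLong nextSeqId = new AtomicLong(1);
    private final List<Long> wal = new ArrayList<>();

    /** Pre-assign path: the handler takes a seqid now and appends later. */
    long preAssign() {
        return nextSeqId.getAndIncrement();
    }

    void appendPreAssigned(long seqId) {
        wal.add(seqId);
    }

    /** Late-assign path: the seqid is taken at the moment of the append. */
    void appendWithLateAssign() {
        wal.add(nextSeqId.getAndIncrement());
    }

    List<Long> walOrder() {
        return wal;
    }

    /** Replays one schedule in which the two paths interleave. */
    static List<Long> demo() {
        PreAssignInterleaving m = new PreAssignInterleaving();
        long batchSeqId = m.preAssign();   // batch put is given seqid 1
        m.appendWithLateAssign();          // increment takes seqid 2 and is appended first
        m.appendPreAssigned(batchSeqId);   // batch put's entry is appended afterwards
        return m.walOrder();               // [2, 1]
    }
}
```

In this model the log order is [2, 1]. If both paths used the same assignment strategy, this particular schedule would not produce the inversion.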
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)