[ https://issues.apache.org/jira/browse/HBASE-17471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15833874#comment-15833874 ]

Yu Li commented on HBASE-17471:
-------------------------------

bq. I'd like to say, separating them can truly resolve a lot of problems.

Does this mean we need to fix more problems if we stick with the current mvcc-seqid-merge approach? Please tell us more about the problems. I think we need to compare a) the advantage of merging mvcc and sequence-id and b) the effort to fix all the problems introduced.

Ping [~stack], for your special notice sir. [~allan163] works in [~zjushch]'s team and uses HBase as a *common* service for Alibaba group, while my team uses HBase as a *special* service (focusing on search and recommendation) for Alibaba search, so they're experiencing more common workloads/use-cases and may encounter more issues than us in their production environment (for example, our scenario has no append/increment requests but theirs does).


> Region Seqid will be out of order in WAL if using mvccPreAssign
> ---------------------------------------------------------------
>
>                 Key: HBASE-17471
>                 URL: https://issues.apache.org/jira/browse/HBASE-17471
>             Project: HBase
>          Issue Type: Bug
>          Components: wal
>    Affects Versions: 2.0.0, 1.4.0
>            Reporter: Allan Yang
>            Assignee: Allan Yang
>            Priority: Critical
>         Attachments: HBASE-17471.patch, HBASE-17471.tmp, HBASE-17471.v2.patch, HBASE-17471.v3.patch
>
>
> mvccPreAssign was introduced by HBASE-16698, which truly improved write
> performance, especially in the ASYNC_WAL scenario. But mvccPreAssign
> is only used in {{doMiniBatchMutate}}, not in the Increment/Append path. If
> Increment/Append and batch put run against the same region in parallel,
> the seqids of that region may not increase monotonically in the WAL,
> since one write path acquires the mvcc/seqid before the append and the other
> acquires it in the append/sync consumer thread.
> The out-of-order situation can be easily reproduced by a simple UT, which is
> attached to this issue.
> I modified the code to assert on the disorder:
> {code}
>     if(this.highestSequenceIds.containsKey(encodedRegionName)) {
>       assert highestSequenceIds.get(encodedRegionName) < sequenceid;
>     }
> {code}
> I'd like to say, if we allow disorder in WALs, then this is not an issue.
> But as far as I know, if {{highestSequenceIds}} is not properly set, some
> WALs may not be archived to oldWALs correctly.
> What I haven't figured out yet is whether disorder in the WAL will cause data
> loss when recovering from a disaster. If so, then it is a big problem that
> needs to be fixed.
> I have fixed this problem in our custom 1.1.x branch; my solution is to use
> mvccPreAssign everywhere, making it un-configurable, since mvccPreAssign
> is indeed a better way than assigning the seqid in the ringbuffer thread while
> keeping handlers waiting for it.
> If anyone thinks it is doable, then I will port it to branch-1 and the master
> branch and upload it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
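
For readers who want to see the interleaving from the issue description concretely, here is a minimal single-threaded sketch. It is not HBase code: {{SeqIdOrderSketch}}, {{Entry}}, {{replayInterleaving}} and {{isMonotonic}} are hypothetical names, a plain queue stands in for the ring buffer, and a single counter stands in for the region's mvcc/seqid. It replays one fixed ordering in which the batch put takes its seqid up front but the increment is published first and only gets its seqid in the consumer, then applies the same kind of check as the {{highestSequenceIds}} assertion.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.atomic.AtomicLong;

class SeqIdOrderSketch {
    static final long UNASSIGNED = -1L;

    /** Hypothetical stand-in for a WAL entry; not an HBase class. */
    static final class Entry {
        final String region;
        long seqId;
        Entry(String region, long seqId) { this.region = region; this.seqId = seqId; }
    }

    /** Replays one fixed interleaving and returns the seqids in WAL order. */
    static List<Long> replayInterleaving() {
        AtomicLong regionSeqId = new AtomicLong(0);
        Queue<Entry> ringBuffer = new ArrayDeque<>();

        // Batch put handler: takes its seqid before appending (pre-assign path).
        Entry put = new Entry("r1", regionSeqId.incrementAndGet());
        // The increment handler publishes first, with no seqid assigned yet.
        ringBuffer.add(new Entry("r1", UNASSIGNED));
        // Only now does the batch put reach the queue.
        ringBuffer.add(put);

        // Consumer: assigns a seqid only to entries that do not have one.
        List<Long> walOrder = new ArrayList<>();
        for (Entry e : ringBuffer) {
            if (e.seqId == UNASSIGNED) {
                e.seqId = regionSeqId.incrementAndGet();
            }
            walOrder.add(e.seqId);
        }
        return walOrder;
    }

    /** Same idea as the quoted assertion, returning a flag instead of asserting. */
    static boolean isMonotonic(String region, List<Long> walOrder) {
        Map<String, Long> highestSequenceIds = new HashMap<>();
        for (long seqId : walOrder) {
            Long highest = highestSequenceIds.get(region);
            if (highest != null && highest >= seqId) {
                return false;
            }
            highestSequenceIds.put(region, seqId);
        }
        return true;
    }

    public static void main(String[] args) {
        List<Long> walOrder = replayInterleaving();
        System.out.println("WAL order: " + walOrder
            + ", monotonic: " + isMonotonic("r1", walOrder));
    }
}
```

In this replay the increment is written with seqid 2 ahead of the put with seqid 1, so the check reports a non-monotonic WAL. The sketch only illustrates the ordering hazard; it makes no claim about how the attached patches resolve it.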