[ https://issues.apache.org/jira/browse/HBASE-17471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15823205#comment-15823205 ]
Ted Yu commented on HBASE-17471:
--------------------------------

I think we should maintain an increasing sequence Id.

About the test, it hangs here (with the assertion triggered):
{code}
"main" #1 prio=5 os_prio=31 tid=0x00007fb95300c000 nid=0x1703 in Object.wait() [0x0000700000218000]
   java.lang.Thread.State: WAITING (on object monitor)
  at java.lang.Object.wait(Native Method)
  - waiting on <0x000000078a5885a8> (a org.apache.hadoop.hbase.regionserver.TestSeqiqMonotonicallyIncr$IncThread1)
  at java.lang.Thread.join(Thread.java:1245)
  - locked <0x000000078a5885a8> (a org.apache.hadoop.hbase.regionserver.TestSeqiqMonotonicallyIncr$IncThread1)
  at java.lang.Thread.join(Thread.java:1319)
  at org.apache.hadoop.hbase.regionserver.TestSeqiqMonotonicallyIncr.testIncPut1(TestSeqiqMonotonicallyIncr.java:165)
{code}
TestSeqiqMonotonicallyIncr -> TestMonotonicallyIncreasingSeqId
{code}
  private static Path testDir = TEST_UTIL.getDataTestDir("TestStoreFileRefresherChore");
{code}
Change the name to match the test

> Region Seqid will be out of order in WAL if using mvccPreAssign
> ---------------------------------------------------------------
>
>                 Key: HBASE-17471
>                 URL: https://issues.apache.org/jira/browse/HBASE-17471
>             Project: HBase
>          Issue Type: Bug
>          Components: wal
>    Affects Versions: 2.0.0, 1.4.0
>            Reporter: Allan Yang
>            Assignee: Allan Yang
>         Attachments: HBASE-17471.tmp
>
>
> mvccPreAssign was introduced by HBASE-16698, which truly improved write
> performance, especially in the ASYNC_WAL scenario. But mvccPreAssign is only
> used in {{doMiniBatchMutate}}, not in the Increment/Append path. If
> Increment/Append and batch puts run against the same region in parallel, the
> seqids of that region may not increase monotonically in the WAL, since one
> write path acquires the mvcc/seqid before the append, and the other acquires
> it in the append/sync consumer thread.
> The out-of-order situation can easily be reproduced by a simple UT, which is
> attached to this issue.
> I modified the code to assert on the disorder:
> {code}
> if(this.highestSequenceIds.containsKey(encodedRegionName)) {
>   assert highestSequenceIds.get(encodedRegionName) < sequenceid;
> }
> {code}
> If we allow disorder in WALs, then this is not an issue.
> But as far as I know, if {{highestSequenceIds}} is not properly set, some
> WALs may not be archived to oldWALs correctly.
> What I haven't figured out yet is whether disorder in the WAL can cause data
> loss when recovering from a disaster. If so, it is a big problem that needs
> to be fixed.
> I have fixed this problem in our custom 1.1.x branch; my solution is to use
> mvccPreAssign everywhere, making it un-configurable, since mvccPreAssign
> is indeed a better way than assigning the seqid in the ringbuffer thread
> while keeping handlers waiting for it.
> If anyone thinks it is doable, I will port it to branch-1 and the master
> branch and upload it.
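
For readers following the thread, the interleaving described in the issue can be modelled without any HBase code. The sketch below is a simplified, single-threaded illustration and not the actual WAL implementation: the class {{SeqIdOrderSketch}}, the {{Entry}} type and the {{simulate}} method are hypothetical names made up for this example. It assumes one write path takes its seqid before queueing the edit (as with mvccPreAssign) while the other gets its seqid only when the consumer drains the queue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Simplified model of the two write paths; all names here are hypothetical.
public class SeqIdOrderSketch {
    static final long UNASSIGNED = -1L;

    /** A queued edit; seqId stays UNASSIGNED until some thread stamps it. */
    static final class Entry {
        final String path;
        long seqId;
        Entry(String path, long seqId) { this.path = path; this.seqId = seqId; }
    }

    /** Returns the seqids in the order the edits would be written to the log. */
    static List<Long> simulate() {
        AtomicLong nextSeqId = new AtomicLong(1);
        List<Entry> queue = new ArrayList<>();

        // Batch-put path: the handler takes its seqid BEFORE queueing (gets 1).
        Entry put = new Entry("batchPut", nextSeqId.getAndIncrement());

        // Before that put is queued, an increment is queued with no seqid yet.
        queue.add(new Entry("increment", UNASSIGNED));
        queue.add(put);

        // Consumer: stamps unassigned entries while draining, in queue order.
        List<Long> written = new ArrayList<>();
        for (Entry e : queue) {
            if (e.seqId == UNASSIGNED) {
                e.seqId = nextSeqId.getAndIncrement(); // the increment gets 2
            }
            written.add(e.seqId);
        }
        return written;
    }

    /** True if every seqid is greater than the one written before it. */
    static boolean isStrictlyIncreasing(List<Long> ids) {
        for (int i = 1; i < ids.size(); i++) {
            if (ids.get(i) <= ids.get(i - 1)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Long> written = simulate();
        // Prints: [2, 1] increasing=false
        System.out.println(written + " increasing=" + isStrictlyIncreasing(written));
    }
}
```

In this model, assigning the seqid at the same point on both paths (either always before queueing or always in the consumer) would restore the ordering, but only if taking the seqid and queueing the edit happen together as one step. That is the idea behind the proposal to use mvccPreAssign everywhere.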