[
https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837109#comment-13837109
]
Jeffrey Zhong commented on HBASE-8763:
--------------------------------------
Today I discussed this topic with [~enis] and [[email protected]], and we
found that it might be possible to handle this issue in a simpler way. The
steps are:
1) Insert into the memstore using Long.MAX_VALUE as the initial write number
2) Append to the WAL without syncing
3) Sync the WAL
4) Update the WriteEntry's write number to the sequence number returned from
Step 2
5) CompleteMemstoreInsert. In this step, advance the current read point so
that it is >= the sequence number from Step 2. The reasoning is that once we
have synced up to that sequence number, all changes with smaller sequence
numbers are already synced into the WAL. Therefore, we should be able to bump
the read point up to the last sequence number synced.
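
The five steps above can be sketched roughly as follows. This is only an
illustration of the idea, not the patch: SeqIdMvccSketch, WriteEntry,
appendNoSync, sync and isVisible are hypothetical names, and the WAL is
replaced by an in-memory counter, so sync() does nothing here.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of the proposed flow; none of these names are HBase APIs. */
final class SeqIdMvccSketch {
    /** Placeholder write number, used until the WAL has assigned a sequence number. */
    static final long NO_WRITE_NUMBER = Long.MAX_VALUE;

    static final class WriteEntry {
        volatile long writeNumber = NO_WRITE_NUMBER;
    }

    private final AtomicLong lastSeqId = new AtomicLong(0);  // stand-in for the WAL sequence id
    private final AtomicLong readPoint = new AtomicLong(0);

    /** Step 1: the memstore insert starts with Long.MAX_VALUE, so no reader can see it yet. */
    WriteEntry beginMemstoreInsert() {
        return new WriteEntry();
    }

    /** Step 2: stand-in for a WAL append without sync; returns the assigned sequence number. */
    long appendNoSync() {
        return lastSeqId.incrementAndGet();
    }

    /** Step 3: stand-in for a WAL sync; a real sync would block until seqId is durable. */
    void sync(long seqId) {
        // Intentionally empty in this sketch.
    }

    /** Steps 4 and 5: record the real write number, then advance the read point to >= seqId. */
    void completeMemstoreInsert(WriteEntry entry, long seqId) {
        entry.writeNumber = seqId;
        readPoint.accumulateAndGet(seqId, Math::max);  // never moves the read point backwards
    }

    long readPoint() {
        return readPoint.get();
    }

    /** A reader sees an entry only if its write number is at or below the read point. */
    boolean isVisible(WriteEntry entry) {
        return entry.writeNumber <= readPoint.get();
    }
}
```

Note that in this sketch an entry that has been appended but not yet completed
keeps Long.MAX_VALUE as its write number, so it stays invisible even if a later
writer has already advanced the read point past its sequence number.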
Currently, we maintain an internal queue, which can defer the read point bump
if transactions complete in a different order than they appear in the MVCC
internal write queue.
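
For comparison, here is a simplified model of that queue-based behavior. It is
an assumption-laden sketch, not the actual MVCC class: QueuedMvccSketch, begin
and complete are hypothetical names, and a single coarse lock stands in for the
real synchronization.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical model of a queue-based MVCC; not the real HBase implementation. */
final class QueuedMvccSketch {
    static final class WriteEntry {
        final long writeNumber;
        boolean completed;

        WriteEntry(long writeNumber) {
            this.writeNumber = writeNumber;
        }
    }

    private final Deque<WriteEntry> writeQueue = new ArrayDeque<>();
    private long nextWriteNumber = 0;
    private long readPoint = 0;

    /** Assigns a write number and enqueues the entry in assignment order. */
    synchronized WriteEntry begin() {
        WriteEntry entry = new WriteEntry(++nextWriteNumber);
        writeQueue.addLast(entry);
        return entry;
    }

    /**
     * Marks the entry completed, then advances the read point only across the
     * completed prefix of the queue. An entry that finishes before an older one
     * therefore does not move the read point until the older one finishes too.
     */
    synchronized void complete(WriteEntry entry) {
        entry.completed = true;
        while (!writeQueue.isEmpty() && writeQueue.peekFirst().completed) {
            readPoint = writeQueue.removeFirst().writeNumber;
        }
    }

    synchronized long readPoint() {
        return readPoint;
    }
}
```

If two writers begin in the order A, B and B completes first, the read point
in this model stays where it was until A completes, at which point it jumps
past both entries.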
With the approach above, it should be possible to remove the logic that
maintains the writeQueue, which means we can remove two lock acquisitions and
one queue loop from the write code path. Sounds too good to be true :-). Let
me try to write a quick patch and run it against the unit tests to see whether
the idea could fly.
> [BRAINSTORM] Combine MVCC and SeqId
> -----------------------------------
>
> Key: HBASE-8763
> URL: https://issues.apache.org/jira/browse/HBASE-8763
> Project: HBase
> Issue Type: Improvement
> Components: regionserver
> Reporter: Enis Soztutar
> Attachments: hbase-8763_wip1.patch
>
>
> HBASE-8701 and a lot of recent issues include good discussions about mvcc +
> seqId semantics. It seems that having mvcc and the seqId complicates the
> comparator semantics a lot in regards to flush + WAL replay + compactions +
> delete markers and out of order puts.
> Thinking more about it I don't think we need a MVCC write number which is
> different than the seqId. We can keep the MVCC semantics, read point and
> smallest read points intact, but combine mvcc write number and seqId. This
> will allow cleaner semantics + implementation + smaller data files.
> We can do some brainstorming for 0.98. We still have to verify that this
> would be semantically correct, it should be so by my current understanding.
--
This message was sent by Atlassian JIRA
(v6.1#6144)