[jira] [Commented] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId

Jeffrey Zhong (JIRA) Wed, 30 Apr 2014 22:32:24 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986360#comment-13986360
 ]


Jeffrey Zhong commented on HBASE-8763:
--------------------------------------

The reason for using a MutableLong object is that at the very beginning we 
don't know the real sync sequence number (due to the late binding), so I use a 
MutableLong object that holds a "fake" large sequence number. All new KVs in 
this transaction reference this MVCC MutableLong object. Once the 
corresponding WALEdit is synced (after the SyncOrDefer call), we have the real 
sequence number, and I reset the value of the MutableLong in one place so that 
all new KVs in the MemStore see the updated sequence number (because they keep 
a reference to this MVCC (MutableLong) instance).
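
To illustrate the late binding described above, here is a minimal Java sketch. 
SeqHolder, SketchKv, LateBindingSketch and FAKE_SEQ are hypothetical names 
used only for illustration; they are not HBase classes and this is not the 
code in the patch, which uses MutableLong and KeyValue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a mutable long holder such as MutableLong.
final class SeqHolder {
    private long value;
    SeqHolder(long value) { this.value = value; }
    long get() { return value; }
    void set(long value) { this.value = value; }
}

// Hypothetical, simplified KV: it stores a reference to the shared holder,
// not a copy of the number.
final class SketchKv {
    private final String row;
    private final SeqHolder seq;
    SketchKv(String row, SeqHolder seq) { this.row = row; this.seq = seq; }
    String getRow() { return row; }
    long getSequence() { return seq.get(); }
}

final class LateBindingSketch {
    // Assumed placeholder used until the WAL sync yields the real number.
    static final long FAKE_SEQ = Long.MAX_VALUE;

    // Creates one KV per row; every KV shares the same holder instance.
    static List<SketchKv> writeBatch(List<String> rows, SeqHolder shared) {
        List<SketchKv> memstore = new ArrayList<>();
        for (String row : rows) {
            memstore.add(new SketchKv(row, shared));
        }
        return memstore;
    }

    public static void main(String[] args) {
        SeqHolder shared = new SeqHolder(FAKE_SEQ);
        List<SketchKv> memstore = writeBatch(Arrays.asList("r1", "r2"), shared);

        // Simulates the point after the WAL sync: a single set() call
        // updates what every KV in the batch reports.
        shared.set(42L);
        for (SketchKv kv : memstore) {
            System.out.println(kv.getRow() + " -> " + kv.getSequence());
        }
    }
}
```

The point of the design is that the update is one write to the shared holder, 
independent of how many KVs the transaction created.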

If our WAL sync did not use late binding, I would not need MutableLong.

[~enis] is suggesting not using MutableLong, and instead keeping track of all 
new KVs and resetting their MVCC values in an extra loop. This may be hard to 
implement because our pre & post coprocessor hooks copy MVCC values, as in the 
code you pasted above (which I changed to copy the reference):
{code}
newKv.setMvccVersion(kv.getMvccVersionReference());
{code}
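
A rough Java sketch of why the extra-loop approach is harder when a hook 
copies the value. ValueKv and ResetLoopSketch are hypothetical names for 
illustration, not HBase code: a KV created by copying the placeholder number 
is not in the tracked list, so a reset loop over that list misses it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical KV that stores the sequence number by value.
final class ValueKv {
    long seq;
    ValueKv(long seq) { this.seq = seq; }
}

final class ResetLoopSketch {
    // Assumed placeholder used before the real sequence number is known.
    static final long FAKE_SEQ = Long.MAX_VALUE;

    // Resets only the tracked KVs, then counts how many KVs overall still
    // carry the placeholder.
    static int staleAfterReset(List<ValueKv> tracked, List<ValueKv> all,
                               long realSeq) {
        for (ValueKv kv : tracked) {
            kv.seq = realSeq;
        }
        int stale = 0;
        for (ValueKv kv : all) {
            if (kv.seq == FAKE_SEQ) {
                stale++;
            }
        }
        return stale;
    }

    public static void main(String[] args) {
        ValueKv original = new ValueKv(FAKE_SEQ);
        // Simulates a hook that builds a new KV by copying the value.
        ValueKv copied = new ValueKv(original.seq);

        List<ValueKv> tracked = new ArrayList<>();
        tracked.add(original);
        List<ValueKv> all = new ArrayList<>();
        all.add(original);
        all.add(copied);

        // The copy was never tracked, so it keeps the placeholder.
        System.out.println("stale KVs: " + staleAfterReset(tracked, all, 42L));
    }
}
```

Copying the reference instead, as in the one-line change above, avoids the 
need to track every copy.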

My plan is to get all tests passing first and then do the 
enhancements/refactoring that you and [~enis] are suggesting.
 

> [BRAINSTORM] Combine MVCC and SeqId
> -----------------------------------
>
>                 Key: HBASE-8763
>                 URL: https://issues.apache.org/jira/browse/HBASE-8763
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Enis Soztutar
>            Assignee: Jeffrey Zhong
>            Priority: Critical
>         Attachments: hbase-8736-poc.patch, hbase-8763-poc-v1.patch, 
> hbase-8763_wip1.patch
>
>
> HBASE-8701 and a lot of recent issues include good discussions about mvcc + 
> seqId semantics. It seems that having mvcc and the seqId complicates the 
> comparator semantics a lot in regards to flush + WAL replay + compactions + 
> delete markers and out of order puts. 
> Thinking more about it I don't think we need a MVCC write number which is 
> different than the seqId. We can keep the MVCC semantics, read point and 
> smallest read points intact, but combine mvcc write number and seqId. This 
> will allow cleaner semantics + implementation + smaller data files. 
> We can do some brainstorming for 0.98. We still have to verify that this 
> would be semantically correct, it should be so by my current understanding.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
