[
https://issues.apache.org/jira/browse/HBASE-24440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17391685#comment-17391685
]
Andrew Kyle Purtell commented on HBASE-24440:
---------------------------------------------
I have HBASE-25975 working reasonably well in a test environment and unit tests
are passing. Will collect macro benchmark results and report back.
> Prevent temporal misordering on timescales smaller than one clock tick
> ----------------------------------------------------------------------
>
> Key: HBASE-24440
> URL: https://issues.apache.org/jira/browse/HBASE-24440
> Project: HBase
> Issue Type: Brainstorming
> Reporter: Andrew Kyle Purtell
> Assignee: Andrew Kyle Purtell
> Priority: Major
>
> When mutations are sent to the servers without a timestamp explicitly
> assigned by the client, the server will substitute the current wall clock
> time. There are edge cases where it is at least theoretically possible for
> more than one mutation to be committed to a given row within the same clock
> tick. When this happens we have to track and preserve the ordering of these
> mutations in some other way besides the timestamp component of the key. Let
> me bypass most discussion here by noting that whether we do this or not, we
> do not pass such ordering information in the cross cluster replication
> protocol. We also have interesting edge cases regarding key type precedence
> when mutations arrive "simultaneously": we sort deletes ahead of puts. This,
> especially in the presence of replication, can lead to visible anomalies for
> clients able to interact with both source and sink.
> There is a simple solution that removes the possibility that these edge cases
> can occur:
> We can detect, when we are about to commit a mutation to a row, if we have
> already committed a mutation to this same row in the current clock tick.
> Occurrences of this condition will be rare. We are already tracking current
> time. We have to know this in order to assign the timestamp. Where this
> becomes interesting is how we might track the last commit time per row.
> Making the detection of this case efficient for the normal code path is the
> bulk of the challenge. One option is to keep track of the last locked time
> for row locks. (Todo: How would we track and garbage collect this efficiently
> and correctly? Not the ideal option.) We might also do this tracking somehow
> via the memstore. (At least in this case the lifetime and distribution of in
> memory row state, including the proposed timestamps, would align.) Assuming
> we can efficiently know if we are about to commit twice to the same row
> within a single clock tick, we would simply sleep/yield the current thread
> until the clock ticks over, and then proceed.
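
The sleep/yield idea at the end of the description can be sketched roughly as
follows. This is only an illustration under stated assumptions, not HBase code:
RowTickGuard and awaitNextTick are hypothetical names, the caller is assumed to
already hold the row lock (so no two threads call it for the same row at once),
the last commit time is kept in a plain map keyed by row, and the garbage
collection of that map, which the description flags as an open question, is
not addressed. The clock is passed in so the behavior can be checked with a
fake clock.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

/** Hypothetical helper for illustration only; not part of HBase. */
final class RowTickGuard {
    // Last timestamp handed out per row. Entries are never evicted in this sketch.
    private final ConcurrentHashMap<String, Long> lastCommit = new ConcurrentHashMap<>();
    private final LongSupplier clock;

    RowTickGuard(LongSupplier clock) {
        this.clock = clock;
    }

    /**
     * Returns a timestamp strictly greater than the one previously returned
     * for this row, yielding until the clock ticks over if necessary.
     * Assumes the caller holds the row lock for {@code row}.
     */
    long awaitNextTick(String row) {
        Long prev = lastCommit.get(row);
        long now = clock.getAsLong();
        // Rare case: a commit to this row already happened in the current tick.
        while (prev != null && now <= prev) {
            Thread.yield();
            now = clock.getAsLong();
        }
        lastCommit.put(row, now);
        return now;
    }
}
```

With a millisecond clock such as System::currentTimeMillis, the loop runs only
when two commits to the same row land in the same millisecond, and then waits
at most about one tick. Commits to different rows never wait on each other.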