[jira] [Comment Edited] (HBASE-24440) Prevent temporal misordering on timescales smaller than one clock tick

Andrew Kyle Purtell (Jira) Mon, 01 Jun 2020 13:13:09 -0700


    [ 
https://issues.apache.org/jira/browse/HBASE-24440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17121293#comment-17121293
 ]


Andrew Kyle Purtell edited comment on HBASE-24440 at 6/1/20, 8:12 PM:
----------------------------------------------------------------------

I am aware. If we do this I don’t think we will need it at all, configurable or 
not. But that is out of scope for this issue.

Edit: Some might respond, validly, that this is splitting hairs, because one 
follows the other: If we will never have two exact keys including timestamps 
ever committed to a row, then we don't need a sorting rule by operator 
precedence for a case that, after this proposed change, can never happen. I am 
proposing we do it in steps, with small reversible changes, because this is 
such a critical area for correctness, but if the consensus is to do it 
together, I would not oppose that for what it's worth.


was (Author: apurtell):
I am aware. If we do this I don’t think we will need it at all, configurable or 
not. But that is out of scope for this issue.

Edit: Some might respond, validly, that this is splitting hairs, because one 
follows the other: If we will never have two exact keys including timestamps 
ever committed to a row, then we don't need a sorting rule by operator 
precedence. I am proposing we do it in steps, with small reversible changes, 
because this is such a critical area for correctness, but if the consensus is 
to do it together, I would not oppose that for what it's worth.

> Prevent temporal misordering on timescales smaller than one clock tick
> ----------------------------------------------------------------------
>
>                 Key: HBASE-24440
>                 URL: https://issues.apache.org/jira/browse/HBASE-24440
>             Project: HBase
>          Issue Type: Brainstorming
>            Reporter: Andrew Kyle Purtell
>            Priority: Major
>
> When mutations are sent to the servers without a timestamp explicitly 
> assigned by the client, the server will substitute the current wall clock 
> time. There are edge cases where it is at least theoretically possible for 
> more than one mutation to be committed to a given row within the same clock 
> tick. When this happens we have to track and preserve the ordering of these 
> mutations in some other way besides the timestamp component of the key. Let 
> me bypass most discussion here by noting that whether we do this or not, we 
> do not pass such ordering information in the cross cluster replication 
> protocol. We also have interesting edge cases regarding key type precedence 
> when mutations arrive "simultaneously": we sort deletes ahead of puts. This, 
> especially in the presence of replication, can lead to visible anomalies for 
> clients able to interact with both source and sink. 
> There is a simple solution that removes the possibility that these edge cases 
> can occur: 
> We can detect, when we are about to commit a mutation to a row, if we have 
> already committed a mutation to this same row in the current clock tick. 
> Occurrences of this condition will be rare. We are already tracking current 
> time. We have to know this in order to assign the timestamp. Where this 
> becomes interesting is how we might track the last commit time per row. 
> Making the detection of this case efficient for the normal code path is the 
> bulk of the challenge. One option is to keep track of the last locked time 
> for row locks. (Todo: How would we track and garbage collect this efficiently 
> and correctly? Not the ideal option.) We might also do this tracking somehow 
> via the memstore. (At least in this case the lifetime and distribution of in 
> memory row state, including the proposed timestamps, would align.) Assuming 
> we can efficiently know if we are about to commit twice to the same row 
> within a single clock tick, we would simply sleep/yield the current thread 
> until the clock ticks over, and then proceed. 
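
The detect-and-wait idea in the description can be sketched roughly as follows. This is a minimal illustration only, not HBase code: the class RowTickGuard, the method nextCommitTimestamp, the injected clock, and the use of String row keys are all hypothetical. It tracks the last commit time per row in a plain map that is never garbage collected, which is exactly the open problem the description calls out, and it assumes the caller already holds whatever lock serializes commits to the row.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.LockSupport;
import java.util.function.LongSupplier;

/** Hypothetical sketch: never hand out the same timestamp twice for one row. */
final class RowTickGuard {
    // Last timestamp handed out per row. Assumption: String keys for brevity;
    // a real implementation would also need to expire these entries.
    private final ConcurrentHashMap<String, Long> lastCommitTick = new ConcurrentHashMap<>();
    // Clock source, injectable so the behavior can be exercised without real time.
    private final LongSupplier clock;

    RowTickGuard(LongSupplier clock) {
        this.clock = clock;
    }

    /** Returns a timestamp strictly greater than the previous one for this row. */
    long nextCommitTimestamp(String row) {
        while (true) {
            long now = clock.getAsLong();
            Long prev = lastCommitTick.get(row);
            if (prev == null) {
                // First commit seen for this row: no wait needed.
                if (lastCommitTick.putIfAbsent(row, now) == null) {
                    return now;
                }
            } else if (now > prev) {
                // The clock has ticked since the last commit to this row.
                if (lastCommitTick.replace(row, prev, now)) {
                    return now;
                }
            } else {
                // Same tick as the last commit (or the clock has not advanced):
                // park briefly and re-read the clock, as the description proposes.
                LockSupport.parkNanos(100_000L);
            }
        }
    }
}
```

With a millisecond clock this would be constructed as new RowTickGuard(System::currentTimeMillis). Commits to different rows never wait on each other in this sketch; only a second commit to the same row within one tick is delayed.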



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
