[jira] [Comment Edited] (HBASE-24440) Prevent temporal misordering on timescales smaller than one clock tick

Andrew Kyle Purtell (Jira) Mon, 07 Jun 2021 17:48:04 -0700


    [ 
https://issues.apache.org/jira/browse/HBASE-24440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17358964#comment-17358964
 ]


Andrew Kyle Purtell edited comment on HBASE-24440 at 6/8/21, 12:47 AM:
-----------------------------------------------------------------------

{quote}
IIUC the idea here is to maximize the commit throughput as much as possible in 
a single tick by picking from the available pool of disjoint row keys that can 
share the same tick. Let me take a look at the patch..
{quote}
This is the HBASE-25975 subtask. I still need to add a test that proves it even 
works. :-) You might want to come back to that subtask in a bit, when I post an 
update there. 

The basic idea is this: 
- First, atomically check whether the clock has advanced. If it has, clear the 
row set, atomically with respect to that check. 
- Get a reference to the row set for the current tick. Run through all of the 
rows involved in the pending mutation, adding each row to the set. The pending 
mutation is allowed only if none of its rows was already in the set when it 
was added.
- If the pending mutation is allowed to go forward, return the time observed 
at the atomic clock advance check.
- Otherwise, yield the thread and try again. This keeps the handler in queue 
until the mutation's row set is finally disjoint from any other (per the check 
logic described above).
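
The steps above can be sketched in Java roughly as follows. This is only an 
illustration of the described check, not the code in the HBASE-25975 patch: the 
class name TickRowSet, the method names, the String row keys, the -1 "retry" 
return value, and the single monitor lock standing in for the atomic operations 
are all assumptions made for the sketch.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of the per-tick row set check; not the actual patch. */
class TickRowSet {
  // Last clock value observed by the advance check.
  private long currentTick = Long.MIN_VALUE;
  // Rows that have been added during the current tick.
  private final Set<String> rowsThisTick = new HashSet<>();

  /**
   * One attempt of the check. Returns the tick time if the mutation may
   * proceed, or -1 if the caller has to retry. The synchronized keyword makes
   * the clock advance check, the clear, and the adds one atomic step
   * (an assumption: the description only says these steps are atomic).
   */
  synchronized long tryClaim(List<String> rows, long now) {
    if (now > currentTick) {
      // The clock has advanced: start a fresh row set for the new tick.
      currentTick = now;
      rowsThisTick.clear();
    }
    boolean allowed = true;
    for (String row : rows) {
      // Every row is added; the mutation is allowed only if all were new.
      if (!rowsThisTick.add(row)) {
        allowed = false;
      }
    }
    return allowed ? currentTick : -1L;
  }

  /** Yield and retry until the rows are admitted; returns the tick time. */
  long awaitTick(List<String> rows) {
    while (true) {
      long tick = tryClaim(rows, System.currentTimeMillis());
      if (tick >= 0) {
        return tick;
      }
      Thread.yield();
    }
  }
}
```

In this sketch a rejected mutation still leaves its rows in the set, because 
every row is added regardless of the outcome. A rejected caller therefore only 
succeeds once the clock has advanced and the set has been cleared.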


was (Author: apurtell):
{quote}
IIUC the idea here is to maximize the commit throughput as much as possible in 
a single tick by picking from the available pool of disjoint row keys that can 
share the same tick. Let me take a look at the patch..
{quote}
This is the HBASE-25975 subtask. I still need to add a test that proves it even 
works. :-) You might want to come back to that subtask in a bit, when I post an 
update there. 

The basic idea is this: 
- First, check if the clock has advanced, atomically. If it has, then clear the 
row set, also atomically, with respect to the clock advance check. 
- Run through all of the rows involved in the pending mutation. The check adds 
each row to the set, but allows the pending mutation as long as each row did 
not exist in the set when it was added.
- If the pending mutation is going to be allowed to go forward, return the time 
where we did the atomic clock advance check.
- Otherwise, yield the thread, and try again, which will keep the handler in 
queue until the row set is finally disjoint from any other (per the check logic 
described above)

> Prevent temporal misordering on timescales smaller than one clock tick
> ----------------------------------------------------------------------
>
>                 Key: HBASE-24440
>                 URL: https://issues.apache.org/jira/browse/HBASE-24440
>             Project: HBase
>          Issue Type: Brainstorming
>            Reporter: Andrew Kyle Purtell
>            Assignee: Andrew Kyle Purtell
>            Priority: Major
>             Fix For: 3.0.0-alpha-1, 2.5.0
>
>
> When mutations are sent to the servers without a timestamp explicitly 
> assigned by the client, the server will substitute the current wall clock 
> time. There are edge cases where it is at least theoretically possible for 
> more than one mutation to be committed to a given row within the same clock 
> tick. When this happens we have to track and preserve the ordering of these 
> mutations in some other way besides the timestamp component of the key. Let 
> me bypass most discussion here by noting that whether we do this or not, we 
> do not pass such ordering information in the cross cluster replication 
> protocol. We also have interesting edge cases regarding key type precedence 
> when mutations arrive "simultaneously": we sort deletes ahead of puts. This, 
> especially in the presence of replication, can lead to visible anomalies for 
> clients able to interact with both source and sink. 
> There is a simple solution that removes the possibility that these edge cases 
> can occur: 
> We can detect, when we are about to commit a mutation to a row, if we have 
> already committed a mutation to this same row in the current clock tick. 
> Occurrences of this condition will be rare. We are already tracking current 
> time. We have to know this in order to assign the timestamp. Where this 
> becomes interesting is how we might track the last commit time per row. 
> Making the detection of this case efficient for the normal code path is the 
> bulk of the challenge. One option is to keep track of the last locked time 
> for row locks. (Todo: How would we track and garbage collect this efficiently 
> and correctly? Not the ideal option.) We might also do this tracking somehow 
> via the memstore. (At least in this case the lifetime and distribution of in 
> memory row state, including the proposed timestamps, would align.) Assuming 
> we can efficiently know if we are about to commit twice to the same row 
> within a single clock tick, we would simply sleep/yield the current thread 
> until the clock ticks over, and then proceed. 
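
As a rough illustration of the simple solution in the description (detect a 
second commit to the same row within one clock tick, then yield until the clock 
ticks over), here is a minimal Java sketch. The class LastCommitTracker, its 
method nextTimestamp, the String row keys, and the injected clock are all 
hypothetical. The map is never garbage collected, which is exactly the open 
problem the description raises, so this shows the detection logic only.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

/** Hypothetical sketch: remember the last commit time per row. */
class LastCommitTracker {
  // Row key -> timestamp of the last commit. Never evicted in this sketch.
  private final ConcurrentHashMap<String, Long> lastCommit = new ConcurrentHashMap<>();
  private final LongSupplier clock;

  LastCommitTracker(LongSupplier clock) {
    this.clock = clock;
  }

  /** Returns a timestamp strictly greater than the row's previous one. */
  long nextTimestamp(String row) {
    while (true) {
      long now = clock.getAsLong();
      Long prev = lastCommit.get(row);
      boolean claimed;
      if (prev == null) {
        // First commit seen for this row; lose the race if another thread won.
        claimed = lastCommit.putIfAbsent(row, now) == null;
      } else {
        // Only proceed if the clock has moved past the previous commit.
        claimed = prev < now && lastCommit.replace(row, prev, now);
      }
      if (claimed) {
        return now;
      }
      // Same tick (or lost a race): yield until the clock ticks over.
      Thread.yield();
    }
  }
}
```

A caller would construct it with a real clock, for example 
new LastCommitTracker(System::currentTimeMillis), and use the returned value 
as the mutation timestamp.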



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
