[ https://issues.apache.org/jira/browse/HBASE-24440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17116934#comment-17116934 ]
Andrew Kyle Purtell edited comment on HBASE-24440 at 5/26/20, 5:55 PM:
-----------------------------------------------------------------------

Note this is a trick Apache Phoenix already uses to ensure uniqueness of timestamps for indexes. /cc [~gjacoby]


was (Author: apurtell):
Note this is a trick Apache Phoenix uses to ensure uniqueness of timestamps for indexes.

> Prevent temporal misordering on timescales smaller than one clock tick
> ----------------------------------------------------------------------
>
>                 Key: HBASE-24440
>                 URL: https://issues.apache.org/jira/browse/HBASE-24440
>             Project: HBase
>          Issue Type: Brainstorming
>            Reporter: Andrew Kyle Purtell
>            Priority: Major
>
> When mutations are sent to the servers without a timestamp explicitly
> assigned by the client the server will substitute the current wall clock
> time. There are edge cases where it is at least theoretically possible for
> more than one mutation to be committed to a given row within the same clock
> tick. When this happens we have to track and preserve the ordering of these
> mutations in some other way besides the timestamp component of the key. Let
> me bypass most discussion here by noting that whether we do this or not, we
> do not pass such ordering information in the cross cluster replication
> protocol. We also have interesting edge cases regarding key type precedence
> when mutations arrive "simultaneously": we sort deletes ahead of puts. This,
> especially in the presence of replication, can lead to visible anomalies for
> clients able to interact with both source and sink.
> There is a simple solution that removes the possibility that these edge cases
> can occur:
> We can detect, when we are about to commit a mutation to a row, if we have
> already committed a mutation to this same row in the current clock tick.
> Occurrences of this condition will be rare. We are already tracking current
> time. We have to know this in order to assign the timestamp. Where this
> becomes interesting is how we might track the last commit time per row.
> Making the detection of this case efficient for the normal code path is the
> bulk of the challenge. We would do this somehow via the memstore. Assuming we
> can efficiently know if we are about to commit twice to the same row within a
> single clock tick, we would simply sleep/yield the current thread until the
> clock ticks over, and then proceed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
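
For illustration only, the detect-and-yield idea in the description could be sketched roughly as below. This is not HBase code: the class RowTickGuard, its method assignTimestamp, and the use of a plain ConcurrentHashMap keyed by a String row key are all hypothetical stand-ins. The description says the real tracking would happen "somehow via the memstore" and leaves that part open, so the map here only marks where such tracking would plug in. The clock is assumed to have millisecond ticks.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

/**
 * Hypothetical sketch: hands out a server-side timestamp for a row and
 * guarantees that no two calls for the same row return the same tick.
 */
final class RowTickGuard {
    // Last timestamp committed per row. An unbounded map is an assumption
    // made for brevity; efficient tracking is the open problem in the issue.
    private final ConcurrentHashMap<String, Long> lastCommit = new ConcurrentHashMap<>();
    private final LongSupplier clock;

    RowTickGuard() {
        this(System::currentTimeMillis);
    }

    RowTickGuard(LongSupplier clock) {
        this.clock = clock;
    }

    /** Returns a timestamp strictly greater than the previous one for rowKey. */
    long assignTimestamp(String rowKey) {
        while (true) {
            long now = clock.getAsLong();
            Long prev = lastCommit.get(rowKey);
            if (prev != null && now <= prev) {
                // Same row already committed in this tick: yield until the
                // clock ticks over, then try again.
                Thread.yield();
                continue;
            }
            // Claim the tick atomically so a concurrent writer to the same
            // row cannot claim it too.
            boolean claimed = (prev == null)
                ? lastCommit.putIfAbsent(rowKey, now) == null
                : lastCommit.replace(rowKey, prev, now);
            if (claimed) {
                return now;
            }
            // Lost the race to another writer; loop and re-check.
        }
    }
}
```

In this sketch, calling assignTimestamp("row1") twice in quick succession makes the second call spin briefly until the wall clock advances, while calls for different rows never wait on each other. Whether yielding, sleeping, or some other wait is appropriate on the real write path is a design question the issue leaves open.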