Time travel reads in Kudu

Ananth G Sat, 17 Jun 2017 13:41:00 -0700

Hello All,

I was wondering if the following is possible as a time travel read in Kudu.


Assume T stands for the timestamp at which a record was committed. I have 
one insert for a given row at T1, followed by three updates at timestamps 
T2, T3 and T4. Finally, the row was deleted at T5 (T1 < T2 < T3 < T4 < T5). 
Representing the values of this row as V, the row goes through the 
following states: 

T1 -> V1 ( original insert )
T2 -> V2 ( first update )
T3 -> V3 ( second update ) 
T4 -> V4 ( third update )
T5 -> V5 ( Tombstone/delete ) 

Now I want to perform a read scan. I am using the READ_AT_SNAPSHOT mode and 
the setSnapShotMicros() method to perform the read at a given snapshot. I was 
wondering whether I would get the following values if I set the snapshot 
times as follows: 

1. Can I get value V2 if I set the snapshot time to t2, where T2 < t2 < T3? 
2. Can I get value V3 if I set the snapshot time to t3, where T3 < t3 < T4? 
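
To make the semantics I am asking about concrete, here is a minimal toy 
model in Python. It is not Kudu code and says nothing about how Kudu 
actually behaves; the names HISTORY, TOMBSTONE and read_at_snapshot are 
hypothetical, and the integer timestamps are made up. It assumes the usual 
MVCC rule that a snapshot read sees the latest version committed at or 
before the snapshot time.

```python
from bisect import bisect_right

TOMBSTONE = object()  # hypothetical marker for the delete at T5

# Made-up commit timestamps (T1 < T2 < T3 < T4 < T5) and the row's values.
HISTORY = [
    (10, "V1"),       # T1: original insert
    (20, "V2"),       # T2: first update
    (30, "V3"),       # T3: second update
    (40, "V4"),       # T4: third update
    (50, TOMBSTONE),  # T5: delete
]


def read_at_snapshot(history, snapshot_ts):
    """Return the value visible at snapshot_ts, or None if no row is visible."""
    commit_times = [ts for ts, _ in history]
    # Index of the last version committed at or before the snapshot time.
    idx = bisect_right(commit_times, snapshot_ts) - 1
    if idx < 0:
        return None  # snapshot is earlier than the original insert
    value = history[idx][1]
    return None if value is TOMBSTONE else value
```

With these made-up timestamps, read_at_snapshot(HISTORY, 25) corresponds to 
case 1 above and read_at_snapshot(HISTORY, 35) to case 2. Whether Kudu 
retains enough history to answer such reads is exactly what I am asking.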

Also, it seems that for this to work properly we would need two timestamps 
as part of the API call (a lower and an upper bound) to retrieve value V2. 
The use of the term MVCC is interesting, hence this question. 

In other words, when we say that Kudu offers MVCC for its data, is that for 
all versions of a data mutation or just for the reconciliation stage? I am 
assuming it is only for the last stage of reconciliation (i.e. until reads 
are fully committed). Since timestamps in Kudu seem to serve as lower-bound 
markers, the above might not be possible, but I wanted to check with the 
community. 

If it is possible, does the model still hold after a compaction is performed? 


Regards,
Ananth
