Re: Time travel reads in Kudu

Ananth Gundabattula Mon, 19 Jun 2017 03:50:28 -0700

Thanks a lot Todd. The gist clarifies the point. 

I think this is a really powerful feature of Kudu that is being undersold :). 
Some very interesting data analytics can be performed on top of the Kudu store 
with these features! As an example, use cases like the one described at 
http://pachyderm.io/pfs.html are now possible with Kudu as well.


Thanks for the clarification. I hope to integrate this feature into the Apache 
Apex read scanner mechanisms soon. 

Regards,
Ananth

> On 19 Jun 2017, at 7:36 am, Todd Lipcon <[email protected]> wrote:
> 
> Just to illustrate, I wrote a quick python script that shows the behavior: 
> https://gist.github.com/toddlipcon/385fcf4211f83e4968be3401db3147ba
> 
> The script runs your scenario of insert, update, update, delete, and then 
> scans at each of the times between the operations. The output on my machine 
> (running against a local tserver) is:
> scan at datetime.datetime(2017, 6, 18, 21, 35, 20, 594427): [(1, 'v1')]
> scan at datetime.datetime(2017, 6, 18, 21, 35, 20, 595743): [(1, 'v2')]
> scan at datetime.datetime(2017, 6, 18, 21, 35, 20, 597093): [(1, 'v3')]
> scan at datetime.datetime(2017, 6, 18, 21, 35, 20, 598470): []
> 
> Note that this example script is relying on the local clock instead of the 
> propagated timestamps, so it might not work correctly against a cluster (the 
> server side may have clock skew relative to the local machine where the 
> script is running). If you need it to work including clock skew, you'll have 
> to use the more advanced APIs to retrieve propagated timestamps from the 
> server side after each write.
> 
> -Todd
> 
> 
> On Sun, Jun 18, 2017 at 1:36 PM, Todd Lipcon <[email protected]> wrote:
> Hi Ananth,
> 
> Answers inline below
> 
> On Sat, Jun 17, 2017 at 1:40 PM, Ananth G <[email protected]> wrote:
> Hello All,
> 
> I was wondering if the following is possible as a time travel read in Kudu.
> 
> Assuming T stands for the timestamp at which the record has been committed, I 
> have one insert for a given row at T1, followed by 3 updates at timestamps 
> T2, T3 and T4. Finally, the row was deleted at T5 (T1 < T2 < T3 < T4 < T5 in 
> terms of timestamps). Representing the values of this row as V, the 
> following is the state of the row over time.
> 
> T1 -> V1 ( original insert )
> T2 -> V2 ( first update )
> T3 -> V3 ( second update )
> T4 -> V4 ( third update )
> T5 -> V5 ( Tombstone/delete )
> 
> Now I want to perform a read scan. I am using the READ_AT_SNAPSHOT mode and 
> using setSnapShotMicros()  method to perform the read at that snapshot. I was 
> wondering if I would have the flexibility to get the following values 
> provided I am using the snapshot times as follows :
> 
> 1. Can I get value V2 if I set snapshot time as t2 provided T2 < t2 < T3 ?
> 
> yes
> 
> 2. Can I get value V3 if I set snapshot time as t3 provided T3 < t3 < T4 ?
> 
> yes
> 
> Also, it seems obvious that for this to work properly we will need two 
> timestamps as part of the API call (a lower and an upper bound) to retrieve 
> value V2. The usage of the word MVCC is interesting and hence this question.
> 
> I'm not following what you mean by a lower and upper bound timestamp? The 
> READ_AT_SNAPSHOT setting means that you read the state of the table exactly 
> as it was at the provided time. So, if you provide a time in between T2 and 
> T3, you will see the value that was most recently committed before the 
> specified time (i.e. the value written at T2).
> 
> 
>  
> 
> In other words, when we say Kudu has a MVCC style for data as an asset; is it 
> for all versions of the data mutation or just for the reconciliation stage ? 
> I am assuming it is only for the last stage of reconciliation ( i.e. until 
> reads are fully committed ). Since timestamps in Kudu seem to be for the 
> lower bound markers, the above might not be possible but wanted to check with 
> the community.
> 
> It stores all history for a configurable amount of time 
> (--tablet_history_max_age_sec, default 15 minutes). You can bump this to a 
> longer amount of time.
>  
> 
> If it is otherwise, does the model still hold after a compaction is performed?
> 
> 
> Yes, as of version 1.2 (I think) the full history is properly retained 
> regardless of any compactions, etc, subject to the above mentioned history 
> limit.
> 
> -Todd
> 
> 
> -- 
> Todd Lipcon
> Software Engineer, Cloudera
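
The snapshot semantics described in the thread (a read at time t returns the 
value most recently committed before t, a delete hides the row, and history is 
only kept for a limited window) can be illustrated with a small toy model. 
This is only a sketch in plain Python and not Kudu's implementation or client 
API: the VersionedRow class, its method names, the integer timestamps and the 
oldest_retained_ts parameter are all hypothetical. Two details are assumptions 
of the sketch rather than statements about Kudu: a snapshot exactly equal to a 
commit timestamp is treated as seeing that write, and a snapshot older than 
the retained history is rejected with an error.

```python
import bisect
from typing import Any, List, Optional

# Hypothetical marker for a delete; not part of any Kudu API.
TOMBSTONE = object()


class VersionedRow:
    """Toy model of one row's history, kept in commit-timestamp order."""

    def __init__(self) -> None:
        self._timestamps: List[int] = []
        self._values: List[Any] = []

    def write(self, commit_ts: int, value: Any) -> None:
        """Record an insert or update committed at commit_ts."""
        if self._timestamps and commit_ts <= self._timestamps[-1]:
            raise ValueError("commit timestamps must be strictly increasing")
        self._timestamps.append(commit_ts)
        self._values.append(value)

    def delete(self, commit_ts: int) -> None:
        """Record a delete as a tombstone version."""
        self.write(commit_ts, TOMBSTONE)

    def read_at(self, snapshot_ts: int, oldest_retained_ts: int = 0) -> Optional[Any]:
        """Return the value visible at snapshot_ts, or None if no row is visible.

        oldest_retained_ts stands in for the edge of the history window
        (assumption: reads older than that are refused).
        """
        if snapshot_ts < oldest_retained_ts:
            raise ValueError("snapshot is older than the retained history")
        # Index of the first version committed strictly after the snapshot.
        idx = bisect.bisect_right(self._timestamps, snapshot_ts)
        if idx == 0:
            return None  # the row had not been inserted yet
        value = self._values[idx - 1]
        return None if value is TOMBSTONE else value


# The scenario from the thread, with T1..T5 chosen as 10, 20, 30, 40, 50.
row = VersionedRow()
row.write(10, "V1")   # T1: insert
row.write(20, "V2")   # T2: first update
row.write(30, "V3")   # T3: second update
row.write(40, "V4")   # T4: third update
row.delete(50)        # T5: delete

print(row.read_at(25))  # T2 < t2 < T3, prints V2
print(row.read_at(35))  # T3 < t3 < T4, prints V3
print(row.read_at(55))  # after the delete, prints None
```

Only a single snapshot timestamp is needed per read in this model: the lower 
bound is implied by the latest version at or before the snapshot, which 
matches the point made above that no separate lower/upper bound pair is 
required.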
