Thanks very much David and Andrew. Yes I'm aware this functionality is available via the java and C++ clients, but actually what I'm asking is if it could be made available via SQL/impala. Something like "select X from Y where snapshot_micros = 2343242423" (where snapshot_micros is a virtual column that would need a better name), or perhaps as part of the table name like "select X from Y@2343242423". -m
On Tue, Nov 28, 2017 at 12:05 PM, David Alves <[email protected]> wrote: > Hi Mauricio > > Andrew is right. That feature already exists in some form. With > READ_AT_SNAPSHOT you can provide a timestamp which will be the timepoint > under which all the scans are performed. > Note that, while generally supported and functionally tested, we haven't > focused a lot of resources into testing this, so your performance mileage > may vary. > In order to enable this for time points more than 5 mins in the past you > need to increase the "--tablet_history_max_age_sec" flag so that the > history won't get garbage collected. > > HTH > -david > > On Mon, Nov 27, 2017 at 9:42 PM, Andrew Wong <[email protected]> wrote: > >> Hi Mauricio, >> >> If you haven't already, take a look at the READ_AT_SNAPSHOT read mode >> (more info here >> <https://kudu.apache.org/docs/transaction_semantics.html#_read_operations_scans>). >> IIUC, it seems similar to, if not exactly what you're looking for! >> >> >> Andrew >> >> On Mon, Nov 27, 2017 at 5:02 PM, Mauricio Aristizabal < >> [email protected]> wrote: >> >>> Hi all, has there been any talk of supporting this any time soon? >>> >>> Time travel reads are such a cool feature, but even more than in ETL >>> jobs (via Java/Scala), they would be most useful via SQL to ensure >>> consistency when reading. >>> >>> Specifically, for example our spark streaming job updates dozens of >>> aggregation tables every 30 seconds. To make the data fully consistent we >>> would love to have views over these aggs tagged with the exact timestamp we >>> want to expose. When each batch is done and all tables updated, we would >>> update all the views forward, effectively hiding the updates we're doing >>> until they're all ready. >>> >>> -m >>> >>> >>> >>> -- >>> *MAURICIO ARISTIZABAL* >>> Architect - Business Intelligence + Data Science >>> [email protected](m)+1 323 309 4260 <(323)%20309-4260> >>> 223 E. De La Guerra St. | Santa Barbara, CA 93101 >>> <https://maps.google.com/?q=223+E.+De+La+Guerra+St.+%7C+Santa+Barbara,+CA+93101&entry=gmail&source=g> >>> >>> Overview <http://www.impactradius.com/?src=slsap> | Twitter >>> <https://twitter.com/impactradius> | Facebook >>> <https://www.facebook.com/pages/Impact-Radius/153376411365183> | >>> LinkedIn <https://www.linkedin.com/company/impact-radius-inc-> >>> >> >> >> >> -- >> Andrew Wong >> > > -- *MAURICIO ARISTIZABAL* Architect - Business Intelligence + Data Science [email protected](m)+1 323 309 4260 223 E. De La Guerra St. | Santa Barbara, CA 93101 Overview <http://www.impactradius.com/?src=slsap> | Twitter <https://twitter.com/impactradius> | Facebook <https://www.facebook.com/pages/Impact-Radius/153376411365183> | LinkedIn <https://www.linkedin.com/company/impact-radius-inc->
