Thanks Andrew. However if I'm understanding the low level details in that ticket I don't think it's what I'm asking either, at least not entirely: KUDU-1702 is for exposing point-in-time reads to Impala DEVELOPERS, i'm hoping it can be exposed to Impala USERS via a SQL extension, so we can leverage it from user queries/reports. -m
On Tue, Nov 28, 2017 at 2:09 PM, Andrew Wong <aw...@cloudera.com> wrote: > Ah I see, looks like this is an open, tracked issue. See KUDU-1702 > <https://issues.apache.org/jira/browse/KUDU-1702?jql=project%20%3D%20KUDU%20AND%20resolution%20%3D%20Unresolved%20AND%20text%20~%20%22read_at_snapshot%22%20ORDER%20BY%20priority%20DESC%2C%20updated%20DESC>, > although not sure what the status on it is. > > On Tue, Nov 28, 2017 at 12:17 PM, Mauricio Aristizabal < > mauri...@impactradius.com> wrote: > >> Thanks very much David and Andrew. Yes I'm aware this functionality is >> available via the java and C++ clients, but actually what I'm asking is if >> it could be made available via SQL/impala. Something like "select X from Y >> where snapshot_micros = 2343242423 <(234)%20324-2423>" (where >> snapshot_micros is a virtual column that would need a better name), or >> perhaps as part of the table name like "select X from Y@2343242423 >> <(234)%20324-2423>". -m >> >> On Tue, Nov 28, 2017 at 12:05 PM, David Alves <davidral...@gmail.com> >> wrote: >> >>> Hi Mauricio >>> >>> Andrew is right. That feature already exists in some form. With >>> READ_AT_SNAPSHOT you can provide a timestamp which will be the timepoint >>> under which all the scans are performed. >>> Note that, while generally supported and functionally tested, we >>> haven't focused a lot of resources into testing this, so your performance >>> mileage may vary. >>> In order to enable this for time points more than 5 mins in the past >>> you need to increase the "--tablet_history_max_age_sec" flag so that >>> the history won't get garbage collected. >>> >>> HTH >>> -david >>> >>> On Mon, Nov 27, 2017 at 9:42 PM, Andrew Wong <aw...@cloudera.com> wrote: >>> >>>> Hi Mauricio, >>>> >>>> If you haven't already, take a look at the READ_AT_SNAPSHOT read mode >>>> (more info here >>>> <https://kudu.apache.org/docs/transaction_semantics.html#_read_operations_scans>). >>>> IIUC, it seems similar to, if not exactly what you're looking for! >>>> >>>> >>>> Andrew >>>> >>>> On Mon, Nov 27, 2017 at 5:02 PM, Mauricio Aristizabal < >>>> mauri...@impactradius.com> wrote: >>>> >>>>> Hi all, has there been any talk of supporting this any time soon? >>>>> >>>>> Time travel reads are such a cool feature, but even more than in ETL >>>>> jobs (via Java/Scala), they would be most useful via SQL to ensure >>>>> consistency when reading. >>>>> >>>>> Specifically, for example our spark streaming job updates dozens of >>>>> aggregation tables every 30 seconds. To make the data fully consistent we >>>>> would love to have views over these aggs tagged with the exact timestamp >>>>> we >>>>> want to expose. When each batch is done and all tables updated, we would >>>>> update all the views forward, effectively hiding the updates we're doing >>>>> until they're all ready. >>>>> >>>>> -m >>>>> >>>>> >>>>> >>>>> -- >>>>> *MAURICIO ARISTIZABAL* >>>>> Architect - Business Intelligence + Data Science >>>>> mauri...@impactradius.com(m)+1 323 309 4260 <(323)%20309-4260> >>>>> 223 E. De La Guerra St. | Santa Barbara, CA 93101 >>>>> <https://maps.google.com/?q=223+E.+De+La+Guerra+St.+%7C+Santa+Barbara,+CA+93101&entry=gmail&source=g> >>>>> >>>>> Overview <http://www.impactradius.com/?src=slsap> | Twitter >>>>> <https://twitter.com/impactradius> | Facebook >>>>> <https://www.facebook.com/pages/Impact-Radius/153376411365183> | >>>>> LinkedIn <https://www.linkedin.com/company/impact-radius-inc-> >>>>> >>>> >>>> >>>> >>>> -- >>>> Andrew Wong >>>> >>> >>> >> >> >> -- >> *MAURICIO ARISTIZABAL* >> Architect - Business Intelligence + Data Science >> mauri...@impactradius.com(m)+1 323 309 4260 <(323)%20309-4260> >> 223 E. De La Guerra St. | Santa Barbara, CA 93101 >> <https://maps.google.com/?q=223+E.+De+La+Guerra+St.+%7C+Santa+Barbara,+CA+93101&entry=gmail&source=g> >> >> Overview <http://www.impactradius.com/?src=slsap> | Twitter >> <https://twitter.com/impactradius> | Facebook >> <https://www.facebook.com/pages/Impact-Radius/153376411365183> | LinkedIn >> <https://www.linkedin.com/company/impact-radius-inc-> >> > > > > -- > Andrew Wong > -- *MAURICIO ARISTIZABAL* Architect - Business Intelligence + Data Science mauri...@impactradius.com(m)+1 323 309 4260 223 E. De La Guerra St. | Santa Barbara, CA 93101 Overview <http://www.impactradius.com/?src=slsap> | Twitter <https://twitter.com/impactradius> | Facebook <https://www.facebook.com/pages/Impact-Radius/153376411365183> | LinkedIn <https://www.linkedin.com/company/impact-radius-inc->