Mich, time travel will use the newly added VERSION AS OF or TIMESTAMP AS OF syntax.
On Tue, Nov 16, 2021 at 12:40 AM Mich Talebzadeh <mich.talebza...@gmail.com> wrote: > As I stated before, hints are designed to direct the optimizer to choose > a certain query execution plan based on the specific criteria. > > > -- time travel > SELECT * FROM t /*+ OPTIONS('snapshot-id'='10963874102873L') */ > > > The alternative would be to specify time travel by creating a snapshot > based on CURRENT_DATE() range which encapsulates time travel for > 'snapshot-id'='10963874102873L' > > > CREATE SNAPSHOT t_snap > > START WITH CURRENT_DATE() - 30 > > NEXT CURRENT_DATE() > > AS SELECT * FROM t > > > SELECT * FROM t_snap > > > HTH > > > view my Linkedin profile > <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/> > > > > *Disclaimer:* Use it at your own risk. Any and all responsibility for any > loss, damage or destruction of data or any other property which may arise > from relying on this email's technical content is explicitly disclaimed. > The author will in no case be liable for any monetary damages arising from > such loss, damage or destruction. > > > > > On Tue, 16 Nov 2021 at 04:26, Hyukjin Kwon <gurwls...@gmail.com> wrote: > >> My biggest concern with the syntax in hints is that Spark SQL's options >> can change results (e.g., CSV's header options) whereas hints are generally >> not designed to affect the external results if I am not mistaken. This is >> counterintuitive. >> I left the comment in the PR but what's the real benefit over leveraging: >> SET conf and RESET conf? we can extract options from runtime session >> configurations e.g., SessionConfigSupport. >> >> On Tue, 16 Nov 2021 at 04:30, Nicholas Chammas < >> nicholas.cham...@gmail.com> wrote: >> >>> Side note about time travel: There is a PR >>> <https://github.com/apache/spark/pull/34497> to add VERSION/TIMESTAMP >>> AS OF syntax to Spark SQL. >>> >>> On Mon, Nov 15, 2021 at 2:23 PM Ryan Blue <b...@tabular.io> wrote: >>> >>>> I want to note that I wouldn't recommend time traveling this way by >>>> using the hint for `snapshot-id`. Instead, we want to add the standard SQL >>>> syntax for that in a separate PR. This is useful for other options that >>>> help a table scan perform better, like specifying the target split size. >>>> >>>> You're right that this isn't a typical optimizer hint, but I'm not sure >>>> what other syntax is possible for this use case. How else would we send >>>> custom properties through to the scan? >>>> >>>> On Mon, Nov 15, 2021 at 9:25 AM Mich Talebzadeh < >>>> mich.talebza...@gmail.com> wrote: >>>> >>>>> I am looking at the hint and it appears to me (I stand corrected), it >>>>> is a single table hint as below: >>>>> >>>>> -- time travel >>>>> SELECT * FROM t /*+ OPTIONS('snapshot-id'='10963874102873L') */ >>>>> >>>>> My assumption is that any view on this table will also benefit from >>>>> this hint. This is not a hint to optimizer in a classical sense. Only a >>>>> snapshot hint. Normally, a hint is an instruction to the optimizer. >>>>> When writing SQL, one may know information about the data unknown to the >>>>> optimizer. Hints enable one to make decisions normally made by the >>>>> optimizer, sometimes causing the optimizer to select a plan that it sees >>>>> as >>>>> higher cost. >>>>> >>>>> >>>>> So far as this case is concerned, it looks OK and I concur it should >>>>> be extended to write as well. >>>>> >>>>> >>>>> HTH >>>>> >>>>> >>>>> view my Linkedin profile >>>>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/> >>>>> >>>>> >>>>> >>>>> *Disclaimer:* Use it at your own risk. Any and all responsibility for >>>>> any loss, damage or destruction of data or any other property which may >>>>> arise from relying on this email's technical content is explicitly >>>>> disclaimed. The author will in no case be liable for any monetary damages >>>>> arising from such loss, damage or destruction. >>>>> >>>>> >>>>> >>>>> >>>>> On Mon, 15 Nov 2021 at 17:02, Russell Spitzer < >>>>> russell.spit...@gmail.com> wrote: >>>>> >>>>>> I think since we probably will end up using this same syntax on >>>>>> write, this makes a lot of sense. Unless there is another good way to >>>>>> express a similar concept during a write operation I think going forward >>>>>> with this would be ok. >>>>>> >>>>>> On Mon, Nov 15, 2021 at 10:44 AM Ryan Blue <b...@tabular.io> wrote: >>>>>> >>>>>>> The proposed feature is to be able to pass options through SQL like >>>>>>> you would when using the DataFrameReader API, so it would work for >>>>>>> all sources that support read options. Read options are part of the DSv2 >>>>>>> API, there just isn’t a way to pass options when using SQL. The PR also >>>>>>> has >>>>>>> a non-Iceberg example, which is being able to customize some JDBC source >>>>>>> behaviors per query (e.g., fetchSize), rather than globally in the >>>>>>> table’s >>>>>>> options. >>>>>>> >>>>>>> The proposed syntax is odd, but I think that's an artifact of Spark >>>>>>> introducing read options that aren't a normal part of SQL. Seems >>>>>>> reasonable >>>>>>> to me to pass them through a hint. >>>>>>> >>>>>>> On Mon, Nov 15, 2021 at 2:18 AM Mich Talebzadeh < >>>>>>> mich.talebza...@gmail.com> wrote: >>>>>>> >>>>>>>> Interesting. >>>>>>>> >>>>>>>> What is this going to add on top of support for Apache Iceberg >>>>>>>> <https://www.dremio.com/data-lake/apache-iceberg/>. Will it be in >>>>>>>> line with support for Hive ACID tables or Delta Lake? >>>>>>>> >>>>>>>> HTH >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> view my Linkedin profile >>>>>>>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> *Disclaimer:* Use it at your own risk. Any and all responsibility >>>>>>>> for any loss, damage or destruction of data or any other property >>>>>>>> which may >>>>>>>> arise from relying on this email's technical content is explicitly >>>>>>>> disclaimed. The author will in no case be liable for any monetary >>>>>>>> damages >>>>>>>> arising from such loss, damage or destruction. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Mon, 15 Nov 2021 at 01:56, Zhun Wang <wangzhun6...@gmail.com> >>>>>>>> wrote: >>>>>>>> >>>>>>>>> Hi dev, >>>>>>>>> >>>>>>>>> We are discussing Support Dynamic Table Options for Spark SQL ( >>>>>>>>> https://github.com/apache/spark/pull/34072). It is currently not >>>>>>>>> sure if the syntax makes sense, and would like to know if there is >>>>>>>>> other >>>>>>>>> feedback or opinion on this. >>>>>>>>> >>>>>>>>> I would appreciate any feedback on this. >>>>>>>>> >>>>>>>>> Thanks. >>>>>>>>> >>>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Ryan Blue >>>>>>> Tabular >>>>>>> >>>>>> >>>> >>>> -- >>>> Ryan Blue >>>> Tabular >>>> >>> -- Ryan Blue Tabular