Side note about time travel: There is a PR <https://github.com/apache/spark/pull/34497> to add VERSION/TIMESTAMP AS OF syntax to Spark SQL.
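For context, a sketch of what that standard syntax would look like. This is an assumption based on the PR's title, not a confirmed grammar — the exact clause forms and literal formats may differ in the merged version:

```sql
-- Sketch of the VERSION/TIMESTAMP AS OF syntax proposed in that PR.
-- The snapshot id and timestamp literal below are illustrative values only.
SELECT * FROM t VERSION AS OF 10963874102873;
SELECT * FROM t TIMESTAMP AS OF '2021-11-15 00:00:00';
```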
On Mon, Nov 15, 2021 at 2:23 PM Ryan Blue <b...@tabular.io> wrote:

> I want to note that I wouldn't recommend time traveling this way, by using
> the hint for `snapshot-id`. Instead, we want to add the standard SQL syntax
> for that in a separate PR. The hint is useful for other options that help a
> table scan perform better, like specifying the target split size.
>
> You're right that this isn't a typical optimizer hint, but I'm not sure
> what other syntax is possible for this use case. How else would we send
> custom properties through to the scan?
>
> On Mon, Nov 15, 2021 at 9:25 AM Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>
>> I am looking at the hint, and it appears to me (I stand to be corrected)
>> that it is a single-table hint, as below:
>>
>> -- time travel
>> SELECT * FROM t /*+ OPTIONS('snapshot-id'='10963874102873L') */
>>
>> My assumption is that any view on this table will also benefit from this
>> hint. This is not a hint to the optimizer in the classical sense — only a
>> snapshot hint. Normally, a hint is an instruction to the optimizer: when
>> writing SQL, one may know information about the data that is unknown to
>> the optimizer. Hints let one make decisions normally made by the
>> optimizer, sometimes causing the optimizer to select a plan that it sees
>> as higher cost.
>>
>> As far as this case is concerned, it looks OK, and I concur that it
>> should be extended to writes as well.
>>
>> HTH
>>
>> view my LinkedIn profile
>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>
>> *Disclaimer:* Use it at your own risk. Any and all responsibility for
>> any loss, damage or destruction of data or any other property which may
>> arise from relying on this email's technical content is explicitly
>> disclaimed. The author will in no case be liable for any monetary damages
>> arising from such loss, damage or destruction.
>> On Mon, 15 Nov 2021 at 17:02, Russell Spitzer <russell.spit...@gmail.com> wrote:
>>
>>> I think that, since we will probably end up using this same syntax on
>>> write, this makes a lot of sense. Unless there is another good way to
>>> express a similar concept during a write operation, I think going forward
>>> with this would be OK.
>>>
>>> On Mon, Nov 15, 2021 at 10:44 AM Ryan Blue <b...@tabular.io> wrote:
>>>
>>>> The proposed feature is to be able to pass options through SQL as you
>>>> would when using the DataFrameReader API, so it would work for all
>>>> sources that support read options. Read options are part of the DSv2
>>>> API; there just isn't a way to pass options when using SQL. The PR also
>>>> has a non-Iceberg example: being able to customize some JDBC source
>>>> behaviors per query (e.g., fetchSize) rather than globally in the
>>>> table's options.
>>>>
>>>> The proposed syntax is odd, but I think that's an artifact of Spark
>>>> introducing read options that aren't a normal part of SQL. It seems
>>>> reasonable to me to pass them through a hint.
>>>>
>>>> On Mon, Nov 15, 2021 at 2:18 AM Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>>>
>>>>> Interesting.
>>>>>
>>>>> What is this going to add on top of support for Apache Iceberg
>>>>> <https://www.dremio.com/data-lake/apache-iceberg/>? Will it be in line
>>>>> with support for Hive ACID tables or Delta Lake?
>>>>>
>>>>> HTH
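To make the JDBC example above concrete, here is a hedged sketch of the hint applied to a per-query fetchSize, following the OPTIONS form shown earlier in the thread. The table name is hypothetical, and whether `fetchSize` is accepted in exactly this spelling through the hint is an assumption, not something confirmed in the thread:

```sql
-- Hypothetical sketch: tune the JDBC fetch size for this one query via the
-- proposed OPTIONS hint, instead of setting it globally in the table's options.
SELECT * FROM jdbc_db.orders /*+ OPTIONS('fetchSize'='1000') */
```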
>>>>> On Mon, 15 Nov 2021 at 01:56, Zhun Wang <wangzhun6...@gmail.com> wrote:
>>>>>
>>>>>> Hi dev,
>>>>>>
>>>>>> We are discussing Support Dynamic Table Options for Spark SQL
>>>>>> (https://github.com/apache/spark/pull/34072). We are currently not
>>>>>> sure whether the syntax makes sense, and would like to know if there
>>>>>> is other feedback or opinion on this.
>>>>>>
>>>>>> I would appreciate any feedback.
>>>>>>
>>>>>> Thanks.
>>>>>
>>>>
>>>> --
>>>> Ryan Blue
>>>> Tabular