Re: Supports Dynamic Table Options for Spark SQL

2021-11-18 Thread Mich Talebzadeh
OK, on this let us dig a bit deeper, focusing on time travel queries (TTQ). The interest is in returning data as it appeared at a specific time, so the discussion is now about how to enable this. We can specify this by using a placeholder such as 'AS OF SYSTEM TIME' after the table name in a FROM SEL

Re: Supports Dynamic Table Options for Spark SQL

2021-11-16 Thread Wenchen Fan
It's useful to have a SQL API to specify table options, similar to the DataFrameReader API. However, I share the same concern from @Hyukjin Kwon and am not very comfortable with using hints to do it. In the PR, someone mentioned TVF. I think it's better than hints, but still has problems. For exa

Re: Supports Dynamic Table Options for Spark SQL

2021-11-16 Thread Mich Talebzadeh
This concept is explained here somehow. If this is true, why can we not just use SELECT * FROM VERSION AS OF

Re: Supports Dynamic Table Options for Spark SQL

2021-11-16 Thread Ryan Blue
Mich, time travel will use the newly added VERSION AS OF or TIMESTAMP AS OF syntax. On Tue, Nov 16, 2021 at 12:40 AM Mich Talebzadeh wrote: > As I stated before, hints are designed to direct the optimizer to choose > a certain query execution plan based on the specific criteria. > > > -- time tr
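To make the two clauses named above concrete, here is a minimal sketch that builds a time travel query string. It assumes the clause is written directly after the table name; the exact grammar is defined by the separate PR mentioned in this thread, not by this sketch. The helper name `time_travel_query` and its parameters are hypothetical and exist only for illustration.

```python
from typing import Optional


def time_travel_query(
    table: str,
    version: Optional[int] = None,
    timestamp: Optional[str] = None,
) -> str:
    """Build a SELECT that reads `table` as of a version or a timestamp.

    Hypothetical helper for illustration only. It assumes the
    VERSION AS OF / TIMESTAMP AS OF clause follows the table name.
    """
    # Exactly one of the two selectors must be given.
    if (version is None) == (timestamp is None):
        raise ValueError("specify exactly one of version or timestamp")
    if version is not None:
        return f"SELECT * FROM {table} VERSION AS OF {version}"
    return f"SELECT * FROM {table} TIMESTAMP AS OF '{timestamp}'"


by_version = time_travel_query("t", version=10963874102873)
by_time = time_travel_query("t", timestamp="2021-11-16 00:40:00")
```

The sketch interpolates its arguments directly into the SQL text, so it is only suitable for trusted input.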

Re: Supports Dynamic Table Options for Spark SQL

2021-11-16 Thread Mich Talebzadeh
As I stated before, hints are designed to direct the optimizer to choose a certain query execution plan based on the specific criteria. -- time travel SELECT * FROM t /*+ OPTIONS('snapshot-id'='10963874102873L') */ The alternative would be to specify time travel by creating a snapshot based on

Re: Supports Dynamic Table Options for Spark SQL

2021-11-15 Thread Hyukjin Kwon
My biggest concern with the syntax in hints is that Spark SQL's options can change results (e.g., CSV's header options) whereas hints are generally not designed to affect the external results if I am not mistaken. This is counterintuitive. I left the comment in the PR but what's the real benefit ov

Re: Supports Dynamic Table Options for Spark SQL

2021-11-15 Thread Nicholas Chammas
Side note about time travel: There is a PR to add VERSION/TIMESTAMP AS OF syntax to Spark SQL. On Mon, Nov 15, 2021 at 2:23 PM Ryan Blue wrote: > I want to note that I wouldn't recommend time traveling this way by using > the hint for `snapshot-id`. I

Re: Supports Dynamic Table Options for Spark SQL

2021-11-15 Thread Ryan Blue
I want to note that I wouldn't recommend time traveling this way by using the hint for `snapshot-id`. Instead, we want to add the standard SQL syntax for that in a separate PR. This is useful for other options that help a table scan perform better, like specifying the target split size. You're rig

Re: Supports Dynamic Table Options for Spark SQL

2021-11-15 Thread Mich Talebzadeh
I am looking at the hint and it appears to me (I stand to be corrected) that it is a single-table hint, as below: -- time travel SELECT * FROM t /*+ OPTIONS('snapshot-id'='10963874102873L') */ My assumption is that any view on this table will also benefit from this hint. This is not a hint to optimizer in a

Re: Supports Dynamic Table Options for Spark SQL

2021-11-15 Thread Russell Spitzer
I think since we probably will end up using this same syntax on write, this makes a lot of sense. Unless there is another good way to express a similar concept during a write operation I think going forward with this would be ok. On Mon, Nov 15, 2021 at 10:44 AM Ryan Blue wrote: > The proposed f

Re: Supports Dynamic Table Options for Spark SQL

2021-11-15 Thread Ryan Blue
The proposed feature is to be able to pass options through SQL like you would when using the DataFrameReader API, so it would work for all sources that support read options. Read options are part of the DSv2 API; there just isn’t a way to pass options when using SQL. The PR also has a non-Iceberg e
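As a rough illustration of what "passing options through SQL" means, the sketch below pulls the key/value pairs out of the `OPTIONS(...)` hint form quoted elsewhere in this thread. These are the same kind of pairs one would otherwise hand to `DataFrameReader.option(...)`. This is not Spark's parser and not the PR's implementation: the helper `parse_options_hint` and its regular expressions are assumptions made for this example, and they only handle single-quoted keys and values.

```python
import re
from typing import Dict

# Hypothetical patterns for the hint form shown in this thread:
#   /*+ OPTIONS('key'='value', ...) */
_HINT_RE = re.compile(r"/\*\+\s*OPTIONS\s*\((.*?)\)\s*\*/", re.IGNORECASE | re.DOTALL)
_PAIR_RE = re.compile(r"'([^']*)'\s*=\s*'([^']*)'")


def parse_options_hint(sql: str) -> Dict[str, str]:
    """Return the read options found in an OPTIONS hint, or {} if none.

    Illustration only: values are kept as raw strings, exactly as written.
    """
    match = _HINT_RE.search(sql)
    if match is None:
        return {}
    return dict(_PAIR_RE.findall(match.group(1)))


query = "SELECT * FROM t /*+ OPTIONS('snapshot-id'='10963874102873L') */"
options = parse_options_hint(query)
print(options)  # {'snapshot-id': '10963874102873L'}
```

Each resulting pair corresponds to one read option on the table scan, which is why the thread treats the hint as a SQL-side equivalent of the reader options rather than as an optimizer directive.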

Re: Supports Dynamic Table Options for Spark SQL

2021-11-15 Thread Mich Talebzadeh
Interesting. What is this going to add on top of support for Apache Iceberg? Will it be in line with support for Hive ACID tables or Delta Lake? HTH

Supports Dynamic Table Options for Spark SQL

2021-11-14 Thread Zhun Wang
Hi dev, We are discussing Support Dynamic Table Options for Spark SQL (https://github.com/apache/spark/pull/34072). We are not yet sure whether the syntax makes sense, and I would appreciate any feedback or opinions on it. Thanks.