attilapiros commented on pull request #34693:
URL: https://github.com/apache/spark/pull/34693#issuecomment-984829159
> @attilapiros Finding in the query where the WITH piece stops, then where
> the SELECT begins is where I found the place to split.
Thanks, I see that. But it is still really hard to automate, so I think
what Peter came up with is the best we have right now.
I was thinking about how to avoid `SELECT * FROM $table WHERE 1=0`. One
idea was to replace every `SELECT` (ignoring case) with `SELECT top(0)`; that
could be done even inside string literals, as it does not change the schema.
But if `top` is already used somewhere in the query, this ugly hack fails. And
this is just one part of the problem of getting the schema without running the
query. The other part (in `JDBCRDD.compute()`), where the partitioning and the
pushed-down group by are handled, is even harder to crack.
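
To make the failure mode concrete, here is a minimal sketch of the two approaches discussed above. It is not Spark's code: the function names `schema_probe_query` and `naive_top0_rewrite` are hypothetical, and the rewrite is only the regex replacement described in this comment.

```python
import re


def schema_probe_query(table: str) -> str:
    """Hypothetical helper: wrap a table (or subquery) so that no rows come back."""
    return f"SELECT * FROM {table} WHERE 1=0"


def naive_top0_rewrite(query: str) -> str:
    """Hypothetical helper: replace every SELECT keyword, ignoring case.

    The word boundaries keep identifiers such as `selected` untouched,
    but nothing here understands SQL, so an existing TOP clause is not detected.
    """
    return re.sub(r"\bselect\b", "SELECT top(0)", query, flags=re.IGNORECASE)


# The probe query that the comment would like to avoid.
probe = schema_probe_query("people")

# A simple query is rewritten as intended.
simple = naive_top0_rewrite("select name from people")

# A query that already uses TOP ends up with two TOP clauses, which is malformed.
broken = naive_top0_rewrite("SELECT TOP 5 name FROM people")

print(probe)
print(simple)
print(broken)
```

The last rewritten query carries both `top(0)` and the original `TOP 5`, which is why the replacement cannot be applied blindly.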
So based on this, LGTM.
cc @viirya, @HyukjinKwon
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]