attilapiros commented on pull request #34693:
URL: https://github.com/apache/spark/pull/34693#issuecomment-984829159


   > @attilapiros Finding in the query where the WITH piece stops, then where 
the SELECT begins is where I found the place to split.
   
   Thanks, I see that. But it is still really hard to automate. So I think 
what Peter came up with is the best we have right now.
   
   I was thinking about how to avoid `SELECT * FROM $table WHERE 1=0`. One of 
my ideas was to replace every `SELECT` (ignoring case) with `SELECT top(0)`, 
which could be done even inside string literals because it does not change the 
schema. But if `top` is already used somewhere in the query, this ugly hack 
fails. And this is only one part of the problem of getting the schema without 
running the query. The other part (in `JDBCRDD.compute()`), where the 
partitioning and the pushed-down group by are handled, is even harder to crack.
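
To make the trade-off concrete, here is a rough sketch of the two schema-probe ideas mentioned above. It is plain Python string handling, not Spark code; the function names `probe_with_where_false` and `probe_with_top0` are made up for illustration, and the regex is only my assumption of what "replace all `SELECT` ignoring case" could look like.

```python
import re

# Matches the keyword SELECT in any letter case (hypothetical helper pattern).
_SELECT = re.compile(r"\bselect\b", re.IGNORECASE)


def probe_with_where_false(table_or_subquery: str) -> str:
    """Wrap the table (or subquery) so that no rows come back, only the schema."""
    return f"SELECT * FROM {table_or_subquery} WHERE 1=0"


def probe_with_top0(query: str) -> str:
    """Blindly rewrite every SELECT into SELECT TOP(0).

    This also touches SELECT inside string literals, which is harmless for
    the schema, but it breaks as soon as the query already uses TOP.
    """
    return _SELECT.sub("SELECT TOP(0)", query)


cte_query = "WITH t AS (select a from x) SELECT a FROM t"
rewritten = probe_with_top0(cte_query)

# The failure case: TOP is already present, so the rewrite yields invalid SQL.
broken = probe_with_top0("SELECT TOP(5) a FROM x")
```

With the CTE query, every `SELECT` gets the `TOP(0)` prefix, so the `WITH` part does not need to be split off. With the second query the result contains `TOP(0) TOP(5)`, which is the case where the hack falls apart.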
   
   So, based on this, LGTM.
   
   cc @viirya, @HyukjinKwon


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


