sigmod commented on pull request #32298:
URL: https://github.com/apache/spark/pull/32298#issuecomment-1075524713


   > But I think that an extra shuffle could mean performance degradation
   > in case of scalar subqueries (CTEs returning only one row).
   
   Is it still way better than running the scalar subqueries over the same 
table multiple times?
   I'm more worried about the complexity (i.e., the cognitive overhead of 
pattern matching) that CommonSubqueries and CommonSubqueriesExec introduce. 
E.g., iiuc, a logical rule that optimizes scalar subqueries won't be able to 
traverse into the subqueries inside CommonSubqueries?
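
To make the traversal concern concrete, here is a toy sketch in Python. It is not Catalyst code and not the PR's implementation. It assumes, hypothetically, that a wrapper node keeps its subquery plans in a separate field rather than among its children. `Plan`, `with_children`, `transform`, and `optimize_scalar_subquery` are made-up names for illustration; only the name `CommonSubqueries` comes from the discussion.

```python
class Plan:
    """Toy plan node: a generic traversal only ever looks at `children`."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

    def with_children(self, children):
        return Plan(self.name, children)

    def transform(self, rule):
        # Bottom-up rewrite: rebuild from rewritten children, then apply the rule.
        rewritten = self.with_children([c.transform(rule) for c in self.children])
        return rule(rewritten)


class CommonSubqueries(Plan):
    """Hypothetical wrapper: subquery plans live in a field, not in `children`."""

    def __init__(self, subqueries, child):
        super().__init__("CommonSubqueries", [child])
        self.subqueries = list(subqueries)  # invisible to Plan.transform

    def with_children(self, children):
        return CommonSubqueries(self.subqueries, children[0])


def optimize_scalar_subquery(node):
    # Stand-in for a logical rule that pattern matches on scalar subqueries.
    if node.name == "ScalarSubquery":
        return Plan("OptimizedScalarSubquery", node.children)
    return node


hidden = Plan("ScalarSubquery")   # held inside the wrapper's field
visible = Plan("ScalarSubquery")  # reachable through ordinary children
plan = CommonSubqueries([hidden], Plan("Project", [visible]))

result = plan.transform(optimize_scalar_subquery)
reached_name = result.children[0].children[0].name
hidden_name = result.subqueries[0].name
print(reached_name, hidden_name)
```

Under these assumptions the rule rewrites the subquery reachable through `children` but never sees the one stored in the wrapper's field, so every such rule would need an extra case for the wrapper node.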


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


