sigmod commented on pull request #32298:
URL: https://github.com/apache/spark/pull/32298#issuecomment-1076603246


   Thanks, @peter-toth!
   
   > I have a follow-up PR to support merging different filter predicates with OR,
   > I just didn't want to make this PR more complex
   > This PR wants to deal with that scope only. 
   
   Definitely. I was just saying `CTERelationRef` can be used in more general 
cases beyond non-deterministic CTE definitions. Let's not expand the scope of 
this PR.
   
   > @sigmod, how about doing this kind of transformation?
   
   It looks good to me. IIUC, you want to wrap columns with a struct so that 
you can execute it as a scalar subquery?
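   
   For illustration, a minimal sketch of that reading in toy Python (not Spark code). The two example subqueries, the sample rows, and the helper names `merged_subquery` and `get_struct_field` are all hypothetical. Two scalar subqueries over the same input are computed in one pass, returned as a single struct, and each consumer extracts its own field:

```python
# Toy model, not Spark code: rows are dicts and a "struct" is a plain tuple.
rows = [{"a": 1, "b": 10}, {"a": 3, "b": 20}]

def merged_subquery(rows):
    """Scan the input once and return both aggregates as one struct."""
    total_a = sum(r["a"] for r in rows)
    total_b = sum(r["b"] for r in rows)
    return (total_a / len(rows), total_b)  # fields: (avg_a, sum_b)

def get_struct_field(struct, ordinal):
    """Hypothetical stand-in for extracting one field of the merged result."""
    return struct[ordinal]

merged = merged_subquery(rows)       # evaluated once, shared by both consumers
avg_a = get_struct_field(merged, 0)  # replaces a subquery like (SELECT avg(a) FROM t)
sum_b = get_struct_field(merged, 1)  # replaces a subquery like (SELECT sum(b) FROM t)
```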
   
   > and adding a flag to cte CTERelationDef that it hosts a scalar query
   
   Sounds good to me. Will you add an "optimization" rule that adds such an 
"annotation" by looking at the plan holistically, e.g., when all consumers of 
a CTE simply pull out a field value?
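   
   A rough sketch of what such a rule could look like, again in toy Python rather than Catalyst. The classes `CTEDef` and `Consumer`, the `hosts_scalar_query` flag, and the `kind` labels are assumptions made for illustration only. The rule sets the flag only when every consumer of a definition just extracts a field:

```python
from dataclasses import dataclass

@dataclass
class CTEDef:
    cte_id: int
    hosts_scalar_query: bool = False  # hypothetical annotation flag

@dataclass
class Consumer:
    cte_id: int
    kind: str  # "field_extract" or "relation" (hypothetical labels)

def annotate_scalar_ctes(defs, consumers):
    """Flag each definition whose consumers all simply pull out a field value."""
    for d in defs:
        uses = [c for c in consumers if c.cte_id == d.cte_id]
        # A definition with no consumers is left unflagged.
        d.hosts_scalar_query = bool(uses) and all(
            c.kind == "field_extract" for c in uses
        )
    return defs

defs = annotate_scalar_ctes(
    [CTEDef(0), CTEDef(1)],
    [
        Consumer(0, "field_extract"),
        Consumer(0, "field_extract"),
        Consumer(1, "field_extract"),
        Consumer(1, "relation"),  # ordinary plan subtree reusing the CTE
    ],
)
```

   In this toy setup CTE 0 is flagged, while CTE 1 is not because one of its consumers is an ordinary relation reference.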
   
   I'm thinking of the following scenario for future improvements:
   - a non-subquery plan subtree can share the plan structure with scalar 
subqueries too
   - in this case, the CTE is reused by both subqueries and ordinary plan 
subtrees
   
   We might also want to make sure MergeSubqueries does not prevent such reuse 
opportunities down the road.
   
   > + changing WithCTEStrategy a bit to avoid extra shuffles in those cases as 
   > ReuseExchangeAndSubquery can insert ReusedSubqueryExec nodes (no need to 
   > insert ReusedExchangeExec).
   
   Will you rewrite the physical plan to change the consumer subqueries to 
GetStructField?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


