[GitHub] [spark] peter-toth commented on pull request #41677: [WIP][SPARK-35564][SQL] Improve subexpression elimination

via GitHub Wed, 21 Jun 2023 10:21:05 -0700


peter-toth commented on PR #41677:
URL: https://github.com/apache/spark/pull/41677#issuecomment-1601278842


   > This hurts my brain thinking about probabilistic conditional evaluations, 
and I feel like the subexpression elimination logic is already overly 
complicated. If I wanted to just create subexpressions for anything that is 
definitely executed once and maybe executed one other time (regardless of how 
nested inside a CaseWhen or Coalesce operation), what do I even set the new 
setting to?
   
   Got it, thanks for your feedback. If `conditionalEvalCount` is considered is 
an overkill then I can revert it `conditionalUseCount` or a simple boolean 
flag. BTW, I think with this PR the `ExpressionStats` calculation logic becomes 
much simpler than it was before (especially if we revert to 
`conditionalEvalCount` or a boolean flag), the `getCommonSubexpressions()` 
method is what became a bit more complicated. 
   
   Currently the new config is used as `conditionalEvalCount >= <config value>` 
so you could use a very small value to behave the same way as 
`conditionalUseCount > 0`. Or we can change the config semantics to `>` and use 
0 config value for the same effect.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[GitHub] [spark] peter-toth commented on pull request #41677: [WIP][SPARK-35564][SQL] Improve subexpression elimination

Reply via email to