peter-toth commented on PR #41677: URL: https://github.com/apache/spark/pull/41677#issuecomment-1601278842
> This hurts my brain thinking about probabilistic conditional evaluations, and I feel like the subexpression elimination logic is already overly complicated. If I wanted to just create subexpressions for anything that is definitely executed once and maybe executed one other time (regardless of how nested inside a CaseWhen or Coalesce operation), what do I even set the new setting to? Got it, thanks for your feedback. If `conditionalEvalCount` is considered is an overkill then I can revert it `conditionalUseCount` or a simple boolean flag. BTW, I think with this PR the `ExpressionStats` calculation logic becomes much simpler than it was before (especially if we revert to `conditionalEvalCount` or a boolean flag), the `getCommonSubexpressions()` method is what became a bit more complicated. Currently the new config is used as `conditionalEvalCount >= <config value>` so you could use a very small value to behave the same way as `conditionalUseCount > 0`. Or we can change the config semantics to `>` and use 0 config value for the same effect. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
