rdtr commented on PR #56180: URL: https://github.com/apache/spark/pull/56180#issuecomment-4577990492
@acruise Good question! To clarify, the intent here isn't to re-implement `ConstantFolding`. The thing I wanted to highlight is that `LimitPushDown` is itself the rule that synthesizes the `Add(le, oe)` node. It ends up emitting a `LocalLimit(Add(Literal, Literal), _)` that `BasicOperators` can't plan on its own, so it's effectively relying on a separate downstream rule to fold the expression it just created. That coupling isn't declared anywhere, so the plan it produces is only valid if `ConstantFolding` happens to run afterward. The fold is scoped narrowly to exactly the node this rule creates: when both operands are integer literals (the realistic case, since they come from `LIMIT N OFFSET M`), it emits the merged `Literal` directly instead of an `Add` that something else has to simplify later. Non-literal operands still fall through to `Add(le, oe)` unchanged, so behavior elsewhere is untouched. I did look at the alternative of teaching `BasicOperators` (the physical strategy) to evaluate a foldable limit expression instead. That seemed a bit less clean to me: it moves the same arithmetic into physical planning and spreads the limit-value handling across the logical/physical boundary, while the logical plan still carries a node that reads as unplannable. Folding at the point of creation felt more localized, but I'm happy to go that route instead if you'd prefer it. On "why would `ConstantFolding` ever be disabled" -- it's a public, supported lever (`spark.sql.optimizer.excludedRules`), and some custom optimizer pipelines / plugins exclude rules to force particular code paths in testing. That said, I think the more general motivation is just that it'd be nice for a rewrite rule to be able to finish a node it could trivially fold itself, rather than leaving it unplannable until another rule runs. The Gluten ticket is just where this happened to surface. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
