rdtr commented on PR #56180:
URL: https://github.com/apache/spark/pull/56180#issuecomment-4577990492

   @acruise Good question! To clarify, the intent here isn't to re-implement 
`ConstantFolding`. The thing I wanted to highlight is that `LimitPushDown` is 
itself the rule that synthesizes the `Add(le, oe)` node. It ends up emitting a 
`LocalLimit(Add(Literal, Literal), _)` that `BasicOperators` can't plan on its 
own, so it's effectively relying on a separate downstream rule to fold the 
expression it just created. That coupling isn't declared anywhere, so the plan 
it produces is only valid if `ConstantFolding` happens to run afterward.
   
   The fold is scoped narrowly to exactly the node this rule creates: when both 
operands are integer literals (the realistic case, since they come from `LIMIT 
N OFFSET M`), it emits the merged `Literal` directly instead of an `Add` that 
something else has to simplify later. Non-literal operands still fall through 
to `Add(le, oe)` unchanged, so behavior elsewhere is untouched.
   
   I did look at the alternative of teaching `BasicOperators` (the physical 
strategy) to evaluate a foldable limit expression instead. That seemed a bit 
less clean to me: it moves the same arithmetic into physical planning and 
spreads the limit-value handling across the logical/physical boundary, while 
the logical plan still carries a node that reads as unplannable. Folding at the 
point of creation felt more localized, but I'm happy to go that route instead 
if you'd prefer it.
   
   On "why would `ConstantFolding` ever be disabled" -- it's a public, 
supported lever (`spark.sql.optimizer.excludedRules`), and some custom 
optimizer pipelines / plugins exclude rules to force particular code paths in 
testing. That said, I think the more general motivation is just that it'd be 
nice for a rewrite rule to be able to finish a node it could trivially fold 
itself, rather than leaving it unplannable until another rule runs. The Gluten 
ticket is just where this happened to surface.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to