andygrove commented on issue #4500:
URL: 
https://github.com/apache/datafusion-comet/issues/4500#issuecomment-4835546666

   **Item #2 (UnaryPositive serde on Spark 3.4 / 3.5) — investigated, no change 
needed.**
   
   The premise that `+col` reaches serde on Spark 3.4 / 3.5 does not hold. The 
default optimizer batch includes `RemoveDispensableExpressions`, which 
unconditionally rewrites `UnaryPositive(child) => child` 
(`sql/catalyst/.../optimizer/expressions.scala`), so `UnaryPositive` is 
stripped before Comet ever sees the plan. This mirrors the Spark 4.0+ path, 
where `UnaryPositive` is `RuntimeReplaceable` and is likewise removed by the 
optimizer.
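
The rewrite can be illustrated with a toy model. The sketch below is not Spark code: `Column`, `UnaryPositive`, `Add`, `remove_dispensable`, and `contains_unary_positive` are hypothetical stand-ins for Catalyst's expression tree and the `RemoveDispensableExpressions` rule. It only shows why no `UnaryPositive` node is left by the time a serde layer would inspect the plan.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """Leaf expression: a reference to a named column (hypothetical stand-in)."""
    name: str


@dataclass(frozen=True)
class UnaryPositive:
    """Models `+child`; semantically a no-op."""
    child: object


@dataclass(frozen=True)
class Add:
    """Binary expression, included so the rewrite has to recurse."""
    left: object
    right: object


def remove_dispensable(expr):
    """Toy version of the rule: rewrite UnaryPositive(child) => child everywhere."""
    if isinstance(expr, UnaryPositive):
        # Drop the node and keep rewriting, so nested `+(+x)` also collapses.
        return remove_dispensable(expr.child)
    if isinstance(expr, Add):
        return Add(remove_dispensable(expr.left), remove_dispensable(expr.right))
    return expr


def contains_unary_positive(expr):
    """True if any UnaryPositive node remains in the tree."""
    if isinstance(expr, UnaryPositive):
        return True
    if isinstance(expr, Add):
        return contains_unary_positive(expr.left) or contains_unary_positive(expr.right)
    return False


# Models `+_1 + +(+_2)` before optimization.
plan = Add(UnaryPositive(Column("_1")), UnaryPositive(UnaryPositive(Column("_2"))))
optimized = remove_dispensable(plan)
```

After the rewrite, `optimized` holds only `Add` and `Column` nodes, which is the shape a downstream serde layer would see in this model.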
   
   Verified empirically: `SELECT +_1, +_2, +_3, +_4 FROM tbl` runs fully 
natively on Spark 3.5 with no code change, even with 
`spark.comet.exec.scalaUDF.codegen.enabled=false`. The enclosing projection 
stays native because the `+` is already gone by serde time.
   
   Consequently the `CometUnaryPositive` codegen-dispatch registration added in 
#4538 is effectively dead code in normal queries: it is only reachable if a 
user disables `RemoveDispensableExpressions` via 
`spark.sql.optimizer.excludedRules`, and even then only when the codegen 
dispatcher is enabled. No correctness gap exists, so no serde change is 
required.
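
The two reachability conditions can be written as a small predicate. This is a hedged sketch: `comet_unary_positive_reachable` is a hypothetical helper, not part of Comet or Spark, and the fully qualified rule name is an assumption based on the optimizer package path mentioned earlier in this comment.

```python
# Fully qualified rule name; assumed from the catalyst optimizer package layout.
RULE = "org.apache.spark.sql.catalyst.optimizer.RemoveDispensableExpressions"


def comet_unary_positive_reachable(excluded_rules: str, codegen_enabled: bool) -> bool:
    """Hypothetical helper: can a UnaryPositive node reach the codegen dispatch?

    excluded_rules  -- value of spark.sql.optimizer.excludedRules (comma-separated)
    codegen_enabled -- value of spark.comet.exec.scalaUDF.codegen.enabled
    """
    excluded = {name.strip() for name in excluded_rules.split(",") if name.strip()}
    rule_disabled = RULE in excluded
    # Both conditions must hold; otherwise the node is either already stripped
    # by the optimizer or the dispatcher never looks at it.
    return rule_disabled and codegen_enabled
```

With the default (empty) exclusion list the predicate is false regardless of the codegen setting, which matches the conclusion that the registration is unreachable in normal queries.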


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

