andygrove commented on issue #4500: URL: https://github.com/apache/datafusion-comet/issues/4500#issuecomment-4835546666
**Item #2 (UnaryPositive serde on Spark 3.4 / 3.5): investigated, no change needed.**

The premise that `+col` reaches serde on Spark 3.4 / 3.5 does not hold. The default optimizer batch includes `RemoveDispensableExpressions`, which unconditionally rewrites `UnaryPositive(child) => child` (`sql/catalyst/.../optimizer/expressions.scala`), so `UnaryPositive` is stripped before Comet ever sees the plan. This mirrors the Spark 4.0+ path, where it is `RuntimeReplaceable` and removed by the optimizer.

Verified empirically: `SELECT +_1, +_2, +_3, +_4 FROM tbl` runs fully natively on Spark 3.5 with no code change, even with `spark.comet.exec.scalaUDF.codegen.enabled=false`. The enclosing projection stays native because the `+` is already gone by serde time.

Consequently the `CometUnaryPositive` codegen-dispatch registration added in #4538 is effectively dead code in normal queries: it is only reachable if a user disables `RemoveDispensableExpressions` via `spark.sql.optimizer.excludedRules`, and even then only when the codegen dispatcher is enabled. No correctness gap exists, so no serde change is required.
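To illustrate the reasoning above, here is a minimal toy model in plain Python. It is not Spark or Comet code: the classes `Column`, `UnaryPositive`, and `Add`, and the helpers `remove_dispensable`, `node_types`, and `plan_seen_by_serde`, are hypothetical names invented for this sketch. It only shows why a serde pass that runs after a `UnaryPositive(child) => child` rewrite never encounters `UnaryPositive`, unless that rewrite is excluded.

```python
from dataclasses import dataclass


# Hypothetical, simplified expression nodes (not Spark's Catalyst classes).
@dataclass(frozen=True)
class Column:
    name: str


@dataclass(frozen=True)
class UnaryPositive:
    child: object


@dataclass(frozen=True)
class Add:
    left: object
    right: object


def remove_dispensable(expr):
    """Toy stand-in for the optimizer rewrite UnaryPositive(child) => child."""
    if isinstance(expr, UnaryPositive):
        return remove_dispensable(expr.child)
    if isinstance(expr, Add):
        return Add(remove_dispensable(expr.left), remove_dispensable(expr.right))
    return expr


def node_types(expr):
    """Collect the names of all node types a later serde pass would visit."""
    if isinstance(expr, UnaryPositive):
        return {"UnaryPositive"} | node_types(expr.child)
    if isinstance(expr, Add):
        return {"Add"} | node_types(expr.left) | node_types(expr.right)
    return {type(expr).__name__}


def plan_seen_by_serde(expr, excluded_rules=()):
    """Apply the rewrite unless the rule name is in the (assumed) exclusion list."""
    if "RemoveDispensableExpressions" in excluded_rules:
        return expr
    return remove_dispensable(expr)


# Models `+_1 + (+(+_2))`.
expr = Add(UnaryPositive(Column("_1")), UnaryPositive(UnaryPositive(Column("_2"))))
default_plan = plan_seen_by_serde(expr)
excluded_plan = plan_seen_by_serde(expr, excluded_rules=("RemoveDispensableExpressions",))
```

In this sketch `default_plan` contains no `UnaryPositive` node, while `excluded_plan` still does, which corresponds to the only case in which the dispatch registration would be reachable.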
