andygrove opened a new issue, #4485:
URL: https://github.com/apache/datafusion-comet/issues/4485
## Describe the bug
`width_bucket` (Spark 3.5+) is wired through
`CometExprShim.versionSpecificExprToProtoInternal` rather than through a
`CometExpressionSerde` registered in `QueryPlanSerde.exprSerdeMap`. As a result:
- It bypasses the normal `getSupportLevel` / `getUnsupportedReasons` /
`getIncompatibleReasons` hooks, so it cannot report any incompatible or
unsupported cases.
- It is invisible to the auto-generated compatibility doc
(`docs/source/user-guide/compatibility.md`).
- It is invisible to the per-expression
`spark.comet.expression.<Name>.{enabled,allowIncompatible}` configs.
- The wiring is duplicated across four shim files (`spark-3.5`, `spark-4.0`,
`spark-4.1`, `spark-4.2`), so any future change has to be applied four times.
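
To make the per-expression config point concrete, here is a minimal sketch of the keys that would apply if `width_bucket` went through the normal serde path. The helper object is hypothetical, and using `WidthBucket` as the `<Name>` segment is an assumption based on the Spark expression class name, not something confirmed against Comet's config code.

```scala
// Hypothetical helper: builds the per-expression config keys following the
// spark.comet.expression.<Name>.{enabled,allowIncompatible} pattern above.
object ExpressionConfigKeys {
  private val Prefix = "spark.comet.expression"

  // Key that would toggle Comet support for one expression.
  def enabledKey(name: String): String = s"$Prefix.$name.enabled"

  // Key that would opt in to a known-incompatible implementation.
  def allowIncompatibleKey(name: String): String = s"$Prefix.$name.allowIncompatible"
}

// Assumed name segment for width_bucket (the Spark expression class name).
val widthBucketEnabled = ExpressionConfigKeys.enabledKey("WidthBucket")
val widthBucketAllowIncompat = ExpressionConfigKeys.allowIncompatibleKey("WidthBucket")
```

With the current shim wiring, setting either key would presumably have no effect on `width_bucket`, which is the gap described above.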
`width_bucket` also supports Spark's `YearMonthIntervalType` and
`DayTimeIntervalType`, but Comet's tests only cover `DoubleType`. The native
`SparkWidthBucket` declares the interval signatures, but with the current shim
wiring there is no way to mark them as `Unsupported` if a bug is found later.
Surfaced by the math-expressions audit (collection PR queue).
## Expected behavior
Move `width_bucket` to a `CometExpressionSerde[WidthBucket]` registered in
`QueryPlanSerde.mathExpressions`, matching the pattern used by every other math
expression. The serde can either accept all types and forward to the native
UDF, or branch on input types and call `Unsupported` for unsupported cases.
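
The second option (branching on input types) could look roughly like the sketch below. It is self-contained and uses simplified stand-in types: the `DataType` objects, `SupportLevel`, `ExpressionSerde`, and the one-field `WidthBucket` case class here are hypothetical models, not Comet's or Spark's real definitions. Marking the interval types as `Unsupported` is for illustration only; the actual serde might well accept them.

```scala
// Self-contained sketch with stand-in types. Comet's real CometExpressionSerde
// and Spark's real WidthBucket / DataType classes are richer than these models.
object WidthBucketSerdeSketch {
  sealed trait DataType
  case object DoubleType extends DataType
  case object YearMonthIntervalType extends DataType
  case object DayTimeIntervalType extends DataType
  case object StringType extends DataType

  sealed trait SupportLevel
  case object Compatible extends SupportLevel
  final case class Unsupported(reason: String) extends SupportLevel

  // Simplified model: only tracks the type of the value/min/max arguments.
  final case class WidthBucket(inputType: DataType)

  // Stand-in for the serde hook that the shim wiring currently bypasses.
  trait ExpressionSerde[T] {
    def getSupportLevel(expr: T): SupportLevel
  }

  object CometWidthBucket extends ExpressionSerde[WidthBucket] {
    override def getSupportLevel(expr: WidthBucket): SupportLevel =
      expr.inputType match {
        // The only input type the issue says is covered by tests.
        case DoubleType => Compatible
        // Illustrative choice: fall back to Spark for untested interval inputs.
        case t @ (YearMonthIntervalType | DayTimeIntervalType) =>
          Unsupported(s"width_bucket on $t is not covered by tests")
        case other =>
          Unsupported(s"width_bucket does not accept $other")
      }
  }
}
```

Because the decision lives in one `getSupportLevel` method, a single serde like this would replace the four duplicated shim entries, and the reason strings could feed the generated compatibility doc.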
## Additional context
- Shim location:
`spark/src/main/spark-3.5/org/apache/comet/shims/CometExprShim.scala` (plus
`spark-4.0`, `spark-4.1`, `spark-4.2`)
- Native UDF: datafusion-spark `SparkWidthBucket`, registered in
`native/core/src/execution/jni_api.rs`
- `width_bucket` is unsupported on Spark 3.4.3 (the function was added in
3.5).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]