andygrove opened a new pull request, #4544:
URL: https://github.com/apache/datafusion-comet/pull/4544
## Which issue does this PR close?
Part of #4418.
## Rationale for this change
`dayname` and `monthname` (Spark 4.0+) had no Comet support and forced a
fallback to Spark. While auditing the datetime functions for this gap, it also
became clear that several functions marked as unsupported in the coverage
document already execute in Comet: Spark rewrites them to expressions Comet
already supports before the plan reaches Comet, so the coverage document
understated what works.
This PR closes the `dayname` / `monthname` gap and corrects the
documentation for the rewrite-backed functions, so users are not misled into
thinking common functions like `to_timestamp` or `current_timestamp` are
unaccelerated.
## What changes are included in this PR?
This work was scaffolded with the `implement-comet-expression` skill.
- **`dayname` / `monthname`**: these are Spark 4.0+ only, so they cannot be
referenced from shared serde code. They are wired in the spark-4.0
`CometExprShim` and routed through the Arrow-direct codegen dispatcher
(`CometScalaUDF.emitJvmCodegenDispatch`), which runs Spark's own `doGenCode`
for exact parity (including locale handling). When the dispatcher is disabled
they fall back cleanly.
- **Documentation**: `spark_expressions_support.md` is updated to mark the
following as supported, each with a note explaining the path:
- `to_date` / `to_timestamp` / `to_timestamp_ntz` / `to_timestamp_ltz`:
rewrite to `Cast` (no format) or `GetTimestamp` (with format).
- `make_timestamp_ntz` / `make_timestamp_ltz` (6-argument form): rewrite
to `MakeTimestamp`. The 2-argument `(date, time)` form needs the Spark 4.1 TIME
type and still falls back.
- `current_date` / `current_timestamp` / `now` / `curdate`:
constant-folded to literals by Spark's `ComputeCurrentTime` rule before Comet
sees the plan.
## How are these changes tested?
New tests in `CometExpressionSuite` use `checkSparkAnswerAndOperator` to
assert each function executes fully in Comet (no fallback) and returns
Spark-identical results, plus a constant-folding assertion for `current_date` /
`current_timestamp` / `now`. The `dayname` / `monthname` test is gated to Spark
4.0+. Verified locally against the Spark 4.0 profile.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]