andygrove opened a new pull request, #4486:
URL: https://github.com/apache/datafusion-comet/pull/4486

   ## Which issue does this PR close?
   
   Closes #.
   
   ## Rationale for this change
   
   Continuation of the per-category expression audit. Same pattern as #4483 
(array), #4480 (predicate), #4479 (bitwise), #4478 (map), #4476 (hash), #4475 
(conditional), #4474 (misc), #4473 (collection), #4470 (json), #4469 (struct), 
using the updated `audit-comet-expression` skill in #4468.
   
   ## What changes are included in this PR?
   
   ### Support-doc audit notes
   
   Add per-version audit sub-bullets to all 60 supported math SQL function 
names (`%`, `*`, `+`, `-`, `/`, `abs`, all trig and inverse-trig, all 
log/exp/pow/sqrt/cbrt, `bin`, `hex`, `unhex`, `greatest`, `least`, 
`ceil`/`ceiling`/`floor`/`round`/`rint`/`sign`/`signum`, 
`div`/`mod`/`negative`/`positive`/`pi`, 
`try_add`/`try_divide`/`try_multiply`/`try_subtract`, `width_bucket`). Add 
4.1.1 audit line to the already-audited `factorial`. `rand` / `randn` 
cross-reference the misc audit (#4474); `shiftleft` cross-references the 
bitwise audit (#4479).
   
   Highlights from the cross-version review:
   
   - All trig and exp/log helpers are byte-for-byte stable across the four 
Spark versions; the only systemic change in 4.0 is the `NullIntolerant` trait 
-> `nullIntolerant: Boolean` field refactor.
   - 4.0 also adds the `DefaultStringProducingExpression` trait on `Bin` / 
`Hex` / `Unhex` and widens their `StringType` inputs to 
`StringTypeWithCollation`.
   - `UnaryPositive` becomes `RuntimeReplaceable` in 4.0; in 3.4/3.5 `+col` 
silently disables Comet for the enclosing projection because there is no 
`CometUnaryPositive` serde.
   - `CometRemainder` rejects `EvalMode.TRY`, so `try_mod` on Spark 4.0+ falls 
back (filed as #4484).
   - `Round` falls back for Float / Double because BigDecimal-via-toString 
rounding cannot be precisely matched.
   - `width_bucket` (Spark 3.5+) is wired via per-version `CometExprShim` 
rather than a `CometExpressionSerde`, bypassing the support-level framework 
(filed as #4485).
   - `hex` / `unhex` do not propagate Spark 4.x collation; covered by the 
umbrella #2190.
   
   ### Support-level consistency fixes
   
   None in this PR. The audit surfaced several `convert`-time `withInfo` 
fallbacks in `arithmetic.scala` (`CometAdd` / `CometSubtract` / `CometMultiply` 
/ `CometDivide` / `CometIntegralDivide` / `CometRemainder` / `CometUnaryMinus` 
/ `CometRound` rely on the default `Compatible(None)` and bail out from 
`convert` for several cases). Lifting them into `getSupportLevel` / 
`getUnsupportedReasons` is mechanical but touches an unusually load-bearing 
file; deliberately deferred to follow-ups to keep this audit PR low-risk.
   
   ### Tracking issues filed for follow-up
   
   - #4484 `try_mod` falls back to Spark because `CometRemainder` rejects 
`EvalMode.TRY`.
   - #4485 `width_bucket` bypasses the `CometExpressionSerde` framework.
   
   ### Audit process
   
   Audited using the `audit-comet-expression` skill (4 Spark versions per 
#4468), driven by 4 parallel agents covering arithmetic ops, trig + constants, 
log/exp/power/rounding, and misc (bin/hex/etc).
   
   ## How are these changes tested?
   
   - `make core` succeeds (no code changes; doc only).
   - Existing test coverage in `CometExpressionSuite`, 
`CometMathExpressionSuite`, `CometSqlFileTestSuite expressions/math/`, and 
`expressions/string/{hex,unhex}.sql` remains unchanged.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to