andygrove opened a new pull request, #4473:
URL: https://github.com/apache/datafusion-comet/pull/4473

   ## Which issue does this PR close?
   
   Closes #.
   
   ## Rationale for this change
   
   Continuation of the per-category expression audit. Same pattern as #4470 
(json), #4469 (struct), #4461 (string), and earlier audits, using the updated 
`audit-comet-expression` skill in #4468 (now also covers Spark 4.1.1).
   
   ## What changes are included in this PR?
   
   ### Support-doc audit notes
   
   Add per-version audit sub-bullets to `concat`, `reverse`, and `size` in 
`docs/source/contributor-guide/spark_expressions_support.md`. Spark `Concat` 
and `Reverse` widen `StringType` inputs to 
`StringTypeWithCollation(supportsTrimCollation = true)` in 4.0; `Size` is 
byte-for-byte identical across all four versions.
   
   ### Support-level consistency fix (in `strings.scala`)
   
   - `CometConcat`: relabel the non-`StringType` branch from 
`Incompatible(Some(...))` to `Unsupported(Some(...))`. Spark accepts 
`BinaryType` and `ArrayType`, but Comet has no native path for either, so the 
user-observable effect is a fallback, not a wrong result. The reason string is 
now exposed via `getUnsupportedReasons` (rather than `getIncompatibleReasons`), 
and the constant is now `private val` for parity with other serdes.
   
   ### Tracking issues filed for follow-up
   
   - #4471 `[Feature] support concat() for BinaryType and ArrayType inputs` — 
Spark accepts both, Comet only supports `StringType` natively.
   - #4472 `[Feature] support size() for MapType inputs` — Spark accepts both 
`ArrayType` and `MapType`, Comet only supports `ArrayType`.
   
   The existing #2763 already covers `reverse` on `array<binary>` and is 
referenced from the doc.
   
   ### Audit process
   
   Audited directly using the `audit-comet-expression` skill (4 Spark versions 
per #4468). One backing serde per function, no parallel subagents needed.
   
   ## How are these changes tested?
   
   - `./mvnw test -Dsuites="org.apache.comet.CometSqlFileTestSuite 
string/concat" -Dtest=none` (2 tests pass)
   - `./mvnw test -Dsuites="org.apache.comet.CometSqlFileTestSuite 
array/array_concat" -Dtest=none` (1 test passes; uses the existing 
`expect_fallback(CONCAT supports only string input parameters)` directive — the 
reason text is preserved by the relabel, so substring matching still holds)
   - `./mvnw test -Dsuites="org.apache.comet.CometSqlFileTestSuite array/size" 
-Dtest=none` (1 test passes)
   - `./mvnw test -Dsuites="org.apache.comet.CometSqlFileTestSuite 
string/reverse" -Dtest=none` (1 test passes)
   - `make core` succeeds with the serde change.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to