andygrove opened a new pull request, #4635:
URL: https://github.com/apache/datafusion-comet/pull/4635

   ## Which issue does this PR close?
   
   N/A. Tier 2 of expanding JVM codegen dispatch coverage (follow-on to the 
tier 1 PR).
   
   ## Rationale for this change
   
   `mask` and `map` are scalar expressions with supported output types that 
were falling back to Spark even though they are eligible for the codegen 
dispatch path (which runs Spark's own `doGenCode` inside the Comet pipeline for 
Spark-exact results).
   
   ## What changes are included in this PR?
   
   * `mask` (`Mask`): registered as `CometCodegenDispatch`.
   * `map` (`CreateMap`): registered as `CometCodegenDispatch`.
   
   `docs/source/user-guide/latest/expressions.md` flips both from Planned to 
Supported.
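   
   As a rough illustration of what a "one-line registration" amounts to, the sketch below models a class-keyed lookup table with a Spark fallback. Every type here is a hypothetical stand-in defined locally; it is not Comet's or Spark's actual API, and `CodegenDispatch` merely stands in for `CometCodegenDispatch`.

   ```scala
   object RegistrySketch {
     // Hypothetical stand-ins for Spark expression classes (not the real ones).
     sealed trait Expr
     final case class Mask(input: String) extends Expr
     final case class CreateMap(children: Seq[String]) extends Expr
     final case class Unregistered(name: String) extends Expr

     // Hypothetical handlers: dispatch through codegen, or leave the expression to Spark.
     sealed trait Handler
     case object CodegenDispatch extends Handler // stands in for CometCodegenDispatch
     case object SparkFallback extends Handler

     // One entry per expression class; adding an expression is one extra line here.
     val registry: Map[Class[_ <: Expr], Handler] = Map(
       classOf[Mask] -> CodegenDispatch,
       classOf[CreateMap] -> CodegenDispatch
     )

     // Expressions without an entry fall back to Spark.
     def handlerFor(e: Expr): Handler =
       registry.getOrElse(e.getClass, SparkFallback)
   }
   ```

   In this model, `RegistrySketch.handlerFor(RegistrySketch.Mask("abc"))` resolves to the dispatch handler, while an expression with no entry resolves to the fallback.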
   
   This PR is scoped down from the originally planned tier 2 set after empirical 
testing. The following expressions are deferred because they need more than a 
one-line registration, and I would rather land them with proper handling:
   
   * `base64` / `encode`: on Spark 4.x these lower to `StaticInvoke` (codec 
object), not the `Base64` / `Encode` case classes, so they need the 
`StaticInvoke` allowlist in `statics.scala` extended (and the path differs 
across Spark versions).
   * `split_part`: rewrites to `element_at(StringSplitSQL(...))`. Dispatching 
`StringSplitSQL` and then composing it with `element_at` triggered a native panic 
(`Arrays with inconsistent types passed to MutableArrayData`), so it needs 
investigation first.
   * `array_prepend`: `ArrayPrepend` only exists in Spark 3.5+, so it needs the 
version-specific expression map rather than the shared one.
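   
   To make the shared versus version-specific split concrete, here is a minimal sketch under stated assumptions. String names stand in for expression classes, and `shared`, `versionSpecific`, and `registryFor` are hypothetical names, not Comet's. A real build would more likely make this split at compile time (for example, per-version source directories), since a class that does not exist in a given Spark version cannot be referenced; the runtime check below only models the resulting lookup.

   ```scala
   object VersionedRegistrySketch {
     // Hypothetical model: expression names stand in for expression classes.
     // Entries valid on every supported Spark version.
     val shared: Map[String, String] = Map(
       "Mask" -> "codegen-dispatch",
       "CreateMap" -> "codegen-dispatch"
     )

     // Entries for expressions that only exist from a given Spark version onward.
     // ArrayPrepend is gated on 3.5+, as stated in the PR description.
     def versionSpecific(major: Int, minor: Int): Map[String, String] =
       if (major > 3 || (major == 3 && minor >= 5)) Map("ArrayPrepend" -> "codegen-dispatch")
       else Map.empty

     // The effective registry is the shared map overlaid with the version-specific one.
     def registryFor(major: Int, minor: Int): Map[String, String] =
       shared ++ versionSpecific(major, minor)
   }
   ```

   With this model, `registryFor(3, 4)` has no `ArrayPrepend` entry, while `registryFor(3, 5)` and `registryFor(4, 0)` do.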
   
   ## How are these changes tested?
   
   New Comet SQL file tests `string/mask.sql` and `map/create_map.sql` run 
under `CometSqlFileTestSuite` and pass (native execution plus result match 
against Spark). Both exercise column inputs, literals, nulls, and the 
multi-argument forms.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

