andygrove opened a new issue, #4617:
URL: https://github.com/apache/datafusion-comet/issues/4617

   ## What is the problem the feature request solves?
   
   Spark's array and map higher-order (lambda) functions currently have no 
Comet implementation, so any query using them falls back to Spark for the 
enclosing operator:
   
   - array: `transform`, `exists`, `forall`, `aggregate`/`reduce`, `array_sort` 
(with comparator), `zip_with`
   - map: `map_filter`, `transform_keys`, `transform_values`, `map_zip_with`
   
   These are hard to implement natively in Rust because they evaluate an 
arbitrary user lambda per element.
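   
   To make the per-element evaluation concrete, here is a minimal plain-Python model of a few of the functions listed above. It is only a sketch for illustration: it is not Comet or Spark code, the function bodies are assumptions that mirror the documented Spark SQL behavior in the simplest case, and it ignores Spark's null and three-valued-logic semantics.

```python
from typing import Any, Callable, Dict, List

# Illustrative model only: each function calls an arbitrary user lambda once
# per element (or per map entry), which is what a native engine cannot
# precompile ahead of time.

def transform(arr: List[Any], f: Callable[[Any], Any]) -> List[Any]:
    return [f(x) for x in arr]

def exists(arr: List[Any], pred: Callable[[Any], bool]) -> bool:
    return any(pred(x) for x in arr)

def forall(arr: List[Any], pred: Callable[[Any], bool]) -> bool:
    return all(pred(x) for x in arr)

def aggregate(arr, zero, merge, finish=lambda acc: acc):
    # Fold the array left to right, then apply the optional finish lambda.
    acc = zero
    for x in arr:
        acc = merge(acc, x)
    return finish(acc)

def zip_with(left: List[Any], right: List[Any], f) -> List[Any]:
    # The shorter array is padded with None (null) before the lambda runs.
    n = max(len(left), len(right))
    pad = lambda a: list(a) + [None] * (n - len(a))
    return [f(x, y) for x, y in zip(pad(left), pad(right))]

def map_filter(m: Dict[Any, Any], pred) -> Dict[Any, Any]:
    return {k: v for k, v in m.items() if pred(k, v)}

def transform_values(m: Dict[Any, Any], f) -> Dict[Any, Any]:
    return {k: f(k, v) for k, v in m.items()}

total_times_ten = aggregate([1, 2, 3], 0, lambda acc, x: acc + x, lambda acc: acc * 10)
```

   In every case the lambda is opaque to the engine, so a native implementation would need either its own lambda evaluator or a way to call back into Spark's.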
   
   ## Describe the potential solution
   
   The codegen dispatcher added for the regex/json families already admits 
`CodegenFallback` expressions, which includes all higher-order functions: 
`CometBatchKernelCodegen.canHandle` accepts them, and `CometCodegenHOFSuite` 
already proves `transform`/`filter`/`aggregate`/`exists` evaluate correctly 
inside the kernel when nested in a registered `ScalaUDF`.
   
   Wiring each HOF into the serde as a `CometCodegenDispatch` keeps a top-level 
HOF projection native: the Comet kernel runs Spark's own per-element evaluation, 
so results match Spark exactly. When the dispatcher is disabled, the expression 
falls back to Spark cleanly.
   
   ## Additional context
   
   Identified while reviewing the codegen-dispatch work in #4538. Related 
testing-convention follow-up: #4616.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

