alamb opened a new issue, #11413:
URL: https://github.com/apache/datafusion/issues/11413

   ### Is your feature request related to a problem or challenge?
   
   Related to the discussion on 
https://github.com/apache/datafusion/discussions/11192 with @Xuanwo
   
   RisingWave has a library that automatically generates vectorized 
implementations of functions (i.e. ones that operate on Arrow arrays) from 
scalar implementations.
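
To make the idea concrete, here is a minimal sketch of lifting a scalar function into a null-aware columnar kernel. It is not arrow-udf's API: `Column` is a plain `Vec<Option<T>>` standing in for an Arrow array, and `vectorize_binary!` is a hypothetical declarative macro invented for illustration.

```rust
// Stand-in for a nullable Arrow array. This is NOT the arrow or arrow-udf API.
type Column<T> = Vec<Option<T>>;

/// Hypothetical macro: wraps a scalar binary function in a kernel that
/// walks two columns row by row and propagates nulls.
macro_rules! vectorize_binary {
    ($kernel:ident, $scalar:ident, $l:ty, $r:ty, $out:ty) => {
        fn $kernel(left: &Column<$l>, right: &Column<$r>) -> Column<$out> {
            assert_eq!(left.len(), right.len(), "columns must have equal length");
            left.iter()
                .zip(right.iter())
                .map(|(l, r)| match (l, r) {
                    // Both inputs present: call the scalar implementation.
                    (Some(l), Some(r)) => Some($scalar(l.clone(), r.clone())),
                    // Any null input yields a null output.
                    _ => None,
                })
                .collect()
        }
    };
}

/// The only code the function author writes: a plain scalar function.
fn gcd(mut a: i32, mut b: i32) -> i32 {
    while b != 0 {
        (a, b) = (b, a % b);
    }
    a
}

// Generates `gcd_kernel(&Column<i32>, &Column<i32>) -> Column<i32>`.
vectorize_binary!(gcd_kernel, gcd, i32, i32, i32);
```

Calling `gcd_kernel` on two equal-length columns applies `gcd` to each pair of non-null rows and returns null wherever either input is null. A real implementation would work on Arrow buffers and validity bitmaps rather than `Option` values.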
   
   The library is here: https://github.com/risingwavelabs/arrow-udf
   
   A blog post describing it is here: 
https://risingwave.com/blog/simplifying-sql-function-implementation-with-rust-procedural-macro/
   
   DataFusion uses macros to do something similar in binary.rs, but they are 
pretty hard to read and understand in my opinion: 
https://github.com/apache/datafusion/blob/7a23ea9bce32dc8ae195caa8ca052673031c06c9/datafusion/physical-expr/src/expressions/binary.rs#L118-L130
   
   One main benefit I can see to switching to 
https://github.com/risingwavelabs/arrow-udf is that we could then extend 
arrow-udf to support Dictionary and StringView and maybe other types to 
generate fast kernels for multiple different array layouts. 
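
As a rough illustration of why layout-specific kernels matter, the sketch below applies the same scalar function to a plain layout and to a dictionary-encoded layout. `DictColumn`, `kernel_plain`, and `kernel_dict` are hypothetical names using plain Rust collections, not arrow or arrow-udf types, and the dictionary shortcut assumes the scalar function is pure.

```rust
/// Hypothetical stand-in for a dictionary-encoded string array: each row
/// stores an index into `values`, or None for a null row.
struct DictColumn {
    keys: Vec<Option<usize>>,
    values: Vec<String>,
}

/// Scalar implementation written once by the function author.
fn starts_with_a(s: &str) -> bool {
    s.starts_with('a')
}

/// Plain layout: the scalar function runs once per non-null row.
fn kernel_plain(col: &[Option<String>], f: fn(&str) -> bool) -> Vec<Option<bool>> {
    col.iter().map(|v| v.as_deref().map(f)).collect()
}

/// Dictionary layout: the scalar function runs once per distinct value,
/// then each row looks up the precomputed result through its key.
fn kernel_dict(col: &DictColumn, f: fn(&str) -> bool) -> Vec<Option<bool>> {
    let per_value: Vec<bool> = col.values.iter().map(|v| f(v)).collect();
    col.keys.iter().map(|k| k.map(|i| per_value[i])).collect()
}
```

Both kernels produce the same result for equivalent data, but the dictionary version does work proportional to the number of distinct values plus one lookup per row, which is the kind of specialization a macro could generate per layout.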
   
   ### Describe the solution you'd like
   
   I think it would be great if someone could evaluate the feasibility of using 
the macros in https://github.com/risingwavelabs/arrow-udf to implement 
DataFusion's operations (and maybe eventually functions etc).
   
   
   
   ### Describe alternatives you've considered
   
   I suggest a POC that picks one or two functions (maybe string equality or 
regexp_match or something) and tries to use `arrow-udf`'s function macro 
instead. 
   
   Here is an example of how to use it: 
https://docs.rs/arrow-udf/0.3.0/arrow_udf/
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

