timsaucer opened a new issue, #1017:
URL: https://github.com/apache/datafusion-python/issues/1017

   **Is your feature request related to a problem or challenge? Please describe 
what you are trying to do.**
   
   Suppose someone wants to build a library that is usable by both Rust and 
Python DataFusion users. They have written a UDF in Rust that implements the 
Rust DataFusion traits (whether scalar, aggregate, or window). Right now, if 
that user wants to use their UDF in `datafusion-python`, they need to expose a 
variety of methods that basically mimic the trait functions of the Rust code. 
For scalar UDFs the interface also requires some wrangling to convert between 
`ColumnarValue` and PyArrow objects.
   
   While it is possible to do this, writing and maintaining this code is 
likely to be error-prone and tedious for implementers.
   
   **Describe the solution you'd like**
   
   We have an established pattern of adding foreign table providers via an FFI 
interface using PyCapsule. This makes adding a TableProvider a very easy 
operation. In our example code, the function to expose a table provider is only 
6 lines of code and will likely require minimal maintenance.
   
   It would be nice to expose all of the varieties of user defined functions 
via FFI so that they follow the same established pattern and users can easily 
reuse their code.
   
   **Describe alternatives you've considered**
   
   I did a brief proof of concept where I used Python calls to the required 
functions. This did work, but it took quite a bit of code and I suspect it will 
be difficult to maintain.
   
   **Additional context**
   
   This may provide additional value in that it would get us much closer to 
being able to expose a `SessionContext` via FFI, which would benefit both the 
datafusion-ray and ballista projects.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

