westonpace commented on PR #13500:
URL: https://github.com/apache/arrow/pull/13500#issuecomment-1176780265

   The approach, if I'm understanding correctly, is to use C++ to make two 
passes through the plan (or maybe it's one pass).  The first pass gets all the 
UDFs out of the plan.  PyArrow then unpickles and registers those UDFs.  The 
second pass actually consumes the plan, using a registry that contains those 
unpickled functions.
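
   To make that flow concrete, here is a minimal pure-Python sketch of the 
two-pass idea.  It is not the PR's code and not the pyarrow API: the dict-based 
`plan`, the helper names, and the use of the standard `pickle` module on 
built-in functions are all assumptions made for illustration.

```python
import math
import pickle

# Hypothetical stand-in for a Substrait plan (an assumption, not the real
# format): embedded UDFs are pickled callables, and each step names the
# function it calls.
plan = {
    "udfs": {"my_abs": pickle.dumps(abs), "my_sqrt": pickle.dumps(math.sqrt)},
    "steps": ["my_abs", "my_sqrt"],
}


def extract_udfs(plan):
    """Pass 1: collect the pickled UDFs without executing the plan."""
    return dict(plan["udfs"])


def register_udfs(pickled_udfs):
    """Unpickle each UDF into a name -> callable registry."""
    return {name: pickle.loads(blob) for name, blob in pickled_udfs.items()}


def consume_plan(plan, registry, values):
    """Pass 2: run the plan, resolving function names through the registry."""
    for name in plan["steps"]:
        func = registry[name]
        values = [func(v) for v in values]
    return values


registry = register_udfs(extract_udfs(plan))
result = consume_plan(plan, registry, [-4.0, 9.0, -16.0])
print(result)
```

   The point of the sketch is only the ordering: the registry is fully 
populated before the consuming pass starts.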
   
   This wouldn't be my first approach.  I think I'd prefer adding another 
callback, like the consumer_factory, for UDF handling.  This would make it 
easier to handle situations where there are alternative UDF handlers or where, 
for example, a C++ or R user still wants to be able to run Python UDFs.  
However, I'm not opposed to this approach.  The end pyarrow interface to the 
user is still just "Substrait in -> data out", so if we wanted to move to a 
different approach in the future that would be fine.
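
   For comparison, the callback alternative could look roughly like the sketch 
below.  Again, this is illustrative only: `consume_plan`, `udf_handler`, both 
handler functions, and the dict-based `plan` are hypothetical names rather 
than existing Arrow APIs, and the standard `pickle` module stands in for 
whatever serialization a real handler would use.

```python
import math
import pickle

# Hypothetical stand-in for a Substrait plan with embedded pickled UDFs.
plan = {
    "udfs": {"my_abs": pickle.dumps(abs), "my_sqrt": pickle.dumps(math.sqrt)},
    "steps": ["my_abs", "my_sqrt"],
}


def pickle_udf_handler(name, payload):
    """Handler that treats the payload as a pickled Python callable."""
    return pickle.loads(payload)


def lookup_udf_handler(name, payload):
    """Alternative handler that ignores the payload and resolves by name."""
    known = {"my_abs": abs, "my_sqrt": math.sqrt}
    return known[name]


def consume_plan(plan, values, udf_handler):
    """Single pass: ask the supplied callback for each UDF as it is needed."""
    for name in plan["steps"]:
        func = udf_handler(name, plan["udfs"][name])
        values = [func(v) for v in values]
    return values


via_pickle = consume_plan(plan, [-4.0, 9.0, -16.0], pickle_udf_handler)
via_lookup = consume_plan(plan, [-4.0, 9.0, -16.0], lookup_udf_handler)
print(via_pickle, via_lookup)
```

   Here the consumer never needs to know how a UDF is materialized, so 
swapping in a different handler does not change the consuming code.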

