Github user BryanCutler commented on the issue:

    https://github.com/apache/spark/pull/21650
  
    I had an idea for a slightly different approach. Would it be possible to 
"promote" the regular `udf` to a `pandas_udf`?  By this I mean wrapping the 
function with `apply()` so that it takes `pd.Series` as inputs and returns 
another `pd.Series`.  Then we could send the entire mix of `udf`s and `pandas_udf`s 
to the worker in one shot, instead of in separate evaluations.  Since the user is 
already using `pandas_udf`s, we know that the worker supports them, and I think 
the performance would be much better.  Are there any downsides or issues with 
doing it this way?
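
To make the "promote" idea concrete, here is a minimal sketch in plain pandas. It is only an illustration, not code from the pull request. The helper name `promote_to_series_func` is hypothetical, and handling several input columns by zipping them is an assumption, since `Series.apply()` only covers the single-column case.

```python
import pandas as pd


def promote_to_series_func(row_func):
    """Wrap a row-at-a-time function so it maps pd.Series inputs to a pd.Series.

    Hypothetical helper for illustration only; not part of Spark.
    """
    def series_func(*cols):
        if len(cols) == 1:
            # Single input column: Series.apply calls row_func once per element.
            return cols[0].apply(row_func)
        # Several input columns (assumed approach): call row_func once per row
        # and keep the index of the first column.
        values = [row_func(*args) for args in zip(*cols)]
        return pd.Series(values, index=cols[0].index)
    return series_func


# A regular row-at-a-time function, as a user would pass to `udf`.
def plus_one(x):
    return x + 1


plus_one_series = promote_to_series_func(plus_one)
add_series = promote_to_series_func(lambda a, b: a + b)

result_single = plus_one_series(pd.Series([1, 2, 3]))
result_multi = add_series(pd.Series([1, 2]), pd.Series([10, 20]))
```

In Spark, a function wrapped like this could presumably be registered through `pandas_udf` with the return type of the original `udf`. The function is still called once per row inside the worker, so any gain would come from evaluating everything in one pass rather than from vectorizing the function itself.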

