[GitHub] spark issue #21650: [SPARK-24624][SQL][PYTHON] Support mixture of Python UDF...

BryanCutler Tue, 03 Jul 2018 13:46:18 -0700

Github user BryanCutler commented on the issue:

    https://github.com/apache/spark/pull/21650
  
    I had an idea for a slightly different approach. Would it be possible to 
"promote" the regular `udf` to a `pandas_udf`?  By this I mean wrapping the 
function using `apply()` so that it takes `pd.Series` as inputs and returns 
another `pd.Series`.  Then we can send the entire mix of `udf`s and `pandas_udf`s 
to the worker in one shot, instead of in separate evaluations.  Since the user is 
already using `pandas_udf`s, we know that the worker supports them, and I think 
the performance would be much better.  Are there any downsides or issues with 
doing it this way?
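
To make the "promotion" idea concrete, here is a minimal pandas-only sketch of such a wrapper. The helper name `promote_to_series_func` is hypothetical and is not part of Spark or of the PR. It only illustrates lifting a row-at-a-time function so that it accepts and returns `pd.Series`. Registering the result with `pandas_udf` and handling return types are left out.

```python
import pandas as pd


def promote_to_series_func(scalar_func):
    """Hypothetical helper: lift a row-at-a-time function to work on pd.Series.

    This is an illustration only, not the approach implemented in Spark.
    """
    def wrapped(*columns):
        if len(columns) == 1:
            # Single input column: Series.apply calls scalar_func per element.
            return columns[0].apply(scalar_func)
        # Several input columns: walk them in lockstep, one row at a time.
        values = [scalar_func(*row) for row in zip(*columns)]
        return pd.Series(values, index=columns[0].index)
    return wrapped


# A regular row-at-a-time function, as a user would write for `udf`.
def plus_one(x):
    return x + 1


# Another one taking two arguments.
def add(a, b):
    return a + b


plus_one_series = promote_to_series_func(plus_one)
add_series = promote_to_series_func(add)

out_single = plus_one_series(pd.Series([1, 2, 3]))
out_pair = add_series(pd.Series([1, 2]), pd.Series([10, 20]))
print(out_single.tolist())  # [2, 3, 4]
print(out_pair.tolist())    # [11, 22]
```

The wrapped function still calls the original Python function once per row, so any speedup would have to come from the single batched transfer to the worker rather than from vectorized computation.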



