Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/20900
@mstewart141, just to be clear, the error:
```
ValueError: Function has keyword-only parameters or annotations, use getfullargspec() API which can support them
```
comes from the deprecated `getargspec` rather than `getfullargspec`, which your change already fixes. The current error looks like this:
```
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/.../spark/python/pyspark/sql/functions.py", line 2380, in pandas_udf
    return _create_udf(f=f, returnType=return_type, evalType=eval_type)
  File "/.../spark/python/pyspark/sql/udf.py", line 51, in _create_udf
    argspec = _get_argspec(f)
  File "/.../spark/python/pyspark/util.py", line 60, in _get_argspec
    argspec = inspect.getargspec(f)
  File "/usr/local/Cellar/python/2.7.14_3/Frameworks/Python.framework/Versions/2.7/lib/python2.7/inspect.py", line 818, in getargspec
    raise TypeError('{!r} is not a Python function'.format(func))
TypeError: <functools.partial object at 0x1117dccb0> is not a Python function
```
with the reproducer below:
```python
from functools import partial
from pyspark.sql.functions import pandas_udf
def test_func(a, b):
    return a + b
pandas_udf(partial(test_func, b='id'), "string")
```
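As a side note, and untested, I think wrapping the partial in a plain function avoids this for now, since the wrapper itself is a regular Python function that `getargspec` can inspect. Just a workaround sketch, not a fix:
```python
from functools import partial
from pyspark.sql.functions import pandas_udf

def test_func(a, b):
    return a + b

bound = partial(test_func, b='id')

# The lambda is an ordinary Python function, so the argspec inspection in
# _create_udf should succeed even though it only delegates to the partial.
wrapped = pandas_udf(lambda a: bound(a), "string")
```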
I think this should work just like it does with a normal udf:
```python
from functools import partial
from pyspark.sql.functions import udf
def test_func(a, b):
    return a + b
normal_udf = udf(partial(test_func, b='id'), "string")
df = spark.createDataFrame([["a"]])
df.select(normal_udf("_1")).show()
```
So, I think we should add support for callable objects / partial functions in Pandas UDFs. Would you be interested in filing JIRA(s) and proceeding? If you are busy, I am happy to do it as well.
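Roughly, and just as a sketch of one possible direction rather than the actual change, `_get_argspec` in `pyspark/util.py` could unwrap `functools.partial` objects before inspecting them. Note this sketch does not yet account for the arguments the partial has already bound:
```python
import functools
import inspect
import sys

def _get_argspec(f):
    # Sketch only: unwrap (possibly nested) functools.partial objects so the
    # existing inspection path sees a plain Python function underneath.
    # A real fix would also need to subtract the arguments the partial
    # has already bound.
    while isinstance(f, functools.partial):
        f = f.func
    if sys.version_info[0] < 3:
        return inspect.getargspec(f)
    return inspect.getfullargspec(f)
```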