Hyukjin Kwon created SPARK-21394:
------------------------------------

             Summary: Reviving broken callable objects in UDF in PySpark
                 Key: SPARK-21394
                 URL: https://issues.apache.org/jira/browse/SPARK-21394
             Project: Spark
          Issue Type: Bug
          Components: PySpark
    Affects Versions: 2.2.0, 2.3.0
            Reporter: Hyukjin Kwon

SPARK-19161 inadvertently broke the use of callable objects as UDFs in Python, as shown below:

{code}
>>> from pyspark.sql import functions
>>> class F(object):
...     def __call__(self, x):
...         return x
...
>>> foo = F()
>>> foo(1)
1
>>> udf = functions.udf(foo)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../spark/python/pyspark/sql/functions.py", line 2142, in udf
    return _udf(f=f, returnType=returnType)
  File ".../spark/python/pyspark/sql/functions.py", line 2133, in _udf
    return udf_obj._wrapped()
  File ".../spark/python/pyspark/sql/functions.py", line 2090, in _wrapped
    @functools.wraps(self.func)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/functools.py", line 33, in update_wrapper
    setattr(wrapper, attr, getattr(wrapped, attr))
AttributeError: F instance has no attribute '__name__'
{code}

Note that the same code works in Spark 2.1:

{code}
>>> from pyspark.sql import functions
>>> class F(object):
...     def __call__(self, x):
...         return x
...
>>> foo = F()
>>> foo(1)
1
>>> udf = functions.udf(foo)
>>> spark.range(1).select(udf("id")).show()
+-----+
|F(id)|
+-----+
|    0|
+-----+
{code}
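
The traceback above shows Python 2.7's functools.wraps failing because an instance of F has no __name__ attribute. The following is a minimal sketch of one way a wrapper could tolerate such objects. It is not taken from the Spark code base and is not necessarily how the issue was or will be fixed; the helper name wrap_callable is hypothetical. It copies only the metadata attributes the callable actually has, then falls back to the class name, which matches the F(id) column name seen in the Spark 2.1 output.

```python
import functools


class F(object):
    """A callable object; its instances have no __name__ attribute."""

    def __call__(self, x):
        return x


def wrap_callable(func):
    """Hypothetical helper: wrap func without assuming it has __name__."""
    # Copy only the metadata attributes that func actually defines, so
    # functools.wraps never looks up a missing attribute.
    assigned = tuple(
        attr for attr in functools.WRAPPER_ASSIGNMENTS if hasattr(func, attr)
    )

    @functools.wraps(func, assigned=assigned)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)

    # Fall back to the class name for objects that only define __call__.
    wrapper.__name__ = getattr(func, "__name__", func.__class__.__name__)
    return wrapper


foo = F()
wrapped = wrap_callable(foo)
```

With this sketch, wrapped(1) calls through to foo, and wrapped.__name__ is taken from the class F, while ordinary functions keep their own names.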