[ https://issues.apache.org/jira/browse/SPARK-21394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-21394:
------------------------------------

    Assignee: Apache Spark

> Reviving broken callable objects in UDF in PySpark
> --------------------------------------------------
>
>                 Key: SPARK-21394
>                 URL: https://issues.apache.org/jira/browse/SPARK-21394
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 2.2.0, 2.3.0
>            Reporter: Hyukjin Kwon
>            Assignee: Apache Spark
>
> After SPARK-19161, we happened to break callable objects as UDFs in Python as below:
> {code}
> >>> from pyspark.sql import functions
> >>> class F(object):
> ...     def __call__(self, x):
> ...         return x
> ...
> >>> foo = F()
> >>> foo(1)
> 1
> >>> udf = functions.udf(foo)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File ".../spark/python/pyspark/sql/functions.py", line 2142, in udf
>     return _udf(f=f, returnType=returnType)
>   File ".../spark/python/pyspark/sql/functions.py", line 2133, in _udf
>     return udf_obj._wrapped()
>   File ".../spark/python/pyspark/sql/functions.py", line 2090, in _wrapped
>     @functools.wraps(self.func)
>   File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/functools.py", line 33, in update_wrapper
>     setattr(wrapper, attr, getattr(wrapped, attr))
> AttributeError: F instance has no attribute '__name__'
> {code}
> Note that this works in Spark 2.1 as below:
> {code}
> >>> from pyspark.sql import functions
> >>> class F(object):
> ...     def __call__(self, x):
> ...         return x
> ...
> >>> foo = F()
> >>> foo(1)
> 1
> >>> udf = functions.udf(foo)
> >>> spark.range(1).select(udf("id")).show()
> +-----+
> |F(id)|
> +-----+
> |    0|
> +-----+
> {code}
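
The traceback above points at functools.wraps trying to copy '__name__' from an instance that does not have one. The snippet below is a minimal sketch of one way to wrap such a callable without Spark, assuming it is acceptable to copy only the metadata attributes that exist and to fall back to the class name. The helper wrap_callable is a hypothetical name used for illustration; it is not taken from the Spark code base and is not necessarily how the issue was resolved. On Python 3 the stock functools.wraps already skips missing attributes, so the explicit filtering mainly matters on Python 2.7, as in the report.

```python
import functools


class F(object):
    """Callable object; its instances have no __name__ attribute."""

    def __call__(self, x):
        return x


def wrap_callable(func):
    """Hypothetical helper: wrap func, copying only metadata it actually has."""
    # Keep only the attributes present on func, so that functools.wraps
    # never calls getattr() for a missing one such as __name__.
    assigned = tuple(
        attr for attr in functools.WRAPPER_ASSIGNMENTS if hasattr(func, attr)
    )

    @functools.wraps(func, assigned=assigned)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)

    # Assumption: when the callable has no __name__, use its class name.
    if not hasattr(func, "__name__"):
        wrapper.__name__ = type(func).__name__
    return wrapper


wrapped = wrap_callable(F())
```

With this sketch, wrapped(1) behaves like foo(1) in the report, and wrapped.__name__ is taken from the class, which is consistent with the F(id) column name in the Spark 2.1 output.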