[ 
https://issues.apache.org/jira/browse/SPARK-30473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-30473.
----------------------------------
    Resolution: Cannot Reproduce

> PySpark enum subclass crashes when used inside UDF
> --------------------------------------------------
>
>                 Key: SPARK-30473
>                 URL: https://issues.apache.org/jira/browse/SPARK-30473
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 2.4.4
>         Environment: Databricks Runtime 6.2 (includes Apache Spark 2.4.4, 
> Scala 2.11)
>            Reporter: Max Härtwig
>            Priority: Major
>
> PySpark enum subclass crashes when used inside a UDF.
>  
> Example:
> {code:python}
> from enum import Enum
> class Direction(Enum):
>     NORTH = 0
>     SOUTH = 1
> {code}
>  
> Working:
> {code:python}
> Direction.NORTH
> {code}
>  
> Crashing:
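> {code:python}
> from pyspark.sql.functions import udf
>
> @udf
> def fn(a):
>     Direction.NORTH  # referencing the enum inside the UDF triggers the crash
>     return ""
>
> # df is an existing DataFrame with a column "a"; withColumn is lazy, so the
> # failure only surfaces once an action runs the job
> df.withColumn("test", fn("a"))
> {code}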
>  
> Stacktrace:
> {noformat}
> SparkException: Job aborted due to stage failure: Task 0 in stage 9.0 failed 4 times, most recent failure: Lost task 0.3 in stage 9.0 (TID 235, 10.139.64.21, executor 0): org.apache.spark.api.python.PythonException: Traceback (most recent call last):
>   File "/databricks/spark/python/pyspark/serializers.py", line 182, in _read_with_length
>     return self.loads(obj)
>   File "/databricks/spark/python/pyspark/serializers.py", line 695, in loads
>     return pickle.loads(obj, encoding=encoding)
>   File "/databricks/python/lib/python3.7/enum.py", line 152, in __new__
>     enum_members = {k: classdict[k] for k in classdict._member_names}
> AttributeError: 'dict' object has no attribute '_member_names'
> {noformat}
>  
> I suspect the problem is in *python/pyspark/cloudpickle.py*. On line 586, in 
> the function *_save_dynamic_enum*, the attribute *_member_names* is removed 
> from the enum, yet the *Enum* class requires this attribute. As a result, 
> every Enum subclass crashes when used inside a UDF.
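
The AttributeError at the end of the stack trace can be produced without Spark, which may help narrow down where it comes from. The sketch below is only an illustration, not a confirmed diagnosis of this issue: it assumes the failure arises whenever the Enum metaclass is handed an ordinary dict as the class namespace instead of the special mapping it normally prepares for itself. The helper name rebuild_with_plain_dict is hypothetical.

```python
from enum import Enum, EnumMeta


class Direction(Enum):
    NORTH = 0
    SOUTH = 1


def rebuild_with_plain_dict():
    """Call the Enum metaclass directly with an ordinary dict namespace.

    This bypasses EnumMeta.__prepare__, which normally supplies the mapping
    that records member names while the class body executes.
    """
    try:
        EnumMeta("Direction", (Enum,), {"NORTH": 0, "SOUTH": 1})
    except AttributeError as exc:
        # The metaclass looks up _member_names on the namespace it was given.
        return exc
    return None


error = rebuild_with_plain_dict()
print(type(error).__name__, error)
```

If this is indeed the mechanism, the question is which side (driver or executor) rebuilds the class from a plain dict, for example because of differing cloudpickle or Python versions. That remains an assumption, given that the issue could not be reproduced.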



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

