Max Härtwig created SPARK-30473:
-----------------------------------

             Summary: PySpark enum subclass crashes when used inside UDF
                 Key: SPARK-30473
                 URL: https://issues.apache.org/jira/browse/SPARK-30473
             Project: Spark
          Issue Type: Bug
          Components: PySpark
    Affects Versions: 2.4.4
         Environment: Databricks Runtime 6.2 (includes Apache Spark 2.4.4, 
Scala 2.11)
            Reporter: Max Härtwig


Referencing a Python `Enum` subclass inside a PySpark UDF makes the job fail: the executor raises an `AttributeError` while deserializing the UDF.

 

Example:

 
{code:python}
from enum import Enum
class Direction(Enum):
    NORTH = 0
    SOUTH = 1
{code}
 

Working:

 
{code:python}
Direction.NORTH{code}
 

 

Crashing:

 
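{code:python}
from pyspark.sql.functions import udf

# df is assumed to be an existing DataFrame with a column "a"
@udf
def fn(a):
    Direction.NORTH  # merely referencing the enum triggers the failure
    return ""

df.withColumn("test", fn("a")){code}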
 

 

Stacktrace:

 
{noformat}
SparkException: Job aborted due to stage failure: Task 0 in stage 9.0 failed 4 times, most recent failure: Lost task 0.3 in stage 9.0 (TID 235, 10.139.64.21, executor 0): org.apache.spark.api.python.PythonException: Traceback (most recent call last):
  File "/databricks/spark/python/pyspark/serializers.py", line 182, in _read_with_length
    return self.loads(obj)
  File "/databricks/spark/python/pyspark/serializers.py", line 695, in loads
    return pickle.loads(obj, encoding=encoding)
  File "/databricks/python/lib/python3.7/enum.py", line 152, in __new__
    enum_members = {k: classdict[k] for k in classdict._member_names}
AttributeError: 'dict' object has no attribute '_member_names'
{noformat}
 

 

I suspect the problem is in `python/pyspark/cloudpickle.py`. On line 586, in the 
function `_save_dynamic_enum`, the attribute `_member_names` is removed from 
the enum. However, the `Enum` machinery requires this attribute when the class 
is rebuilt (see `enum.py`, line 152, in the stack trace above), so unpickling an 
Enum subclass crashes.
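
The failure mode can be reproduced without Spark. The following is only a minimal sketch of my understanding, not the actual cloudpickle code: the helper names `rebuild_with_plain_dict` and `rebuild_with_prepared_namespace` are hypothetical. It assumes that the enum metaclass expects the special namespace returned by `EnumMeta.__prepare__` rather than an ordinary `dict`, which is consistent with the `AttributeError` in the stack trace.

```python
from enum import Enum, EnumMeta


def rebuild_with_plain_dict(name, members):
    # Hypothetical helper: passing an ordinary dict bypasses
    # EnumMeta.__prepare__, so the namespace has no _member_names.
    return EnumMeta(name, (Enum,), dict(members))


def rebuild_with_prepared_namespace(name, members):
    # Hypothetical helper: __prepare__ returns the enum-specific
    # namespace that records member names as they are assigned.
    classdict = EnumMeta.__prepare__(name, (Enum,))
    for member_name, member_value in members.items():
        classdict[member_name] = member_value
    return EnumMeta.__new__(EnumMeta, name, (Enum,), classdict)


members = {"NORTH": 0, "SOUTH": 1}

# Rebuilding from a plain dict is expected to fail like the executor did.
try:
    rebuild_with_plain_dict("Direction", members)
    plain_dict_error = None
except AttributeError as exc:
    plain_dict_error = exc

# Rebuilding through the prepared namespace yields a usable enum.
Direction = rebuild_with_prepared_namespace("Direction", members)
```

If this reading is right, a fix would presumably need to rebuild the enum through the prepared namespace (or keep `_member_names` available) instead of handing the metaclass a plain dictionary. I have not verified this against the cloudpickle source.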

 


