Philip Kahn created SPARK-50538:
-----------------------------------

             Summary: Operations on a dataset with void/nulltype fail
                 Key: SPARK-50538
                 URL: https://issues.apache.org/jira/browse/SPARK-50538
             Project: Spark
          Issue Type: Bug
          Components: PySpark
    Affects Versions: 3.5.0
         Environment: {color:#11171c}Driver: r5d.8xlarge · Workers: r5d.8xlarge 
· 1-4 workers · On-demand and Spot · fall back to On-demand · DBR: 15.4 LTS 
(includes Apache Spark 3.5.0, Scala 2.12) · auto{color}
            Reporter: Philip Kahn


If a column has data type `void` / `NullType()`, basic operations fail. Such 
columns should behave as columns of entirely `null` data with a bottom type, 
implicitly or explicitly castable to any other type in the event of an 
insert/union/etc.

 

Operations such as select, union, or save raise `[INTERNAL_ERROR]` reporting 
that the column could not be found, and DataFrame methods such as `isEmpty()` 
fail with errors similar to:

 

{{Py4JJavaError{color:#555555}: An error occurred while calling o2303.isEmpty. 
: org.apache.spark.SparkException: [INTERNAL_ERROR] Couldn't find 
COLUMNNAME#REFNUMBER in 
[...,_databricks_internal_edge_computed_column_skip_row#17633] SQLSTATE: 
XX000{color}}}
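A minimal sketch of the situation (hypothetical, not taken from the reporter's job): `lit(None)` produces a `void` / `NullType()` column, and the explicit cast shown at the end is the kind of workaround the report implies should be unnecessary, since a bottom-typed null column ought to coerce implicitly.

{{{}python
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import NullType

spark = SparkSession.builder.master("local[1]").appName("void-repro").getOrCreate()

# A column built from lit(None) gets the void / NullType() data type.
df = spark.range(2).withColumn("v", F.lit(None))
assert isinstance(df.schema["v"].dataType, NullType)

# In the reporter's environment (DBR 15.4 / Spark 3.5.0), basic operations
# over such a column, e.g. df.select("v").isEmpty(), a union, or a save,
# reportedly raise [INTERNAL_ERROR] "Couldn't find <col>#<ref>".

# Workaround sketch: cast the void column to a concrete type first.
df_ok = df.withColumn("v", F.col("v").cast("string"))
df_ok.select("v").isEmpty()  # succeeds once the column has a concrete type
{}}}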



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
