Philip Kahn created SPARK-50538:
-----------------------------------
Summary: Operations on a dataset with void/nulltype fail
Key: SPARK-50538
URL: https://issues.apache.org/jira/browse/SPARK-50538
Project: Spark
Issue Type: Bug
Components: PySpark
Affects Versions: 3.5.0
Environment: Driver: r5d.8xlarge · Workers: r5d.8xlarge
· 1-4 workers · On-demand and Spot · fall back to On-demand · DBR: 15.4 LTS
(includes Apache Spark 3.5.0, Scala 2.12) · auto
Reporter: Philip Kahn
If a column has data type `void` / `NullType()`, basic operations fail.
Such a column should behave as a column of entirely `null` data of a bottom
type, implicitly or explicitly castable to any other type in the event of an
insert/union/etc.
Operations such as select, union, or save raise `[INTERNAL_ERROR]` reporting
that the column was not found, and DataFrame methods like `isEmpty()` fail
with errors similar to:
{{Py4JJavaError: An error occurred while calling o2303.isEmpty.
: org.apache.spark.SparkException: [INTERNAL_ERROR] Couldn't find
COLUMNNAME#REFNUMBER in
[...,_databricks_internal_edge_computed_column_skip_row#17633] SQLSTATE:
XX000}}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)