Bruce Robbins created SPARK-45896:
-------------------------------------

             Summary: Expression encoding fails for Seq/Map of Option[Seq]
                 Key: SPARK-45896
                 URL: https://issues.apache.org/jira/browse/SPARK-45896
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 3.5.0, 3.4.1
            Reporter: Bruce Robbins

The following action fails on 3.4.1, 3.5.0, and master:
{noformat}
scala> val df = Seq(Seq(Some(Seq(0)))).toDF("a")
val df = Seq(Seq(Some(Seq(0)))).toDF("a")
org.apache.spark.SparkRuntimeException: [EXPRESSION_ENCODING_FAILED] Failed to encode a value of the expressions: mapobjects(lambdavariable(MapObject, ObjectType(class java.lang.Object), true, -1), mapobjects(lambdavariable(MapObject, ObjectType(class java.lang.Object), true, -2), assertnotnull(validateexternaltype(lambdavariable(MapObject, ObjectType(class java.lang.Object), true, -2), IntegerType, IntegerType)), unwrapoption(ObjectType(interface scala.collection.immutable.Seq), validateexternaltype(lambdavariable(MapObject, ObjectType(class java.lang.Object), true, -1), ArrayType(IntegerType,false), ObjectType(class scala.Option))), None), input[0, scala.collection.immutable.Seq, true], None) AS value#0 to a row. SQLSTATE: 42846
...
Caused by: java.lang.RuntimeException: scala.Some is not a valid external type for schema of array<int>
  at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.MapObjects_0$(Unknown Source)
...
{noformat}
However, it succeeds on 3.3.3:
{noformat}
scala> val df = Seq(Seq(Some(Seq(0)))).toDF("a")
df: org.apache.spark.sql.DataFrame = [a: array<array<int>>]

scala> df.collect
res0: Array[org.apache.spark.sql.Row] = Array([WrappedArray(WrappedArray(0))])
{noformat}
Map of option of sequence also fails on 3.4.1, 3.5.0, and master:
{noformat}
scala> val df = Seq(Map(0 -> Some(Seq(0)))).toDF("a")
val df = Seq(Map(0 -> Some(Seq(0)))).toDF("a")
org.apache.spark.SparkRuntimeException: [EXPRESSION_ENCODING_FAILED] Failed to encode a value of the expressions: externalmaptocatalyst(lambdavariable(ExternalMapToCatalyst_key, ObjectType(class java.lang.Object), false, -1), assertnotnull(validateexternaltype(lambdavariable(ExternalMapToCatalyst_key, ObjectType(class java.lang.Object), false, -1), IntegerType, IntegerType)), lambdavariable(ExternalMapToCatalyst_value, ObjectType(class java.lang.Object), true, -2), mapobjects(lambdavariable(MapObject, ObjectType(class java.lang.Object), true, -3), assertnotnull(validateexternaltype(lambdavariable(MapObject, ObjectType(class java.lang.Object), true, -3), IntegerType, IntegerType)), unwrapoption(ObjectType(interface scala.collection.immutable.Seq), validateexternaltype(lambdavariable(ExternalMapToCatalyst_value, ObjectType(class java.lang.Object), true, -2), ArrayType(IntegerType,false), ObjectType(class scala.Option))), None), input[0, scala.collection.immutable.Map, true]) AS value#0 to a row. SQLSTATE: 42846
...
Caused by: java.lang.RuntimeException: scala.Some is not a valid external type for schema of array<int>
  at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.MapObjects_0$(Unknown Source)
...
{noformat}
As with the first example, this succeeds on 3.3.3:
{noformat}
scala> val df = Seq(Map(0 -> Some(Seq(0)))).toDF("a")
df: org.apache.spark.sql.DataFrame = [a: map<int,array<int>>]

scala> df.collect
res0: Array[org.apache.spark.sql.Row] = Array([Map(0 -> WrappedArray(0))])
{noformat}
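
For checking a given Spark build outside spark-shell, the two snippets above could be wrapped in a small standalone program. This is only a sketch: the object name `Spark45896Repro`, the `local[1]` master, and the `cases` list are illustrative assumptions and not part of the report. Each case is wrapped in a thunk because, in the failing sessions above, the exception is raised by `toDF` itself rather than by `collect`.

```scala
import org.apache.spark.sql.SparkSession
import scala.util.{Failure, Success, Try}

// Hypothetical standalone wrapper around the two spark-shell snippets in this report.
object Spark45896Repro {
  def main(args: Array[String]): Unit = {
    // Assumption: a local single-threaded session is enough to exercise the encoder.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("SPARK-45896-repro")
      .getOrCreate()
    import spark.implicits._

    // Each case is a thunk so that an encoding error thrown by toDF is caught by Try.
    val cases: Seq[(String, () => Any)] = Seq(
      "Seq of Option[Seq]" -> (() => Seq(Seq(Some(Seq(0)))).toDF("a").collect()),
      "Map of Option[Seq]" -> (() => Seq(Map(0 -> Some(Seq(0)))).toDF("a").collect())
    )

    cases.foreach { case (name, run) =>
      Try(run()) match {
        case Success(_) => println(s"$name: encoded and collected")
        case Failure(e) => println(s"$name: failed with ${e.getClass.getName}: ${e.getMessage}")
      }
    }

    spark.stop()
  }
}
```

Running the same program against each Spark version of interest gives a side-by-side view of which of the two cases encode successfully on that build.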