[ https://issues.apache.org/jira/browse/SPARK-45896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bruce Robbins updated SPARK-45896:
----------------------------------
    Summary: Expression encoding fails for Seq/Map of Option[Seq/Date/Timestamp/BigDecimal]  (was: Expression encoding fails for Seq/Map of Option[Seq])

> Expression encoding fails for Seq/Map of Option[Seq/Date/Timestamp/BigDecimal]
> ------------------------------------------------------------------------------
>
>                 Key: SPARK-45896
>                 URL: https://issues.apache.org/jira/browse/SPARK-45896
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.4.1, 3.5.0
>            Reporter: Bruce Robbins
>            Priority: Major
>
> The following action fails on 3.4.1, 3.5.0, and master:
> {noformat}
> scala> val df = Seq(Seq(Some(Seq(0)))).toDF("a")
> val df = Seq(Seq(Some(Seq(0)))).toDF("a")
> org.apache.spark.SparkRuntimeException: [EXPRESSION_ENCODING_FAILED] Failed 
> to encode a value of the expressions: mapobjects(lambdavariable(MapObject, 
> ObjectType(class java.lang.Object), true, -1), 
> mapobjects(lambdavariable(MapObject, ObjectType(class java.lang.Object), 
> true, -2), assertnotnull(validateexternaltype(lambdavariable(MapObject, 
> ObjectType(class java.lang.Object), true, -2), IntegerType, IntegerType)), 
> unwrapoption(ObjectType(interface scala.collection.immutable.Seq), 
> validateexternaltype(lambdavariable(MapObject, ObjectType(class 
> java.lang.Object), true, -1), ArrayType(IntegerType,false), ObjectType(class 
> scala.Option))), None), input[0, scala.collection.immutable.Seq, true], None) 
> AS value#0 to a row. SQLSTATE: 42846
> ...
> Caused by: java.lang.RuntimeException: scala.Some is not a valid external 
> type for schema of array<int>
>   at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.MapObjects_0$(Unknown Source)
> ...
> {noformat}
> However, it succeeds on 3.3.3:
> {noformat}
> scala> val df = Seq(Seq(Some(Seq(0)))).toDF("a")
> df: org.apache.spark.sql.DataFrame = [a: array<array<int>>]
> scala> df.collect
> res0: Array[org.apache.spark.sql.Row] = Array([WrappedArray(WrappedArray(0))])
> {noformat}
> Map of Option[Seq] also fails on 3.4.1, 3.5.0, and master:
> {noformat}
> scala> val df = Seq(Map(0 -> Some(Seq(0)))).toDF("a")
> val df = Seq(Map(0 -> Some(Seq(0)))).toDF("a")
> org.apache.spark.SparkRuntimeException: [EXPRESSION_ENCODING_FAILED] Failed 
> to encode a value of the expressions: 
> externalmaptocatalyst(lambdavariable(ExternalMapToCatalyst_key, 
> ObjectType(class java.lang.Object), false, -1), 
> assertnotnull(validateexternaltype(lambdavariable(ExternalMapToCatalyst_key, 
> ObjectType(class java.lang.Object), false, -1), IntegerType, IntegerType)), 
> lambdavariable(ExternalMapToCatalyst_value, ObjectType(class 
> java.lang.Object), true, -2), mapobjects(lambdavariable(MapObject, 
> ObjectType(class java.lang.Object), true, -3), 
> assertnotnull(validateexternaltype(lambdavariable(MapObject, ObjectType(class 
> java.lang.Object), true, -3), IntegerType, IntegerType)), 
> unwrapoption(ObjectType(interface scala.collection.immutable.Seq), 
> validateexternaltype(lambdavariable(ExternalMapToCatalyst_value, 
> ObjectType(class java.lang.Object), true, -2), ArrayType(IntegerType,false), 
> ObjectType(class scala.Option))), None), input[0, 
> scala.collection.immutable.Map, true]) AS value#0 to a row. SQLSTATE: 42846
> ...
> Caused by: java.lang.RuntimeException: scala.Some is not a valid external 
> type for schema of array<int>
>   at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.MapObjects_0$(Unknown Source)
> ...
> {noformat}
> As with the first example, this succeeds on 3.3.3:
> {noformat}
> scala> val df = Seq(Map(0 -> Some(Seq(0)))).toDF("a")
> df: org.apache.spark.sql.DataFrame = [a: map<int,array<int>>]
> scala> df.collect
> res0: Array[org.apache.spark.sql.Row] = Array([Map(0 -> WrappedArray(0))])
> {noformat}


