[ 
https://issues.apache.org/jira/browse/SPARK-35653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro updated SPARK-35653:
-------------------------------------
    Affects Version/s: 3.2.0
                       3.0.2

> [SQL] CatalystToExternalMap interpreted path fails for Map with case classes 
> as keys or values
> ----------------------------------------------------------------------------------------------
>
>                 Key: SPARK-35653
>                 URL: https://issues.apache.org/jira/browse/SPARK-35653
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.0.2, 3.1.2, 3.2.0
>            Reporter: Emil Ejbyfeldt
>            Priority: Major
>
> Deserialization on the interpreted path fails for a Map with case classes as 
> keys or values, while the codegen path works correctly.
> To reproduce the issue, add test cases to ExpressionEncoderSuite. 
> For example, add the following:
> {noformat}
> case class IntAndString(i: Int, s: String)
> encodeDecodeTest(Map(1 -> IntAndString(1, "a")), "map with case class as value")
> {noformat}
> The test succeeds on the codegen path, while the interpreted path fails 
> with:
> {noformat}
> [info] - encode/decode for map with case class as value: Map(1 -> 
> IntAndString(1,a)) (interpreted path) *** FAILED *** (64 milliseconds)
> [info] Encoded/Decoded data does not match input data
> [info]
> [info] in: Map(1 -> IntAndString(1,a))
> [info] out: Map(1 -> [1,a])
> [info] types: scala.collection.immutable.Map$Map1 [info]
> [info] Encoded Data: 
> [org.apache.spark.sql.catalyst.expressions.UnsafeMapData@5ecf5d9e]
> [info] Schema: value#823
> [info] root
> [info] -- value: map (nullable = true)
> [info] |-- key: integer
> [info] |-- value: struct (valueContainsNull = true)
> [info] | |-- i: integer (nullable = false)
> [info] | |-- s: string (nullable = true)
> [info]
> [info]
> [info] fromRow Expressions:
> [info] catalysttoexternalmap(lambdavariable(CatalystToExternalMap_key, 
> IntegerType, false, 178), lambdavariable(CatalystToExternalMap_key, 
> IntegerType, false, 178), lambdavariable(CatalystToExternalMap_value, 
> StructField(i,IntegerType,false), StructField(s,StringType,true), true, 179), 
> if (isnull(lambdavariable(CatalystToExternalMap_value, 
> StructField(i,IntegerType,false), StructField(s,StringType,true), true, 
> 179))) null else newInstance(class 
> org.apache.spark.sql.catalyst.encoders.IntAndString), input[0, 
> map<int,struct<i:int,s:string>>, true], interface 
> scala.collection.immutable.Map
> [info] :- lambdavariable(CatalystToExternalMap_key, IntegerType, false, 178)
> [info] :- lambdavariable(CatalystToExternalMap_key, IntegerType, false, 178)
> [info] :- lambdavariable(CatalystToExternalMap_value, 
> StructField(i,IntegerType,false), StructField(s,StringType,true), true, 179)
> [info] :- if (isnull(lambdavariable(CatalystToExternalMap_value, 
> StructField(i,IntegerType,false), StructField(s,StringType,true), true, 
> 179))) null else newInstance(class 
> org.apache.spark.sql.catalyst.encoders.IntAndString)
> [info] : :- isnull(lambdavariable(CatalystToExternalMap_value, 
> StructField(i,IntegerType,false), StructField(s,StringType,true), true, 179))
> [info] : : +- lambdavariable(CatalystToExternalMap_value, 
> StructField(i,IntegerType,false), StructField(s,StringType,true), true, 179)
> [info] : :- null
> [info] : +- newInstance(class 
> org.apache.spark.sql.catalyst.encoders.IntAndString)
> [info] : :- assertnotnull(lambdavariable(CatalystToExternalMap_value, 
> StructField(i,IntegerType,false), StructField(s,StringType,true), true, 
> 179).i)
> [info] : : +- lambdavariable(CatalystToExternalMap_value, 
> StructField(i,IntegerType,false), StructField(s,StringType,true), true, 179).i
> [info] : : +- lambdavariable(CatalystToExternalMap_value, 
> StructField(i,IntegerType,false), StructField(s,StringType,true), true, 179)
> [info] : +- lambdavariable(CatalystToExternalMap_value, 
> StructField(i,IntegerType,false), StructField(s,StringType,true), true, 
> 179).s.toString
> [info] : +- lambdavariable(CatalystToExternalMap_value, 
> StructField(i,IntegerType,false), StructField(s,StringType,true), true, 179).s
> [info] : +- lambdavariable(CatalystToExternalMap_value, 
> StructField(i,IntegerType,false), StructField(s,StringType,true), true, 179)
> [info] +- input[0, map<int,struct<i:int,s:string>>, true] 
> (ExpressionEncoderSuite.scala:627)
> {noformat}
> So the value was not correctly deserialized in the interpreted path: the 
> output contains [1,a] where IntAndString(1,a) was expected.
> I have prepared a PR that I will submit to fix this issue.
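
To make the symptom in the failing output concrete, here is a toy model in plain Scala. It is only a sketch and does not claim to show the root cause or Spark's implementation: GenericRow, toIntAndString, decodeConverted and decodeUnconverted are hypothetical names invented for illustration, and none of them are Spark APIs. It contrasts a map decode that converts each value back to the case class with one that leaves the value in row form, which is the shape seen in "out: Map(1 -> [1,a])".

```scala
// Toy model only: none of these names are Spark APIs.
final case class IntAndString(i: Int, s: String)

// Hypothetical stand-in for a generic row: an ordered list of field values.
final case class GenericRow(fields: Vector[Any]) {
  override def toString: String = fields.mkString("[", ",", "]")
}

object MapDecodeSketch {
  // Hypothetical per-value converter: rebuilds the case class from a row.
  def toIntAndString(row: GenericRow): IntAndString =
    IntAndString(row.fields(0).asInstanceOf[Int], row.fields(1).asInstanceOf[String])

  // Expected behaviour: every value is passed through the converter.
  def decodeConverted(in: Map[Int, GenericRow]): Map[Int, Any] =
    in.map { case (k, v) => k -> toIntAndString(v) }

  // Behaviour matching the reported symptom: values stay in row form.
  def decodeUnconverted(in: Map[Int, GenericRow]): Map[Int, Any] =
    in.map { case (k, v) => k -> v }

  def main(args: Array[String]): Unit = {
    val encoded  = Map(1 -> GenericRow(Vector(1, "a")))
    val expected = Map(1 -> IntAndString(1, "a"))

    println(decodeConverted(encoded))   // Map(1 -> IntAndString(1,a))
    println(decodeUnconverted(encoded)) // Map(1 -> [1,a])

    // Only the converted result round-trips to the original input.
    assert(decodeConverted(encoded) == expected)
    assert(decodeUnconverted(encoded) != expected)
  }
}
```

The second result compares unequal to the input map, which is the same kind of mismatch that "Encoded/Decoded data does not match input data" reports for the interpreted path.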


