[ https://issues.apache.org/jira/browse/SPARK-35653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Takeshi Yamamuro updated SPARK-35653:
-------------------------------------
    Affects Version/s: 3.2.0
                       3.0.2

> [SQL] CatalystToExternalMap interpreted path fails for Map with case classes as keys or values
> ----------------------------------------------------------------------------------------------
>
>                 Key: SPARK-35653
>                 URL: https://issues.apache.org/jira/browse/SPARK-35653
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.0.2, 3.1.2, 3.2.0
>            Reporter: Emil Ejbyfeldt
>            Priority: Major
>
> Deserialization on the interpreted path fails for a Map with case classes as
> keys or values, while the codegen path works correctly.
> To reproduce the issue, add a test case to ExpressionEncoderSuite, for example:
> {noformat}
> case class IntAndString(i: Int, s: String)
> encodeDecodeTest(Map(1 -> IntAndString(1, "a")), "map with case class as value")
> {noformat}
> The test succeeds on the codegen path, while the interpreted path fails with:
> {noformat}
> [info] - encode/decode for map with case class as value: Map(1 -> IntAndString(1,a)) (interpreted path) *** FAILED *** (64 milliseconds)
> [info] Encoded/Decoded data does not match input data
> [info]
> [info] in: Map(1 -> IntAndString(1,a))
> [info] out: Map(1 -> [1,a])
> [info] types: scala.collection.immutable.Map$Map1
> [info]
> [info] Encoded Data: [org.apache.spark.sql.catalyst.expressions.UnsafeMapData@5ecf5d9e]
> [info] Schema: value#823
> [info] root
> [info] -- value: map (nullable = true)
> [info]     |-- key: integer
> [info]     |-- value: struct (valueContainsNull = true)
> [info]     |    |-- i: integer (nullable = false)
> [info]     |    |-- s: string (nullable = true)
> [info]
> [info]
> [info] fromRow Expressions:
> [info] catalysttoexternalmap(lambdavariable(CatalystToExternalMap_key, IntegerType, false, 178), lambdavariable(CatalystToExternalMap_key, IntegerType, false, 178), lambdavariable(CatalystToExternalMap_value, StructField(i,IntegerType,false), StructField(s,StringType,true), true, 179), if (isnull(lambdavariable(CatalystToExternalMap_value, StructField(i,IntegerType,false), StructField(s,StringType,true), true, 179))) null else newInstance(class org.apache.spark.sql.catalyst.encoders.IntAndString), input[0, map<int,struct<i:int,s:string>>, true], interface scala.collection.immutable.Map
> [info] :- lambdavariable(CatalystToExternalMap_key, IntegerType, false, 178)
> [info] :- lambdavariable(CatalystToExternalMap_key, IntegerType, false, 178)
> [info] :- lambdavariable(CatalystToExternalMap_value, StructField(i,IntegerType,false), StructField(s,StringType,true), true, 179)
> [info] :- if (isnull(lambdavariable(CatalystToExternalMap_value, StructField(i,IntegerType,false), StructField(s,StringType,true), true, 179))) null else newInstance(class org.apache.spark.sql.catalyst.encoders.IntAndString)
> [info] :  :- isnull(lambdavariable(CatalystToExternalMap_value, StructField(i,IntegerType,false), StructField(s,StringType,true), true, 179))
> [info] :  :  +- lambdavariable(CatalystToExternalMap_value, StructField(i,IntegerType,false), StructField(s,StringType,true), true, 179)
> [info] :  :- null
> [info] :  +- newInstance(class org.apache.spark.sql.catalyst.encoders.IntAndString)
> [info] :     :- assertnotnull(lambdavariable(CatalystToExternalMap_value, StructField(i,IntegerType,false), StructField(s,StringType,true), true, 179).i)
> [info] :     :  +- lambdavariable(CatalystToExternalMap_value, StructField(i,IntegerType,false), StructField(s,StringType,true), true, 179).i
> [info] :     :     +- lambdavariable(CatalystToExternalMap_value, StructField(i,IntegerType,false), StructField(s,StringType,true), true, 179)
> [info] :     +- lambdavariable(CatalystToExternalMap_value, StructField(i,IntegerType,false), StructField(s,StringType,true), true, 179).s.toString
> [info] :        +- lambdavariable(CatalystToExternalMap_value, StructField(i,IntegerType,false), StructField(s,StringType,true), true, 179).s
> [info] :           +- lambdavariable(CatalystToExternalMap_value, StructField(i,IntegerType,false), StructField(s,StringType,true), true, 179)
> [info] +- input[0, map<int,struct<i:int,s:string>>, true] (ExpressionEncoderSuite.scala:627)
> {noformat}
> The output shows that the map value comes back as a row ([1,a]) rather than as
> an IntAndString instance, so the value was not correctly deserialized on the
> interpreted path.
> I have prepared a PR to fix this issue and will submit it.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)