Emil Ejbyfeldt created SPARK-35653:
--------------------------------------

             Summary: [SQL] CatalystToExternalMap interpreted path fails for 
Map with case classes as keys or values
                 Key: SPARK-35653
                 URL: https://issues.apache.org/jira/browse/SPARK-35653
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 3.1.2
            Reporter: Emil Ejbyfeldt


Deserialization on the interpreted path fails for a Map with case classes as 
keys or values, while the codegen path works correctly.

To reproduce the issue, add a test case to ExpressionEncoderSuite, for 
example:
{noformat}
case class IntAndString(i: Int, s: String)
encodeDecodeTest(Map(1 -> IntAndString(1, "a")), "map with case class as value")
{noformat}
The test succeeds on the codegen path but fails on the interpreted path with:
{noformat}
[info] - encode/decode for map with case class as value: Map(1 -> 
IntAndString(1,a)) (interpreted path) *** FAILED *** (64 milliseconds)
[info] Encoded/Decoded data does not match input data
[info]
[info] in: Map(1 -> IntAndString(1,a))
[info] out: Map(1 -> [1,a])
[info] types: scala.collection.immutable.Map$Map1
[info]
[info] Encoded Data: 
[org.apache.spark.sql.catalyst.expressions.UnsafeMapData@5ecf5d9e]
[info] Schema: value#823
[info] root
[info] -- value: map (nullable = true)
[info] |-- key: integer
[info] |-- value: struct (valueContainsNull = true)
[info] | |-- i: integer (nullable = false)
[info] | |-- s: string (nullable = true)
[info]
[info]
[info] fromRow Expressions:
[info] catalysttoexternalmap(lambdavariable(CatalystToExternalMap_key, 
IntegerType, false, 178), lambdavariable(CatalystToExternalMap_key, 
IntegerType, false, 178), lambdavariable(CatalystToExternalMap_value, 
StructField(i,IntegerType,false), StructField(s,StringType,true), true, 179), 
if (isnull(lambdavariable(CatalystToExternalMap_value, 
StructField(i,IntegerType,false), StructField(s,StringType,true), true, 179))) 
null else newInstance(class 
org.apache.spark.sql.catalyst.encoders.IntAndString), input[0, 
map<int,struct<i:int,s:string>>, true], interface scala.collection.immutable.Map
[info] :- lambdavariable(CatalystToExternalMap_key, IntegerType, false, 178)
[info] :- lambdavariable(CatalystToExternalMap_key, IntegerType, false, 178)
[info] :- lambdavariable(CatalystToExternalMap_value, 
StructField(i,IntegerType,false), StructField(s,StringType,true), true, 179)
[info] :- if (isnull(lambdavariable(CatalystToExternalMap_value, 
StructField(i,IntegerType,false), StructField(s,StringType,true), true, 179))) 
null else newInstance(class org.apache.spark.sql.catalyst.encoders.IntAndString)
[info] : :- isnull(lambdavariable(CatalystToExternalMap_value, 
StructField(i,IntegerType,false), StructField(s,StringType,true), true, 179))
[info] : : +- lambdavariable(CatalystToExternalMap_value, 
StructField(i,IntegerType,false), StructField(s,StringType,true), true, 179)
[info] : :- null
[info] : +- newInstance(class 
org.apache.spark.sql.catalyst.encoders.IntAndString)
[info] : :- assertnotnull(lambdavariable(CatalystToExternalMap_value, 
StructField(i,IntegerType,false), StructField(s,StringType,true), true, 179).i)
[info] : : +- lambdavariable(CatalystToExternalMap_value, 
StructField(i,IntegerType,false), StructField(s,StringType,true), true, 179).i
[info] : : +- lambdavariable(CatalystToExternalMap_value, 
StructField(i,IntegerType,false), StructField(s,StringType,true), true, 179)
[info] : +- lambdavariable(CatalystToExternalMap_value, 
StructField(i,IntegerType,false), StructField(s,StringType,true), true, 
179).s.toString
[info] : +- lambdavariable(CatalystToExternalMap_value, 
StructField(i,IntegerType,false), StructField(s,StringType,true), true, 179).s
[info] : +- lambdavariable(CatalystToExternalMap_value, 
StructField(i,IntegerType,false), StructField(s,StringType,true), true, 179)
[info] +- input[0, map<int,struct<i:int,s:string>>, true] 
(ExpressionEncoderSuite.scala:627)
{noformat}
So on the interpreted path the value was not correctly deserialized: it came 
back as a raw struct ([1,a]) instead of an IntAndString instance.
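
The round trip exercised by the test above can also be sketched outside the suite. This is a minimal sketch under stated assumptions, not the suite's code: the object name and the roundTrip helper are hypothetical, it assumes the createSerializer/createDeserializer methods of ExpressionEncoder in Spark 3.1, and it assumes that setting spark.sql.codegen.factoryMode to NO_CODEGEN, together with the spark.testing system property, is enough to select the interpreted path.

```scala
import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder
import org.apache.spark.sql.internal.SQLConf

case class IntAndString(i: Int, s: String)

// Hypothetical standalone harness; the name is illustrative only.
object MapRoundTripSketch {

  // Encode one map to an internal row and decode it again under the given
  // factory mode ("CODEGEN_ONLY" or "NO_CODEGEN").
  def roundTrip(mode: String): Map[Int, IntAndString] = {
    SQLConf.get.setConfString(SQLConf.CODEGEN_FACTORY_MODE.key, mode)
    val encoder = ExpressionEncoder[Map[Int, IntAndString]]().resolveAndBind()
    val toRow = encoder.createSerializer()
    val fromRow = encoder.createDeserializer()
    fromRow(toRow(Map(1 -> IntAndString(1, "a"))))
  }

  def main(args: Array[String]): Unit = {
    // Assumption: the factory mode may only be honored when Spark believes
    // it is running under test, so the flag is set defensively here.
    sys.props("spark.testing") = "true"

    val input = Map(1 -> IntAndString(1, "a"))
    Seq("CODEGEN_ONLY", "NO_CODEGEN").foreach { mode =>
      val out = roundTrip(mode)
      println(s"$mode: out = $out, matches input = ${out == input}")
    }
  }
}
```

Going by the test output above, the NO_CODEGEN run on an affected version would be expected to report a value of [1,a] rather than IntAndString(1,a) and a failed comparison, while the CODEGEN_ONLY run should match the input.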

I have prepared a PR to fix this issue and will submit it.


