Emil Ejbyfeldt created SPARK-35653:
--------------------------------------
Summary: [SQL] CatalystToExternalMap interpreted path fails for
Map with case classes as keys or values
Key: SPARK-35653
URL: https://issues.apache.org/jira/browse/SPARK-35653
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 3.1.2
Reporter: Emil Ejbyfeldt
Deserialization on the interpreted path fails for a Map with case classes as
keys or values, while the codegen path works correctly.
To reproduce the issue, add test cases to ExpressionEncoderSuite.
For example, add the following:
{noformat}
case class IntAndString(i: Int, s: String)
encodeDecodeTest(Map(1 -> IntAndString(1, "a")), "map with case class as value")
{noformat}
The test succeeds on the codegen path, while the interpreted path fails with:
{noformat}
[info] - encode/decode for map with case class as value: Map(1 ->
IntAndString(1,a)) (interpreted path) *** FAILED *** (64 milliseconds)
[info] Encoded/Decoded data does not match input data
[info]
[info] in: Map(1 -> IntAndString(1,a))
[info] out: Map(1 -> [1,a])
[info] types: scala.collection.immutable.Map$Map1
[info]
[info] Encoded Data:
[org.apache.spark.sql.catalyst.expressions.UnsafeMapData@5ecf5d9e]
[info] Schema: value#823
[info] root
[info]  |-- value: map (nullable = true)
[info]  |    |-- key: integer
[info]  |    |-- value: struct (valueContainsNull = true)
[info]  |    |    |-- i: integer (nullable = false)
[info]  |    |    |-- s: string (nullable = true)
[info]
[info]
[info] fromRow Expressions:
[info] catalysttoexternalmap(lambdavariable(CatalystToExternalMap_key,
IntegerType, false, 178), lambdavariable(CatalystToExternalMap_key,
IntegerType, false, 178), lambdavariable(CatalystToExternalMap_value,
StructField(i,IntegerType,false), StructField(s,StringType,true), true, 179),
if (isnull(lambdavariable(CatalystToExternalMap_value,
StructField(i,IntegerType,false), StructField(s,StringType,true), true, 179)))
null else newInstance(class
org.apache.spark.sql.catalyst.encoders.IntAndString), input[0,
map<int,struct<i:int,s:string>>, true], interface scala.collection.immutable.Map
[info] :- lambdavariable(CatalystToExternalMap_key, IntegerType, false, 178)
[info] :- lambdavariable(CatalystToExternalMap_key, IntegerType, false, 178)
[info] :- lambdavariable(CatalystToExternalMap_value,
StructField(i,IntegerType,false), StructField(s,StringType,true), true, 179)
[info] :- if (isnull(lambdavariable(CatalystToExternalMap_value,
StructField(i,IntegerType,false), StructField(s,StringType,true), true, 179)))
null else newInstance(class org.apache.spark.sql.catalyst.encoders.IntAndString)
[info] : :- isnull(lambdavariable(CatalystToExternalMap_value,
StructField(i,IntegerType,false), StructField(s,StringType,true), true, 179))
[info] : : +- lambdavariable(CatalystToExternalMap_value,
StructField(i,IntegerType,false), StructField(s,StringType,true), true, 179)
[info] : :- null
[info] : +- newInstance(class
org.apache.spark.sql.catalyst.encoders.IntAndString)
[info] : :- assertnotnull(lambdavariable(CatalystToExternalMap_value,
StructField(i,IntegerType,false), StructField(s,StringType,true), true, 179).i)
[info] : : +- lambdavariable(CatalystToExternalMap_value,
StructField(i,IntegerType,false), StructField(s,StringType,true), true, 179).i
[info] : : +- lambdavariable(CatalystToExternalMap_value,
StructField(i,IntegerType,false), StructField(s,StringType,true), true, 179)
[info] : +- lambdavariable(CatalystToExternalMap_value,
StructField(i,IntegerType,false), StructField(s,StringType,true), true,
179).s.toString
[info] : +- lambdavariable(CatalystToExternalMap_value,
StructField(i,IntegerType,false), StructField(s,StringType,true), true, 179).s
[info] : +- lambdavariable(CatalystToExternalMap_value,
StructField(i,IntegerType,false), StructField(s,StringType,true), true, 179)
[info] +- input[0, map<int,struct<i:int,s:string>>, true]
(ExpressionEncoderSuite.scala:627)
{noformat}
So on the interpreted path the value was not deserialized correctly: it came
back as the row [1,a] instead of IntAndString(1,a).
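To make the in/out mismatch above concrete, here is a minimal, Spark-free sketch of the symptom. It is only a model under stated assumptions, not Spark's implementation and not a statement about the root cause: GenericRowModel, toIntAndString, decodeWithConverter and decodeWithoutConverter are hypothetical names introduced for illustration. The converter plays the role that newInstance(...) has in the fromRow expression tree.

```scala
// Simplified model of the symptom; not Spark's actual code.
object MapDecodeSketch {
  // The case class from the report.
  final case class IntAndString(i: Int, s: String)

  // Hypothetical stand-in for a generic row; prints like "[1,a]".
  final case class GenericRowModel(fields: Seq[Any]) {
    override def toString: String = fields.mkString("[", ",", "]")
  }

  // Hypothetical per-value converter (the role of newInstance in the log).
  def toIntAndString(row: GenericRowModel): IntAndString =
    IntAndString(row.fields(0).asInstanceOf[Int], row.fields(1).asInstanceOf[String])

  // Expected behavior: every map value goes through the converter.
  def decodeWithConverter(raw: Map[Int, GenericRowModel]): Map[Int, IntAndString] =
    raw.map { case (k, v) => k -> toIntAndString(v) }

  // Observed symptom: values come back as generic rows.
  def decodeWithoutConverter(raw: Map[Int, GenericRowModel]): Map[Int, Any] =
    raw.map { case (k, v) => k -> (v: Any) }

  def main(args: Array[String]): Unit = {
    val raw = Map(1 -> GenericRowModel(Seq(1, "a")))
    println(decodeWithConverter(raw))    // Map(1 -> IntAndString(1,a))
    println(decodeWithoutConverter(raw)) // Map(1 -> [1,a])
  }
}
```

The second printed line matches the "out:" line of the failing test, which is why the comparison against the input map fails.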
I have prepared a PR that fixes this issue and will submit it.