viirya opened a new issue, #925: URL: https://github.com/apache/datafusion-comet/issues/925
### Describe the bug

In #924, we found that Spark sometimes produces exchange partitioning where the partitioning expression cannot be resolved correctly. For example:

```
+- TransformWithState value#667.toString, newInstance(class org.apache.spark.sql.streaming.InputMapRow), [value#667], [key#659, action#660, value#661], org.apache.spark.sql.streaming.TestMapStateProcessor@58fc42f6, NoTime, Append, class[value[0]: string], obj#671: scala.Tuple3, state info [ checkpoint = , runId = 9af20b3e-feb8-4ccd-a9f0-b3ed1517330a, opId = 0, ver = 0, numPartitions = 5], 1725862230745, false, false, [value#667], [key#659, action#660, value#661], value#667.toString
   :- Sort [value#667 ASC NULLS FIRST], false, 0
   :  +- Exchange hashpartitioning(value#667, 5), ENSURE_REQUIREMENTS, [plan_id=1124]
   :     +- AppendColumns org.apache.spark.sql.streaming.TransformWithMapStateSuite$$Lambda$2590/0x000000f801e1c3d0@488fe08d, newInstance(class org.apache.spark.sql.streaming.InputMapRow), [staticinvoke(class org.apache.spark.unsafe.types.UTF8String, StringType, fromString, input[0, java.lang.String, true], true, false, true) AS value#667]
   :        +- LocalTableScan [key#659, action#660, value#661]
   +- !Sort [value#667 ASC NULLS FIRST], false, 0
      +- !Exchange hashpartitioning(value#667, 5), ENSURE_REQUIREMENTS, [plan_id=1125]
         +- LocalTableScan <empty>, [value#672]
```

In the second branch (the nodes prefixed with `!`), the `Exchange` partitions on `value#667`, but its child `LocalTableScan` outputs only `value#672`. This causes a resolution error in Comet when it tries to translate the partitioning expressions:

```
[info] - transformWithMapState - batch should succeed (without changelog checkpointing) *** FAILED *** (23 milliseconds)
[info]   org.apache.spark.SparkException: [INTERNAL_ERROR] Couldn't find value#667 in [value#672] SQLSTATE: XX000
[info]   at org.apache.spark.SparkException$.internalError(SparkException.scala:92)
[info]   at org.apache.spark.SparkException$.internalError(SparkException.scala:96)
[info]   at org.apache.spark.sql.catalyst.expressions.BindReferences$$anonfun$bindReference$1.applyOrElse(BoundAttribute.scala:81)
[info]   at org.apache.spark.sql.catalyst.expressions.BindReferences$$anonfun$bindReference$1.applyOrElse(BoundAttribute.scala:74)
[info]   at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:458)
[info]   at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(origin.scala:84)
[info]   at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:458)
[info]   at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:434)
[info]   at org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:402)
[info]   at org.apache.spark.sql.catalyst.expressions.BindReferences$.bindReference(BoundAttribute.scala:74)
[info]   at org.apache.comet.serde.QueryPlanSerde$.exprToProtoInternal$1(QueryPlanSerde.scala:1714)
[info]   at org.apache.comet.serde.QueryPlanSerde$.exprToProto(QueryPlanSerde.scala:2565)
[info]   at org.apache.comet.serde.QueryPlanSerde$.$anonfun$supportPartitioning$1(QueryPlanSerde.scala:3184)
```

### Steps to reproduce

_No response_

### Expected behavior

_No response_

### Additional context

_No response_
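
To illustrate why the lookup fails, here is a minimal toy model of binding an attribute reference to a child's output by expression ID. It is a sketch only: `Attribute` and `bind_reference` are hypothetical names written in plain Python, not Spark's or Comet's actual code. It also assumes that the `AppendColumns` output is the scan's columns followed by the appended `value#667`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    """Hypothetical stand-in for a plan attribute such as value#667."""
    name: str
    expr_id: int

    def __str__(self) -> str:
        return f"{self.name}#{self.expr_id}"


def bind_reference(attr: Attribute, child_output: list) -> int:
    """Return the ordinal of `attr` in `child_output`, matching on expr_id only."""
    for ordinal, candidate in enumerate(child_output):
        if candidate.expr_id == attr.expr_id:
            return ordinal
    rendered = ", ".join(str(a) for a in child_output)
    raise LookupError(f"Couldn't find {attr} in [{rendered}]")


# The partitioning expression used by both Exchange nodes in the plan.
partition_key = Attribute("value", 667)

# First branch: assumed output of AppendColumns (scan columns + appended column).
left_child_output = [
    Attribute("key", 659),
    Attribute("action", 660),
    Attribute("value", 661),
    Attribute("value", 667),
]

# Second branch: the empty LocalTableScan exposes a different expression ID.
right_child_output = [Attribute("value", 672)]

left_ordinal = bind_reference(partition_key, left_child_output)

try:
    bind_reference(partition_key, right_child_output)
    error_message = None
except LookupError as exc:
    error_message = str(exc)

print(left_ordinal)
print(error_message)
```

In this model the first branch binds successfully, while the second fails because the name `value` matches but the expression ID does not, which mirrors the shape of the `INTERNAL_ERROR` message in the stack trace.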