JkSelf commented on issue #10103: URL: https://github.com/apache/incubator-gluten/issues/10103#issuecomment-3031305640
@wenfang6 Spark uses the `spark.sql.legacy.allowHashOnMapType` configuration to control whether hash expressions may be applied to map-type inputs. Gluten supports this by enabling the configuration when it creates the ColumnarShuffleExchange (https://github.com/apache/incubator-gluten/blob/main/backends-velox/src/main/scala/org/apache/gluten/backendsapi/velox/VeloxSparkPlanExecApi.scala#L355-L363), which bypasses Spark's unresolved checks.

However, your SQL plan contains two adjacent projects, as shown below:

```
p1.projectList:
  1. hash(features#2, geek_position#0) as hash_partition_key
  2. features#2
  3. geek_position#0
p2.projectList:
  1. features#2
  2. geek_position#0
```

Gluten applies the `CollapseProjectExecTransformer` rule to combine these two projects into a single `collapseProject` (https://github.com/apache/incubator-gluten/blob/main/gluten-substrait/src/main/scala/org/apache/gluten/extension/columnar/CollapseProjectExecTransformer.scala#L40-L41). Unfortunately, the `collapseProject` remains unresolved, because at that point the configuration is still set to false (https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/hash.scala#L291-L296).

Could you please test with the `spark.sql.legacy.allowHashOnMapType` configuration enabled?
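For reference, a minimal sketch of one way to run that test from Scala. The application name, the local master, and the sample query are assumptions made for illustration and are not taken from the issue; the Gluten plugin settings are omitted and would need to be added to reproduce the original plan.

```scala
import org.apache.spark.sql.SparkSession

object AllowHashOnMapTypeCheck {
  def main(args: Array[String]): Unit = {
    // Hypothetical local session; app name and master are illustrative only.
    val spark = SparkSession.builder()
      .appName("allow-hash-on-map-type-check")
      .master("local[*]")
      // The setting under discussion: allow hash expressions on MapType inputs.
      .config("spark.sql.legacy.allowHashOnMapType", "true")
      .getOrCreate()

    // Illustrative query that hashes a map-typed value. With the flag left
    // at its default (false), Spark rejects this during analysis.
    spark.sql("SELECT hash(map('k', 1)) AS hash_partition_key").show()

    spark.stop()
  }
}
```

The same setting can also be passed on the command line with `--conf spark.sql.legacy.allowHashOnMapType=true`, or set in a SQL session with `SET spark.sql.legacy.allowHashOnMapType=true;`.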
