zml1206 opened a new issue, #6688: URL: https://github.com/apache/incubator-gluten/issues/6688
### Description

Spark's collect is a `TypedImperativeAggregate`, so it can be planned as `ObjectHashAggregate`, but the Velox collect is a `DeclarativeAggregate`. `CollectRewriteRule` is a logical plan rule that replaces the Spark collect with the Velox collect. As a result, a collect aggregate that vanilla Spark plans as `ObjectHashAggregate` is planned as `SortAggregate` in Gluten. `ObjectHashAggregate` performs much better than `SortAggregate`, so if the aggregate falls back to vanilla Spark, the rewrite causes a performance regression.

Proposal: add a config, `spark.gluten.sql.columnar.backend.velox.objectHashAggregate.collect.rewrite.enabled`, that can disable the collect rewrite for aggregates that would otherwise use `ObjectHashAggregate`.
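
To illustrate why the rewrite changes the physical operator, here is a simplified Python model of how an aggregate operator might be chosen. It is a sketch, not Spark's or Gluten's code: `AggFunction`, `choose_aggregate_exec`, the boolean fields, and the function names passed in are all hypothetical. The selection order (hash aggregate when every buffer is fixed-width and mutable, object hash aggregate when a `TypedImperativeAggregate` is present, sort aggregate otherwise) is an assumption based on vanilla Spark's planner behavior, as is the claim that the Velox collect uses an array buffer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AggFunction:
    """Hypothetical stand-in for a Spark aggregate function."""
    name: str
    typed_imperative: bool  # True for TypedImperativeAggregate implementations
    mutable_buffer: bool    # True if the buffer holds only fixed-width mutable types


def choose_aggregate_exec(funcs, object_hash_enabled=True):
    """Simplified model (an assumption) of the planner's operator choice."""
    if all(f.mutable_buffer for f in funcs):
        return "HashAggregate"
    if object_hash_enabled and any(f.typed_imperative for f in funcs):
        return "ObjectHashAggregate"
    return "SortAggregate"


# Vanilla Spark collect: typed imperative, buffer is an arbitrary object.
spark_collect = AggFunction("collect_list", typed_imperative=True, mutable_buffer=False)
# Rewritten Velox collect: declarative, buffer assumed to be an array.
velox_collect = AggFunction("velox_collect_list", typed_imperative=False, mutable_buffer=False)

before_rewrite = choose_aggregate_exec([spark_collect])
after_rewrite = choose_aggregate_exec([velox_collect])
print(before_rewrite, "->", after_rewrite)
```

In this model the operator moves from `ObjectHashAggregate` to `SortAggregate` once the collect is rewritten, which is the regression described above for the fallback case. With the proposed config, the rule would skip the rewrite so the original operator is kept.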
