zhztheplayer commented on issue #8828: URL: https://github.com/apache/incubator-gluten/issues/8828#issuecomment-2685156847
> For the same key, native may put it in partition_i, but vanilla puts it in partition_j, where i!=j.

My suggestion would be to have the exchange's hashing algorithm follow Spark's. That's murmur3 hash, isn't it? cc @marin-ma

> With AQE enabled, we only know one side, and don't know whether the next node after the shuffle is a join operator.

I totally agree that this kind of “cross-stage” optimization becomes even more difficult when AQE is on. In my experience, it is better to keep our columnar rules from altering the original query plan node's output ordering, partitioning, and similar properties. As long as we can make sure a columnar operator produces exactly the same result as the vanilla one, everything in query optimization gets simpler, because we don't have to consider co-optimization among query plan nodes.
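To illustrate the partition mismatch discussed above, here is a minimal sketch in Python. It assumes that vanilla Spark's hash partitioning assigns a row to `pmod(murmur3(key, seed=42), numPartitions)`, and it only covers 32-bit integer keys. The names `murmur3_hash_int`, `murmur3_partition`, and `naive_partition` are hypothetical helpers written for this example, not Gluten or Spark APIs, and the modulo-only partitioner merely stands in for "some native hash that differs from Spark's".

```python
MASK32 = 0xFFFFFFFF


def _rotl32(x, r):
    """Rotate a 32-bit unsigned value left by r bits."""
    return ((x << r) | (x >> (32 - r))) & MASK32


def _to_signed32(x):
    """Reinterpret a 32-bit unsigned value as a signed JVM-style int."""
    return x - (1 << 32) if x & 0x80000000 else x


def murmur3_hash_int(value, seed=42):
    """Murmur3 x86_32 over one 32-bit int (seed 42 is an assumption here)."""
    k1 = ((value & MASK32) * 0xCC9E2D51) & MASK32
    k1 = _rotl32(k1, 15)
    k1 = (k1 * 0x1B873593) & MASK32

    h1 = (seed & MASK32) ^ k1
    h1 = _rotl32(h1, 13)
    h1 = (h1 * 5 + 0xE6546B64) & MASK32

    # Finalization mix; 4 is the input length in bytes.
    h1 ^= 4
    h1 ^= h1 >> 16
    h1 = (h1 * 0x85EBCA6B) & MASK32
    h1 ^= h1 >> 13
    h1 = (h1 * 0xC2B2AE35) & MASK32
    h1 ^= h1 >> 16
    return _to_signed32(h1)


def murmur3_partition(key, num_partitions):
    """Assumed vanilla behaviour: non-negative modulo of the murmur3 hash."""
    return murmur3_hash_int(key) % num_partitions


def naive_partition(key, num_partitions):
    """Hypothetical stand-in for a native exchange using a different hash."""
    return key % num_partitions


# Keys that the two schemes route to different partitions (i != j).
mismatched = [
    k for k in range(1000)
    if murmur3_partition(k, 8) != naive_partition(k, 8)
]
print(f"{len(mismatched)} of 1000 keys land in different partitions")
```

Python's `%` already returns a non-negative result for a positive divisor, so it plays the role of `pmod` here. Any key in `mismatched` would break a co-partitioned join if one side were shuffled by the vanilla exchange and the other by the stand-in partitioner, which is why matching the hashing algorithm avoids the problem without changing the plan.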
