zhztheplayer commented on issue #6630: URL: https://github.com/apache/incubator-gluten/issues/6630#issuecomment-2275112442
I set up a case closer to the reported one and still could not reproduce the issue.

```sh
# Generate partitioned data:
tools/gluten-it/sbin/gluten-it.sh data-gen-only --local-cluster --auto-cluster-resource -s=100.0 --gen-partitioned-data

# Open a Spark shell on the generated data:
tools/gluten-it/sbin/gluten-it.sh spark-shell --local-cluster --auto-cluster-resource -s=100.0 --data-gen=skip
```

In the opened Spark shell, run:

```scala
spark.sql("set spark.sql.adaptive.coalescePartitions.minPartitionSize=500m").show() // force AQEShuffleReadExec
spark.sql("set spark.sql.autoBroadcastJoinThreshold=-1").show() // disable broadcast hash join (BHJ)
val df = spark.sql("select * from (select distinct l_orderkey,l_partkey from lineitem) a inner join (select l_orderkey from lineitem limit 10) b on a.l_orderkey = b.l_orderkey limit 10") // build the query
df.collect() // execute
df.explain() // explain
```

The explained plan looks fine: [screenshot]

In the debugger, AQEShuffleReadExec has the correct outputPartitioning: [screenshot]
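
For reference, the same check could probably be done without a debugger by walking the executed plan in the Spark shell. This is only a minimal sketch and not what was run above. It assumes Spark 3.2 or later (where the node is named `AQEShuffleReadExec`), that `df` is the DataFrame from the reproduction steps, and that `df.collect()` has already run so the adaptive plan is final. `PlanHelper` is a hypothetical name introduced here for illustration.

```scala
import org.apache.spark.sql.execution.adaptive.{AQEShuffleReadExec, AdaptiveSparkPlanHelper}

// Hypothetical helper object: exposes the trait's traversal methods, which
// also descend into AQE query stages that a plain SparkPlan.collect skips.
object PlanHelper extends AdaptiveSparkPlanHelper

// Collect every AQEShuffleReadExec node in the executed plan of `df`.
val shuffleReads = PlanHelper.collect(df.queryExecution.executedPlan) {
  case read: AQEShuffleReadExec => read
}

// Print each node together with the partitioning it reports.
shuffleReads.foreach { read =>
  println(s"${read.nodeName}: ${read.outputPartitioning}")
}
```

If the sequence comes back empty, the plan most likely contains no `AQEShuffleReadExec` at all, in which case the `minPartitionSize` setting did not trigger partition coalescing for this query.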
