Re: [I] [VL] Result mismatch found in FlushableAgg [incubator-gluten]

via GitHub Thu, 08 Aug 2024 00:13:54 -0700


zhztheplayer commented on issue #6630:
URL: https://github.com/apache/incubator-gluten/issues/6630#issuecomment-2275112442


   I managed to set up a more similar case, but I still could not reproduce the issue.
   
   ```sh
   # Generate partitioned data:
   tools/gluten-it/sbin/gluten-it.sh data-gen-only --local-cluster \
     --auto-cluster-resource -s=100.0 --gen-partitioned-data

   # Open a Spark shell on the generated data:
   tools/gluten-it/sbin/gluten-it.sh spark-shell --local-cluster \
     --auto-cluster-resource -s=100.0 --data-gen=skip

   # In the opened Spark shell, run (Scala):
   spark.sql("set spark.sql.adaptive.coalescePartitions.minPartitionSize=500m").show() // force AQEShuffleReadExec
   spark.sql("set spark.sql.autoBroadcastJoinThreshold=-1").show() // disable bhj
   val df = spark.sql("select * from (select distinct l_orderkey,l_partkey from lineitem) a inner join (select l_orderkey from lineitem limit 10) b on a.l_orderkey = b.l_orderkey limit 10") // run query
   df.collect() // execute
   df.explain() // explain
   ```
   
   The explained plan looks fine:
   
   
![cbe10c4162ef01d4ca4868e387d04ff](https://github.com/user-attachments/assets/556b40fb-796b-4f15-8bf2-f8ef867a9292)
   
   In the debugger, AQEShuffleReadExec has the correct outputPartitioning:
   
   
![92407d1a561fc31dc960d84eee93028](https://github.com/user-attachments/assets/ab8b5646-5660-4c06-9914-994f471d2815)
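
   For anyone who wants to check the same thing without attaching a debugger, here is a rough sketch that could be pasted into the same Spark shell after `df.collect()`. It assumes Spark 3.2 or later (where the node is named `AQEShuffleReadExec`) and that Gluten leaves that node in the final plan. `PlanInspector` is a hypothetical helper name, not something from Gluten or Spark.

   ```scala
   import org.apache.spark.sql.execution.adaptive.{AQEShuffleReadExec, AdaptiveSparkPlanHelper}

   // Hypothetical helper object (name is illustrative). AdaptiveSparkPlanHelper
   // provides a collect that also descends into AQE query stages, which a plain
   // plan.collect would treat as leaves.
   object PlanInspector extends AdaptiveSparkPlanHelper

   // `df` is the DataFrame from the query above; run df.collect() first so the
   // adaptive plan has been finalized.
   val reads = PlanInspector.collect(df.queryExecution.executedPlan) {
     case r: AQEShuffleReadExec => r
   }

   // Print the partitioning that each shuffle read reports.
   reads.foreach(r => println(s"${r.nodeName}: ${r.outputPartitioning}"))
   ```

   If `reads` comes back empty, the partition coalescing setting probably did not take effect for this run, so the output says nothing about the mismatch either way.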
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

