ulysses-you commented on PR #37284:
URL: https://github.com/apache/spark/pull/37284#issuecomment-1195169526

   The issue is not at the physical level. It is that we remove the local sort incorrectly, because the logical global limit preserves the child's output ordering.
   
   An example:
   ```sql
   create table t1 (c1 int) using parquet;
   
   -- make two files
   set spark.sql.leafNodeDefaultParallelism=1;
   
   insert into table t1 values(2),(4),(3);
   insert into table t1 values(6),(2),(5);
   
   -- make two map partitions
   set spark.sql.files.openCostInBytes=128MB;
   
   -- 2 3 4 2
   select * from (select * from t1 sort by c1 limit 4) sort by c1;
   
   -- 2 2 3 4
   set spark.sql.optimizer.excludedRules=org.apache.spark.sql.catalyst.optimizer.EliminateSorts;
   select * from (select * from t1 sort by c1 limit 4) sort by c1;
   
   ```
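   The SQL above can be simulated outside Spark to show why the outer sort cannot be eliminated. This is a plain-Python sketch (not Spark code): each partition is sorted locally, but a global limit just takes the first N rows across partitions, so its output is not globally ordered.

   ```python
   # Partition contents mirror the two inserts above.
   part1 = sorted([2, 4, 3])   # "sort by c1" within partition 1 -> [2, 3, 4]
   part2 = sorted([6, 2, 5])   # "sort by c1" within partition 2 -> [2, 5, 6]

   # A global limit takes the first 4 rows across the partition outputs.
   limit4 = (part1 + part2)[:4]
   print(limit4)            # [2, 3, 4, 2] -- the wrong result when the outer sort is removed

   # The outer "sort by c1" is still needed to get globally ordered rows.
   print(sorted(limit4))    # [2, 2, 3, 4] -- the correct result
   ```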


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

