ulysses-you commented on PR #37284: URL: https://github.com/apache/spark/pull/37284#issuecomment-1195169526
The issue is not at the physical level. It happens because we remove the local sort incorrectly: the logical global limit claims to preserve its child's output ordering. An example:

```sql
create table t1 (c1 int) using parquet;

-- make two files
set spark.sql.leafNodeDefaultParallelism=1;
insert into table t1 values(2),(4),(3);
insert into table t1 values(6),(2),(5);

-- make two map partitions
set spark.sql.files.openCostInBytes=128MB;

-- returns 2 3 4 2
select * from (select * from t1 sort by c1 limit 4) sort by c1;

-- returns 2 2 3 4
set spark.sql.optimizer.excludedRules=org.apache.spark.sql.catalyst.optimizer.EliminateSorts;
select * from (select * from t1 sort by c1 limit 4) sort by c1;
```
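To make the ordering argument concrete, here is a minimal pure-Python simulation (not Spark code) of the repro above: `sort by` sorts each partition locally, and a global limit then takes rows partition by partition, so its output is not globally ordered even though each input partition was. That is why eliminating the outer sort is incorrect.

```python
# Two map partitions, matching the SQL repro above.
partitions = [[2, 4, 3], [6, 2, 5]]

# Inner query: SORT BY c1 -> local, per-partition sort.
locally_sorted = [sorted(p) for p in partitions]

# LIMIT 4 -> a global limit takes rows partition by partition.
limited = [row for p in locally_sorted for row in p][:4]
print(limited)          # [2, 3, 4, 2] -- the buggy plan's output (outer sort removed)

# The outer SORT BY is therefore still required.
print(sorted(limited))  # [2, 2, 3, 4] -- the correct result
```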
