dongjoon-hyun opened a new pull request #29118: URL: https://github.com/apache/spark/pull/29118
### What changes were proposed in this pull request?

This PR adds a test case to `EliminateSortsSuite` to protect a valid use case: an inner `ORDER BY` in a query that uses `DISTRIBUTE BY`.

### Why are the changes needed?

With the inner `ORDER BY` in place, the written data files are under 1 KB each:

```
scala> scala.util.Random.shuffle((1 to 100000).map(x => (x % 2, x))).toDF("a", "b").repartition(2).createOrReplaceTempView("t")

scala> sql("select * from (select * from t order by b) distribute by a").write.orc("/tmp/master")

$ ls -al /tmp/master/
total 56
drwxr-xr-x  10 dongjoon  wheel  320 Jul 14 22:12 ./
drwxrwxrwt  15 root      wheel  480 Jul 14 22:12 ../
-rw-r--r--   1 dongjoon  wheel    8 Jul 14 22:12 ._SUCCESS.crc
-rw-r--r--   1 dongjoon  wheel   12 Jul 14 22:12 .part-00000-2cd3a50e-eded-49a4-b7cf-94e3f090b8c1-c000.snappy.orc.crc
-rw-r--r--   1 dongjoon  wheel   16 Jul 14 22:12 .part-00043-2cd3a50e-eded-49a4-b7cf-94e3f090b8c1-c000.snappy.orc.crc
-rw-r--r--   1 dongjoon  wheel   16 Jul 14 22:12 .part-00191-2cd3a50e-eded-49a4-b7cf-94e3f090b8c1-c000.snappy.orc.crc
-rw-r--r--   1 dongjoon  wheel    0 Jul 14 22:12 _SUCCESS
-rw-r--r--   1 dongjoon  wheel  119 Jul 14 22:12 part-00000-2cd3a50e-eded-49a4-b7cf-94e3f090b8c1-c000.snappy.orc
-rw-r--r--   1 dongjoon  wheel  932 Jul 14 22:12 part-00043-2cd3a50e-eded-49a4-b7cf-94e3f090b8c1-c000.snappy.orc
-rw-r--r--   1 dongjoon  wheel  939 Jul 14 22:12 part-00191-2cd3a50e-eded-49a4-b7cf-94e3f090b8c1-c000.snappy.orc
```

If we remove the inner `ORDER BY`, the file size increases.
```
scala> scala.util.Random.shuffle((1 to 100000).map(x => (x % 2, x))).toDF("a", "b").repartition(2).createOrReplaceTempView("t")

scala> sql("select * from (select * from t order by b) distribute by a").write.orc("/tmp/SPARK-32276")

$ ls -al /tmp/SPARK-32276/
total 632
drwxr-xr-x  10 dongjoon  wheel     320 Jul 14 22:08 ./
drwxrwxrwt  14 root      wheel     448 Jul 14 22:08 ../
-rw-r--r--   1 dongjoon  wheel       8 Jul 14 22:08 ._SUCCESS.crc
-rw-r--r--   1 dongjoon  wheel      12 Jul 14 22:08 .part-00000-ba5049f9-b835-49b7-9fdb-bdd11b9891cb-c000.snappy.orc.crc
-rw-r--r--   1 dongjoon  wheel    1188 Jul 14 22:08 .part-00043-ba5049f9-b835-49b7-9fdb-bdd11b9891cb-c000.snappy.orc.crc
-rw-r--r--   1 dongjoon  wheel    1188 Jul 14 22:08 .part-00191-ba5049f9-b835-49b7-9fdb-bdd11b9891cb-c000.snappy.orc.crc
-rw-r--r--   1 dongjoon  wheel       0 Jul 14 22:08 _SUCCESS
-rw-r--r--   1 dongjoon  wheel     119 Jul 14 22:08 part-00000-ba5049f9-b835-49b7-9fdb-bdd11b9891cb-c000.snappy.orc
-rw-r--r--   1 dongjoon  wheel  150735 Jul 14 22:08 part-00043-ba5049f9-b835-49b7-9fdb-bdd11b9891cb-c000.snappy.orc
-rw-r--r--   1 dongjoon  wheel  150741 Jul 14 22:08 part-00191-ba5049f9-b835-49b7-9fdb-bdd11b9891cb-c000.snappy.orc
```

### Does this PR introduce _any_ user-facing change?

No. This only improves the test coverage.

### How was this patch tested?

Pass the GitHub Action or Jenkins.
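
As a side note on the file-size difference shown in the two `ls` listings, here is a minimal sketch of why ordered input tends to encode more compactly. It is a toy model, not ORC's actual encoder: the object `SortedRunsSketch` and the helper `deltaRunCount` are hypothetical names introduced only for illustration. The sketch counts runs of equal consecutive differences, on the assumption that a delta or run-length style integer encoding needs roughly one entry per run.

```scala
import scala.util.Random

// Toy model only (hypothetical names): counts runs of equal deltas to
// suggest how a delta/run-length integer encoder benefits from sorted input.
// This is NOT ORC's actual encoder.
object SortedRunsSketch {
  /** Number of maximal runs of equal consecutive differences in `values`. */
  def deltaRunCount(values: Seq[Int]): Int = {
    val deltas = values.zip(values.drop(1)).map { case (x, y) => y - x }
    if (deltas.isEmpty) 0
    else 1 + deltas.zip(deltas.drop(1)).count { case (d1, d2) => d1 != d2 }
  }

  def main(args: Array[String]): Unit = {
    // Column `b` for the rows with a = 1 in the example: odd numbers in 1..100000.
    val sorted = (1 to 100000).filter(_ % 2 == 1).toVector
    // Fixed seed so repeated runs of this sketch see the same shuffle.
    val shuffled = new Random(42).shuffle(sorted)

    println(s"sorted runs:   ${deltaRunCount(sorted)}")
    println(s"shuffled runs: ${deltaRunCount(shuffled)}")
  }
}
```

For the fully sorted column every difference is 2, so the toy model reports a single run, while the shuffled column breaks into many short runs. The data written by the real query is only partly ordered after the shuffle, so this is an intuition aid rather than a prediction of the byte counts above.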