APeng Zhang created SPARK-30408:
-----------------------------------

             Summary: orderBy in sortBy clause is removed by EliminateSorts
                 Key: SPARK-30408
                 URL: https://issues.apache.org/jira/browse/SPARK-30408
             Project: Spark
          Issue Type: Bug
          Components: Optimizer
    Affects Versions: 2.4.4, 2.4.3, 2.4.2, 2.4.1, 2.4.0
            Reporter: APeng Zhang

When an orderBy is followed by a sortBy (sortWithinPartitions), EliminateSorts removes the orderBy.

Code to reproduce:
{code:java}
val dataset = Seq(
  ("a", 1, 4),
  ("b", 2, 5),
  ("c", 3, 6)
).toDF("a", "b", "c")
val groupData = dataset.orderBy("b")
val sortData = groupData.sortWithinPartitions("c")
{code}

The content of groupData is:
{code:java}
partition 0: [a,1,4]
partition 1: [b,2,5]
partition 2: [c,3,6]
{code}

The content of sortData is:
{code:java}
partition 0: [a,1,4]
partition 1: [b,2,5], [c,3,6]
{code}

Unit test covering this defect, in EliminateSortsSuite.scala:
{code:java}
test("should not remove orderBy in sortBy clause") {
  val plan = testRelation.orderBy('a.asc).sortBy('b.desc)
  val optimized = Optimize.execute(plan.analyze)
  val correctAnswer = testRelation.orderBy('a.asc).sortBy('b.desc).analyze
  comparePlans(optimized, correctAnswer)
}
{code}

This test currently fails because EliminateSorts removes the orderBy from the optimized plan.
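
The report lists per-partition contents for groupData and sortData but does not show how they were obtained. The following is a minimal sketch of one way to inspect them, not necessarily what the reporter ran. The object name InspectPartitions, the helper show, the local[3] master and the application name are all assumptions made for illustration. The number of partitions and the row placement you observe depend on the Spark version and configuration, so no expected output is given.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical driver object, not part of the original report.
object InspectPartitions {

  // Hypothetical helper: collects every partition to the driver and prints
  // its rows. Only suitable for tiny datasets like the one in this report.
  def show(label: String, df: DataFrame): Unit = {
    println(s"$label:")
    // glom() turns each partition into a single array of rows.
    df.rdd.glom().collect().zipWithIndex.foreach { case (rows, i) =>
      println(s"  partition $i: ${rows.mkString(", ")}")
    }
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local session with three worker threads.
    val spark = SparkSession.builder()
      .master("local[3]")
      .appName("SPARK-30408-inspect")
      .getOrCreate()
    import spark.implicits._

    // Same data and operations as in the reproduction.
    val dataset = Seq(("a", 1, 4), ("b", 2, 5), ("c", 3, 6)).toDF("a", "b", "c")
    val groupData = dataset.orderBy("b")
    val sortData = groupData.sortWithinPartitions("c")

    show("groupData", groupData)
    show("sortData", sortData)

    // Prints the parsed, analyzed, optimized and physical plans, which shows
    // whether the global sort survives optimization.
    sortData.explain(true)

    spark.stop()
  }
}
```

Comparing the optimized plan printed by explain(true) with the analyzed plan is a direct way to confirm whether the global Sort node introduced by orderBy is still present.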