zml1206 commented on PR #9872:
URL: https://github.com/apache/incubator-gluten/pull/9872#issuecomment-2948622389
@FelixYBW Adding output ordering can reduce unnecessary sorting. For example:
```
// Write four small Parquet tables and register them as temp views.
spark.range(10).write.parquet("tmp/t1")
spark.range(10).write.parquet("tmp/t2")
spark.range(100).write.parquet("tmp/t3")
spark.range(10).write.parquet("tmp/t4")
spark.read.parquet("tmp/t1").createOrReplaceTempView("t1")
spark.read.parquet("tmp/t2").createOrReplaceTempView("t2")
spark.read.parquet("tmp/t3").createOrReplaceTempView("t3")
spark.read.parquet("tmp/t4").createOrReplaceTempView("t4")
// Do not force shuffled hash joins, so the MERGE hints produce sort-merge joins.
sql("set spark.gluten.sql.columnar.forceShuffledHashJoin=false")
// Sort-merge join on id, then a non-equi join with t3, then another sort-merge join on id.
val query = """
|select /*+ MERGE(tt2) */ tt2.* from
|  (select tt1.* from
|    (select /*+ MERGE(t1) */ t1.* from
|      t1 left join t2 on t1.id = t2.id) tt1
|    left join t3 on tt1.id > t3.id) tt2
|  left join t4 on tt2.id = t4.id
|""".stripMargin
sql(query).collect()
```
Before:
```
VeloxColumnarToRowExec (48)
+- ^ ProjectExecTransformer (46)
   +- ^ SortMergeJoinExecTransformer LeftOuter (45)
      :- ^ SortExecTransformer (33)
      :  +- ^ ProjectExecTransformer (32)
      :     +- ^ VeloxBroadcastNestedLoopJoinExecTransformer LeftOuter (31)
      :        :- ^ ProjectExecTransformer (23)
      :        :  +- ^ SortMergeJoinExecTransformer LeftOuter (22)
      :        :     :- ^ SortExecTransformer (10)
      :        :     :  +- ^ InputIteratorTransformer (9)
      :        :     :     +- RowToVeloxColumnar (7)
      :        :     :        +- AQEShuffleRead (6)
      :        :     :           +- ShuffleQueryStage (5), Statistics(sizeInBytes=160.0 B, rowCount=10)
      :        :     :              +- Exchange (4)
      :        :     :                 +- VeloxColumnarToRowExec (3)
      :        :     :                    +- ^ Scan parquet (1)
      :        :     +- ^ SortExecTransformer (21)
      :        :        +- ^ InputIteratorTransformer (20)
      :        :           +- RowToVeloxColumnar (18)
      :        :              +- AQEShuffleRead (17)
      :        :                 +- ShuffleQueryStage (16), Statistics(sizeInBytes=160.0 B, rowCount=10)
      :        :                    +- Exchange (15)
      :        :                       +- VeloxColumnarToRowExec (14)
      :        :                          +- ^ FilterExecTransformer (12)
      :        :                             +- ^ Scan parquet (11)
      :        +- ^ InputIteratorTransformer (30)
      :           +- BroadcastQueryStage (28), Statistics(sizeInBytes=844.0 B, rowCount=100)
      :              +- ColumnarBroadcastExchange (27)
      :                 +- ^ FilterExecTransformer (25)
      :                    +- ^ Scan parquet (24)
      +- ^ SortExecTransformer (44)
         +- ^ InputIteratorTransformer (43)
            +- RowToVeloxColumnar (41)
               +- AQEShuffleRead (40)
                  +- ShuffleQueryStage (39), Statistics(sizeInBytes=160.0 B, rowCount=10)
                     +- Exchange (38)
                        +- VeloxColumnarToRowExec (37)
                           +- ^ FilterExecTransformer (35)
                              +- ^ Scan parquet (34)
```
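After (the `SortExecTransformer (33)` on the left side of the outer sort-merge join is no longer planned):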
```
VeloxColumnarToRowExec (47)
+- ^ ProjectExecTransformer (45)
   +- ^ SortMergeJoinExecTransformer LeftOuter (44)
      :- ^ ProjectExecTransformer (32)
      :  +- ^ VeloxBroadcastNestedLoopJoinExecTransformer LeftOuter (31)
      :     :- ^ ProjectExecTransformer (23)
      :     :  +- ^ SortMergeJoinExecTransformer LeftOuter (22)
      :     :     :- ^ SortExecTransformer (10)
      :     :     :  +- ^ InputIteratorTransformer (9)
      :     :     :     +- RowToVeloxColumnar (7)
      :     :     :        +- AQEShuffleRead (6)
      :     :     :           +- ShuffleQueryStage (5), Statistics(sizeInBytes=160.0 B, rowCount=10)
      :     :     :              +- Exchange (4)
      :     :     :                 +- VeloxColumnarToRowExec (3)
      :     :     :                    +- ^ Scan parquet (1)
      :     :     +- ^ SortExecTransformer (21)
      :     :        +- ^ InputIteratorTransformer (20)
      :     :           +- RowToVeloxColumnar (18)
      :     :              +- AQEShuffleRead (17)
      :     :                 +- ShuffleQueryStage (16), Statistics(sizeInBytes=160.0 B, rowCount=10)
      :     :                    +- Exchange (15)
      :     :                       +- VeloxColumnarToRowExec (14)
      :     :                          +- ^ FilterExecTransformer (12)
      :     :                             +- ^ Scan parquet (11)
      :     +- ^ InputIteratorTransformer (30)
      :        +- BroadcastQueryStage (28), Statistics(sizeInBytes=844.0 B, rowCount=100)
      :           +- ColumnarBroadcastExchange (27)
      :              +- ^ FilterExecTransformer (25)
      :                 +- ^ Scan parquet (24)
      +- ^ SortExecTransformer (43)
         +- ^ InputIteratorTransformer (42)
            +- RowToVeloxColumnar (40)
               +- AQEShuffleRead (39)
                  +- ShuffleQueryStage (38), Statistics(sizeInBytes=160.0 B, rowCount=10)
                     +- Exchange (37)
                        +- VeloxColumnarToRowExec (36)
                           +- ^ FilterExecTransformer (34)
                              +- ^ Scan parquet (33)
```
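The reason the extra sort can be dropped can be illustrated with a toy model. This is only a sketch in plain Scala, not Spark or Gluten code: `SortOrder`, `Plan`, `PassThrough`, and `EnsureOrdering` are hypothetical names, and the prefix check is a simplified stand-in for how a planner decides whether a child's reported ordering already satisfies what a sort-merge join requires.

```scala
// Toy model only: none of these types are Spark or Gluten classes.
final case class SortOrder(column: String, ascending: Boolean = true)

sealed trait Plan {
  def outputOrdering: Seq[SortOrder]
}

// An unordered source.
final case class Scan(name: String) extends Plan {
  val outputOrdering: Seq[SortOrder] = Nil
}

// An explicit sort reports the ordering it produces.
final case class Sort(order: Seq[SortOrder], child: Plan) extends Plan {
  val outputOrdering: Seq[SortOrder] = order
}

// An operator that keeps row order; it either reports its child's ordering or hides it.
final case class PassThrough(child: Plan, reportsOrdering: Boolean) extends Plan {
  val outputOrdering: Seq[SortOrder] =
    if (reportsOrdering) child.outputOrdering else Nil
}

object EnsureOrdering {
  // Simplified rule: the requirement is met when it is a prefix of the actual ordering.
  def satisfies(actual: Seq[SortOrder], required: Seq[SortOrder]): Boolean =
    actual.startsWith(required)

  // Add a Sort only when the child's reported ordering is not sufficient.
  def ensure(child: Plan, required: Seq[SortOrder]): Plan =
    if (satisfies(child.outputOrdering, required)) child else Sort(required, child)
}

object Demo {
  val byId: Seq[SortOrder] = Seq(SortOrder("id"))
  val sorted: Plan = Sort(byId, Scan("t1"))

  // Ordering hidden: the planner has to sort again.
  val hidden: Plan =
    EnsureOrdering.ensure(PassThrough(sorted, reportsOrdering = false), byId)

  // Ordering reported: the existing order is reused and no Sort is added.
  val reported: Plan =
    EnsureOrdering.ensure(PassThrough(sorted, reportsOrdering = true), byId)

  def main(args: Array[String]): Unit = {
    println(s"sort added when ordering is hidden:   ${hidden.isInstanceOf[Sort]}")
    println(s"sort added when ordering is reported: ${reported.isInstanceOf[Sort]}")
  }
}
```

In this model the `hidden` case corresponds to the "Before" plan, where a `SortExecTransformer` is placed above an input that is already ordered on `id`, and the `reported` case corresponds to the "After" plan, where that sort is absent.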