lgbo-ustc commented on issue #7087:
URL: https://github.com/apache/incubator-gluten/issues/7087#issuecomment-2325439641
The physical plan on Spark 3.3 is:
```
CHNativeColumnarToRow
+- TakeOrderedAndProjectExecTransformer (limit=100, orderBy=[rnk#5 ASC NULLS FIRST], output=[rnk#5,best_performing#11,worst_performing#12])
   +- ^(11) ProjectExecTransformer [rnk#5, i_product_name#66 AS best_performing#11, i_product_name#111 AS worst_performing#12]
      +- ^(11) CHBroadcastHashJoinExecTransformer [item_sk#6L], [i_item_sk#90L], Inner, BuildRight, false
         :- ^(11) ProjectExecTransformer [rnk#5, item_sk#6L, i_product_name#66]
         :  +- ^(11) CHBroadcastHashJoinExecTransformer [item_sk#1L], [i_item_sk#45L], Inner, BuildRight, false
         :     :- ^(11) ProjectExecTransformer [item_sk#1L, rnk#5, item_sk#6L]
         :     :  +- ^(11) CHSortMergeJoinExecTransformer [rnk#5], [rnk#10], Inner, false
         :     :     :- ^(11) SortExecTransformer [rnk#5 ASC NULLS FIRST], false, 0
         :     :     :  +- ^(11) ProjectExecTransformer [item_sk#1L, rnk#5]
         :     :     :     +- ^(11) FilterExecTransformer ((rnk#5 < 11) AND isnotnull(item_sk#1L))
         :     :     :        +- ^(11) WindowExecTransformer [rank(rank_col#2) windowspecdefinition(rank_col#2 ASC NULLS FIRST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS rnk#5], [rank_col#2 ASC NULLS FIRST]
         :     :     :           +- ^(11) SortExecTransformer [rank_col#2 ASC NULLS FIRST], false, 0
         :     :     :              +- ^(11) InputIteratorTransformer[item_sk#1L, rank_col#2]
         :     :     :                 +- ColumnarExchange SinglePartition, ENSURE_REQUIREMENTS, [plan_id=896], [shuffle_writer_type=hash], [OUTPUT] List(item_sk:LongType, rank_col:DecimalType(11,6))
         :     :     :                    +- ^(6) FilterExecTransformer (isnotnull(rank_col#2) AND (cast(rank_col#2 as decimal(13,7)) > CheckOverflow((promote_precision(cast(0.9 as decimal(11,6))) * promote_precision(Subquery scalar-subquery#4, [id=#222])), DecimalType(13,7))))
         :     :     :                       :  +- Subquery scalar-subquery#4, [id=#222]
         :     :     :                       :     +- CHNativeColumnarToRow
         :     :     :                       :        +- ^(2) ProjectExecTransformer [cast((avg(UnscaledValue(ss_net_profit#142))#116 / 100.0) as decimal(11,6)) AS rank_col#3]
         :     :     :                       :           +- ^(2) HashAggregateTransformer(keys=[ss_store_sk#128L], functions=[avg(UnscaledValue(ss_net_profit#142))], isStreamingAgg=false)
         :     :     :                       :              +- ^(2) InputIteratorTransformer[ss_store_sk#128L, sum#204, count#205L]
         :     :     :                       :                 +- ColumnarExchange hashpartitioning(ss_store_sk#128L, 5), ENSURE_REQUIREMENTS, [plan_id=215], [shuffle_writer_type=hash], [OUTPUT] ArrayBuffer(ss_store_sk:LongType, sum:DoubleType, count:LongType)
         :     :     :                       :                    +- ^(1) HashAggregateTransformer(keys=[ss_store_sk#128L], functions=[partial_avg(_pre_0#206L)], isStreamingAgg=false)
         :     :     :                       :                       +- ^(1) ProjectExecTransformer [ss_store_sk#128L, ss_net_profit#142, UnscaledValue(ss_net_profit#142) AS _pre_0#206L]
         :     :     :                       :                          +- ^(1) FilterExecTransformer ((isnotnull(ss_store_sk#128L) AND (ss_store_sk#128L = cast(2 as bigint))) AND isnull(ss_hdemo_sk#126L))
         :     :     :                       :                             +- ^(1) NativeFileScan parquet tpcds_pq100.store_sales[ss_hdemo_sk#126L,ss_store_sk#128L,ss_net_profit#142] Batched: true, DataFilters: [isnotnull(ss_store_sk#128L), isnull(ss_hdemo_sk#126L)], Format: Parquet, Location: InMemoryFileIndex(1 paths)[file:/data3/liangjiabiao/docker/local_gluten/spark-3.3.2-bin-hadoop3/s..., PartitionFilters: [], PushedFilters: [IsNotNull(ss_store_sk), IsNull(ss_hdemo_sk)], ReadSchema: struct<ss_hdemo_sk:bigint,ss_store_sk:bigint,ss_net_profit:decimal(7,2)>
         :     :     :                       +- ^(6) ProjectExecTransformer [ss_item_sk#22L AS item_sk#1L, cast((avg(UnscaledValue(ss_net_profit#44))#114 / 100.0) as decimal(11,6)) AS rank_col#2]
         :     :     :                          +- ^(6) HashAggregateTransformer(keys=[ss_item_sk#22L], functions=[avg(UnscaledValue(ss_net_profit#44))], isStreamingAgg=false)
         :     :     :                             +- ^(6) InputIteratorTransformer[ss_item_sk#22L, sum#196, count#197L]
         :     :     :                                +- ColumnarExchange hashpartitioning(ss_item_sk#22L, 5), ENSURE_REQUIREMENTS, [plan_id=889], [shuffle_writer_type=hash], [OUTPUT] ArrayBuffer(ss_item_sk:LongType, sum:DoubleType, count:LongType)
         :     :     :                                   +- ^(5) HashAggregateTransformer(keys=[ss_item_sk#22L], functions=[partial_avg(_pre_2#216L)], isStreamingAgg=false)
         :     :     :                                      +- ^(5) ProjectExecTransformer [ss_item_sk#22L, ss_net_profit#44, UnscaledValue(ss_net_profit#44) AS _pre_2#216L]
         :     :     :                                         +- ^(5) FilterExecTransformer (isnotnull(ss_store_sk#30L) AND (ss_store_sk#30L = cast(2 as bigint)))
         :     :     :                                            +- ^(5) NativeFileScan parquet tpcds_pq100.store_sales[ss_item_sk#22L,ss_store_sk#30L,ss_net_profit#44] Batched: true, DataFilters: [isnotnull(ss_store_sk#30L)], Format: Parquet, Location: InMemoryFileIndex(1 paths)[file:/data3/liangjiabiao/docker/local_gluten/spark-3.3.2-bin-hadoop3/s..., PartitionFilters: [], PushedFilters: [IsNotNull(ss_store_sk)], ReadSchema: struct<ss_item_sk:bigint,ss_store_sk:bigint,ss_net_profit:decimal(7,2)>
         :     :     +- ^(11) SortExecTransformer [rnk#10 ASC NULLS FIRST], false, 0
         :     :        +- ^(11) ProjectExecTransformer [item_sk#6L, rnk#10]
         :     :           +- ^(11) FilterExecTransformer ((rnk#10 < 11) AND isnotnull(item_sk#6L))
         :     :              +- ^(11) WindowExecTransformer [rank(rank_col#7) windowspecdefinition(rank_col#7 DESC NULLS LAST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS rnk#10], [rank_col#7 DESC NULLS LAST]
         :     :                 +- ^(11) SortExecTransformer [rank_col#7 DESC NULLS LAST], false, 0
         :     :                    +- ^(11) InputIteratorTransformer[item_sk#6L, rank_col#7]
         :     :                       +- ReusedExchange [item_sk#6L, rank_col#7], ColumnarExchange SinglePartition, ENSURE_REQUIREMENTS, [plan_id=896], [shuffle_writer_type=hash], [OUTPUT] List(item_sk:LongType, rank_col:DecimalType(11,6))
         :     +- ^(11) InputIteratorTransformer[i_item_sk#45L, i_product_name#66]
         :        +- ColumnarBroadcastExchange HashedRelationBroadcastMode(List(input[0, bigint, false]),false), [plan_id=923]
         :           +- ^(9) FilterExecTransformer isnotnull(i_item_sk#45L)
         :              +- ^(9) NativeFileScan parquet tpcds_pq100.item[i_item_sk#45L,i_product_name#66] Batched: true, DataFilters: [isnotnull(i_item_sk#45L)], Format: Parquet, Location: InMemoryFileIndex(1 paths)[file:/data3/liangjiabiao/docker/local_gluten/spark-3.3.2-bin-hadoop3/s..., PartitionFilters: [], PushedFilters: [IsNotNull(i_item_sk)], ReadSchema: struct<i_item_sk:bigint,i_product_name:string>
         +- ^(11) InputIteratorTransformer[i_item_sk#90L, i_product_name#111]
            +- ReusedExchange [i_item_sk#90L, i_product_name#111], ColumnarBroadcastExchange HashedRelationBroadcastMode(List(input[0, bigint, false]),false), [plan_id=923]
```
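For context, this plan has the shape of TPC-DS query 44 (top-10 best- and worst-performing items by average net profit for one store). A paraphrased sketch of the query, not the exact benchmark text; the table names and the `ss_store_sk = 2` literal are taken from the plan above:

```sql
-- Sketch of a TPC-DS q44-style query: rank items by avg(ss_net_profit) for
-- store 2, keep the top 10 ascending and descending, join both back to item.
SELECT asceding.rnk,
       i1.i_product_name AS best_performing,
       i2.i_product_name AS worst_performing
FROM (SELECT item_sk, RANK() OVER (ORDER BY rank_col ASC) AS rnk
      FROM (SELECT ss_item_sk AS item_sk, AVG(ss_net_profit) AS rank_col
            FROM tpcds_pq100.store_sales
            WHERE ss_store_sk = 2
            GROUP BY ss_item_sk
            HAVING AVG(ss_net_profit) > 0.9 * (SELECT AVG(ss_net_profit)
                                               FROM tpcds_pq100.store_sales
                                               WHERE ss_store_sk = 2
                                                 AND ss_hdemo_sk IS NULL
                                               GROUP BY ss_store_sk)) v1) asceding,
     (SELECT item_sk, RANK() OVER (ORDER BY rank_col DESC) AS rnk
      FROM (SELECT ss_item_sk AS item_sk, AVG(ss_net_profit) AS rank_col
            FROM tpcds_pq100.store_sales
            WHERE ss_store_sk = 2
            GROUP BY ss_item_sk
            HAVING AVG(ss_net_profit) > 0.9 * (SELECT AVG(ss_net_profit)
                                               FROM tpcds_pq100.store_sales
                                               WHERE ss_store_sk = 2
                                                 AND ss_hdemo_sk IS NULL
                                               GROUP BY ss_store_sk)) v2) descending,
     tpcds_pq100.item i1,
     tpcds_pq100.item i2
WHERE asceding.rnk < 11
  AND descending.rnk < 11
  AND asceding.rnk = descending.rnk
  AND i1.i_item_sk = asceding.item_sk
  AND i2.i_item_sk = descending.item_sk
ORDER BY asceding.rnk
LIMIT 100;
```

This reading matches the plan: the two `WindowExecTransformer` branches rank `rank_col` ascending and descending over the same aggregated input (hence the `ReusedExchange`), the scalar subquery supplies the `0.9 * avg` threshold, and the two broadcast hash joins resolve `i_product_name` for both sides.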