comphead commented on issue #2667:
URL: https://github.com/apache/datafusion-comet/issues/2667#issuecomment-3481588937

   Spark Plan 
   
   ```
   == Physical Plan ==
   AdaptiveSparkPlan (55)
   +- Sort (54)
      +- Exchange (53)
         +- HashAggregate (52)
            +- Exchange (51)
               +- HashAggregate (50)
                  +- Project (49)
                     +- BroadcastHashJoin Inner BuildRight (48)
                        :- Project (44)
                        :  +- BroadcastHashJoin Inner BuildRight (43)
                        :     :- Project (38)
                        :     :  +- SortMergeJoin LeftAnti (37)
                        :     :     :- SortMergeJoin LeftAnti (26)
                        :     :     :  :- SortMergeJoin LeftSemi (15)
                        :     :     :  :  :- Sort (4)
                        :     :     :  :  :  +- Exchange (3)
                        :     :     :  :  :     +- Filter (2)
                        :     :     :  :  :        +- Scan parquet  (1)
                        :     :     :  :  +- Sort (14)
                        :     :     :  :     +- Exchange (13)
                        :     :     :  :        +- Project (12)
                        :     :     :  :           +- BroadcastHashJoin Inner BuildRight (11)
                        :     :     :  :              :- Filter (6)
                        :     :     :  :              :  +- Scan parquet  (5)
                        :     :     :  :              +- BroadcastExchange (10)
                        :     :     :  :                 +- Project (9)
                        :     :     :  :                    +- Filter (8)
                        :     :     :  :                       +- Scan parquet  (7)
                        :     :     :  +- Sort (25)
                        :     :     :     +- Exchange (24)
                        :     :     :        +- Project (23)
                        :     :     :           +- BroadcastHashJoin Inner BuildRight (22)
                        :     :     :              :- Filter (17)
                        :     :     :              :  +- Scan parquet  (16)
                        :     :     :              +- BroadcastExchange (21)
                        :     :     :                 +- Project (20)
                        :     :     :                    +- Filter (19)
                        :     :     :                       +- Scan parquet  (18)
                        :     :     +- Sort (36)
                        :     :        +- Exchange (35)
                        :     :           +- Project (34)
                        :     :              +- BroadcastHashJoin Inner BuildRight (33)
                        :     :                 :- Filter (28)
                        :     :                 :  +- Scan parquet  (27)
                        :     :                 +- BroadcastExchange (32)
                        :     :                    +- Project (31)
                        :     :                       +- Filter (30)
                        :     :                          +- Scan parquet  (29)
                        :     +- BroadcastExchange (42)
                        :        +- Project (41)
                        :           +- Filter (40)
                        :              +- Scan parquet  (39)
                        +- BroadcastExchange (47)
                           +- Filter (46)
                              +- Scan parquet  (45)
   
   
   (1) Scan parquet 
   Output [3]: [c_customer_sk#202L, c_current_cdemo_sk#204, 
c_current_addr_sk#206L]
   Batched: true
   Location: InMemoryFileIndex 
[file:/Users/ovoievodin/dev/prj/apple/ovoievodin/rust/datafusion-benchmarks/tpcds/data_parquet/customer.parquet]
   PushedFilters: [IsNotNull(c_current_addr_sk), IsNotNull(c_current_cdemo_sk)]
   ReadSchema: 
struct<c_customer_sk:bigint,c_current_cdemo_sk:double,c_current_addr_sk:bigint>
   
   (2) Filter
   Input [3]: [c_customer_sk#202L, c_current_cdemo_sk#204, 
c_current_addr_sk#206L]
   Condition : ((isnotnull(c_current_addr_sk#206L) AND 
isnotnull(c_current_cdemo_sk#204)) AND might_contain(Subquery subquery#925, 
[id=#222], xxhash64(c_current_addr_sk#206L, 42)))
   
   (3) Exchange
   Input [3]: [c_customer_sk#202L, c_current_cdemo_sk#204, 
c_current_addr_sk#206L]
   Arguments: hashpartitioning(knownfloatingpointnormalized(normalizenanandzero(cast(c_customer_sk#202L as double))), 200), ENSURE_REQUIREMENTS, [plan_id=259]
   
   (4) Sort
   Input [3]: [c_customer_sk#202L, c_current_cdemo_sk#204, 
c_current_addr_sk#206L]
   Arguments: 
[knownfloatingpointnormalized(normalizenanandzero(cast(c_customer_sk#202L as 
double))) ASC NULLS FIRST], false, 0
   
   (5) Scan parquet 
   Output [2]: [ss_sold_date_sk#580, ss_customer_sk#583]
   Batched: true
   Location: InMemoryFileIndex 
[file:/Users/ovoievodin/dev/prj/apple/ovoievodin/rust/datafusion-benchmarks/tpcds/data_parquet/store_sales.parquet]
   PushedFilters: [IsNotNull(ss_sold_date_sk)]
   ReadSchema: struct<ss_sold_date_sk:double,ss_customer_sk:double>
   
   (6) Filter
   Input [2]: [ss_sold_date_sk#580, ss_customer_sk#583]
   Condition : isnotnull(ss_sold_date_sk#580)
   
   (7) Scan parquet 
   Output [3]: [d_date_sk#282L, d_year#288L, d_moy#290L]
   Batched: true
   Location: InMemoryFileIndex 
[file:/Users/ovoievodin/dev/prj/apple/ovoievodin/rust/datafusion-benchmarks/tpcds/data_parquet/date_dim.parquet]
   PushedFilters: [IsNotNull(d_year), IsNotNull(d_moy), EqualTo(d_year,2002), 
GreaterThanOrEqual(d_moy,2), LessThanOrEqual(d_moy,4), IsNotNull(d_date_sk)]
   ReadSchema: struct<d_date_sk:bigint,d_year:bigint,d_moy:bigint>
   
   (8) Filter
   Input [3]: [d_date_sk#282L, d_year#288L, d_moy#290L]
   Condition : (((((isnotnull(d_year#288L) AND isnotnull(d_moy#290L)) AND 
(d_year#288L = 2002)) AND (d_moy#290L >= 2)) AND (d_moy#290L <= 4)) AND 
isnotnull(d_date_sk#282L))
   
   (9) Project
   Output [1]: [d_date_sk#282L]
   Input [3]: [d_date_sk#282L, d_year#288L, d_moy#290L]
   
   (10) BroadcastExchange
   Input [1]: [d_date_sk#282L]
   Arguments: HashedRelationBroadcastMode(List(knownfloatingpointnormalized(normalizenanandzero(cast(input[0, bigint, true] as double)))),false), [plan_id=254]
   
   (11) BroadcastHashJoin
   Left keys [1]: 
[knownfloatingpointnormalized(normalizenanandzero(ss_sold_date_sk#580))]
   Right keys [1]: 
[knownfloatingpointnormalized(normalizenanandzero(cast(d_date_sk#282L as 
double)))]
   Join type: Inner
   Join condition: None
   
   (12) Project
   Output [1]: [ss_customer_sk#583]
   Input [3]: [ss_sold_date_sk#580, ss_customer_sk#583, d_date_sk#282L]
   
   (13) Exchange
   Input [1]: [ss_customer_sk#583]
   Arguments: hashpartitioning(knownfloatingpointnormalized(normalizenanandzero(ss_customer_sk#583)), 200), ENSURE_REQUIREMENTS, [plan_id=260]
   
   (14) Sort
   Input [1]: [ss_customer_sk#583]
   Arguments: 
[knownfloatingpointnormalized(normalizenanandzero(ss_customer_sk#583)) ASC 
NULLS FIRST], false, 0
   
   (15) SortMergeJoin
   Left keys [1]: 
[knownfloatingpointnormalized(normalizenanandzero(cast(c_customer_sk#202L as 
double)))]
   Right keys [1]: 
[knownfloatingpointnormalized(normalizenanandzero(ss_customer_sk#583))]
   Join type: LeftSemi
   Join condition: None
   
   (16) Scan parquet 
   Output [2]: [ws_sold_date_sk#730, ws_bill_customer_sk#734]
   Batched: true
   Location: InMemoryFileIndex 
[file:/Users/ovoievodin/dev/prj/apple/ovoievodin/rust/datafusion-benchmarks/tpcds/data_parquet/web_sales.parquet]
   PushedFilters: [IsNotNull(ws_sold_date_sk)]
   ReadSchema: struct<ws_sold_date_sk:double,ws_bill_customer_sk:double>
   
   (17) Filter
   Input [2]: [ws_sold_date_sk#730, ws_bill_customer_sk#734]
   Condition : isnotnull(ws_sold_date_sk#730)
   
   (18) Scan parquet 
   Output [3]: [d_date_sk#859L, d_year#865L, d_moy#867L]
   Batched: true
   Location: InMemoryFileIndex 
[file:/Users/ovoievodin/dev/prj/apple/ovoievodin/rust/datafusion-benchmarks/tpcds/data_parquet/date_dim.parquet]
   PushedFilters: [IsNotNull(d_year), IsNotNull(d_moy), EqualTo(d_year,2002), 
GreaterThanOrEqual(d_moy,2), LessThanOrEqual(d_moy,4), IsNotNull(d_date_sk)]
   ReadSchema: struct<d_date_sk:bigint,d_year:bigint,d_moy:bigint>
   
   (19) Filter
   Input [3]: [d_date_sk#859L, d_year#865L, d_moy#867L]
   Condition : (((((isnotnull(d_year#865L) AND isnotnull(d_moy#867L)) AND 
(d_year#865L = 2002)) AND (d_moy#867L >= 2)) AND (d_moy#867L <= 4)) AND 
isnotnull(d_date_sk#859L))
   
   (20) Project
   Output [1]: [d_date_sk#859L]
   Input [3]: [d_date_sk#859L, d_year#865L, d_moy#867L]
   
   (21) BroadcastExchange
   Input [1]: [d_date_sk#859L]
   Arguments: HashedRelationBroadcastMode(List(knownfloatingpointnormalized(normalizenanandzero(cast(input[0, bigint, true] as double)))),false), [plan_id=264]
   
   (22) BroadcastHashJoin
   Left keys [1]: 
[knownfloatingpointnormalized(normalizenanandzero(ws_sold_date_sk#730))]
   Right keys [1]: 
[knownfloatingpointnormalized(normalizenanandzero(cast(d_date_sk#859L as 
double)))]
   Join type: Inner
   Join condition: None
   
   (23) Project
   Output [1]: [ws_bill_customer_sk#734]
   Input [3]: [ws_sold_date_sk#730, ws_bill_customer_sk#734, d_date_sk#859L]
   
   (24) Exchange
   Input [1]: [ws_bill_customer_sk#734]
   Arguments: hashpartitioning(knownfloatingpointnormalized(normalizenanandzero(ws_bill_customer_sk#734)), 200), ENSURE_REQUIREMENTS, [plan_id=269]
   
   (25) Sort
   Input [1]: [ws_bill_customer_sk#734]
   Arguments: 
[knownfloatingpointnormalized(normalizenanandzero(ws_bill_customer_sk#734)) ASC 
NULLS FIRST], false, 0
   
   (26) SortMergeJoin
   Left keys [1]: 
[knownfloatingpointnormalized(normalizenanandzero(cast(c_customer_sk#202L as 
double)))]
   Right keys [1]: 
[knownfloatingpointnormalized(normalizenanandzero(ws_bill_customer_sk#734))]
   Join type: LeftAnti
   Join condition: None
   
   (27) Scan parquet 
   Output [2]: [cs_sold_date_sk#134, cs_ship_customer_sk#141]
   Batched: true
   Location: InMemoryFileIndex 
[file:/Users/ovoievodin/dev/prj/apple/ovoievodin/rust/datafusion-benchmarks/tpcds/data_parquet/catalog_sales.parquet]
   PushedFilters: [IsNotNull(cs_sold_date_sk)]
   ReadSchema: struct<cs_sold_date_sk:double,cs_ship_customer_sk:double>
   
   (28) Filter
   Input [2]: [cs_sold_date_sk#134, cs_ship_customer_sk#141]
   Condition : isnotnull(cs_sold_date_sk#134)
   
   (29) Scan parquet 
   Output [3]: [d_date_sk#887L, d_year#893L, d_moy#895L]
   Batched: true
   Location: InMemoryFileIndex 
[file:/Users/ovoievodin/dev/prj/apple/ovoievodin/rust/datafusion-benchmarks/tpcds/data_parquet/date_dim.parquet]
   PushedFilters: [IsNotNull(d_year), IsNotNull(d_moy), EqualTo(d_year,2002), 
GreaterThanOrEqual(d_moy,2), LessThanOrEqual(d_moy,4), IsNotNull(d_date_sk)]
   ReadSchema: struct<d_date_sk:bigint,d_year:bigint,d_moy:bigint>
   
   (30) Filter
   Input [3]: [d_date_sk#887L, d_year#893L, d_moy#895L]
   Condition : (((((isnotnull(d_year#893L) AND isnotnull(d_moy#895L)) AND 
(d_year#893L = 2002)) AND (d_moy#895L >= 2)) AND (d_moy#895L <= 4)) AND 
isnotnull(d_date_sk#887L))
   
   (31) Project
   Output [1]: [d_date_sk#887L]
   Input [3]: [d_date_sk#887L, d_year#893L, d_moy#895L]
   
   (32) BroadcastExchange
   Input [1]: [d_date_sk#887L]
   Arguments: HashedRelationBroadcastMode(List(knownfloatingpointnormalized(normalizenanandzero(cast(input[0, bigint, true] as double)))),false), [plan_id=272]
   
   (33) BroadcastHashJoin
   Left keys [1]: 
[knownfloatingpointnormalized(normalizenanandzero(cs_sold_date_sk#134))]
   Right keys [1]: 
[knownfloatingpointnormalized(normalizenanandzero(cast(d_date_sk#887L as 
double)))]
   Join type: Inner
   Join condition: None
   
   (34) Project
   Output [1]: [cs_ship_customer_sk#141]
   Input [3]: [cs_sold_date_sk#134, cs_ship_customer_sk#141, d_date_sk#887L]
   
   (35) Exchange
   Input [1]: [cs_ship_customer_sk#141]
   Arguments: hashpartitioning(knownfloatingpointnormalized(normalizenanandzero(cs_ship_customer_sk#141)), 200), ENSURE_REQUIREMENTS, [plan_id=277]
   
   (36) Sort
   Input [1]: [cs_ship_customer_sk#141]
   Arguments: 
[knownfloatingpointnormalized(normalizenanandzero(cs_ship_customer_sk#141)) ASC 
NULLS FIRST], false, 0
   
   (37) SortMergeJoin
   Left keys [1]: 
[knownfloatingpointnormalized(normalizenanandzero(cast(c_customer_sk#202L as 
double)))]
   Right keys [1]: 
[knownfloatingpointnormalized(normalizenanandzero(cs_ship_customer_sk#141))]
   Join type: LeftAnti
   Join condition: None
   
   (38) Project
   Output [2]: [c_current_cdemo_sk#204, c_current_addr_sk#206L]
   Input [3]: [c_customer_sk#202L, c_current_cdemo_sk#204, 
c_current_addr_sk#206L]
   
   (39) Scan parquet 
   Output [2]: [ca_address_sk#238L, ca_state#246]
   Batched: true
   Location: InMemoryFileIndex 
[file:/Users/ovoievodin/dev/prj/apple/ovoievodin/rust/datafusion-benchmarks/tpcds/data_parquet/customer_address.parquet]
   PushedFilters: [In(ca_state, [IN,MS,VA]), IsNotNull(ca_address_sk)]
   ReadSchema: struct<ca_address_sk:bigint,ca_state:string>
   
   (40) Filter
   Input [2]: [ca_address_sk#238L, ca_state#246]
   Condition : (ca_state#246 IN (IN,VA,MS) AND isnotnull(ca_address_sk#238L))
   
   (41) Project
   Output [1]: [ca_address_sk#238L]
   Input [2]: [ca_address_sk#238L, ca_state#246]
   
   (42) BroadcastExchange
   Input [1]: [ca_address_sk#238L]
   Arguments: HashedRelationBroadcastMode(List(input[0, bigint, true]),false), 
[plan_id=282]
   
   (43) BroadcastHashJoin
   Left keys [1]: [c_current_addr_sk#206L]
   Right keys [1]: [ca_address_sk#238L]
   Join type: Inner
   Join condition: None
   
   (44) Project
   Output [1]: [c_current_cdemo_sk#204]
   Input [3]: [c_current_cdemo_sk#204, c_current_addr_sk#206L, 
ca_address_sk#238L]
   
   (45) Scan parquet 
   Output [6]: [cd_demo_sk#264L, cd_gender#265, cd_marital_status#266, 
cd_education_status#267, cd_purchase_estimate#268L, cd_credit_rating#269]
   Batched: true
   Location: InMemoryFileIndex 
[file:/Users/ovoievodin/dev/prj/apple/ovoievodin/rust/datafusion-benchmarks/tpcds/data_parquet/customer_demographics.parquet]
   PushedFilters: [IsNotNull(cd_demo_sk)]
   ReadSchema: 
struct<cd_demo_sk:bigint,cd_gender:string,cd_marital_status:string,cd_education_status:string,cd_purchase_estimate:bigint,cd_credit_rating:string>
   
   (46) Filter
   Input [6]: [cd_demo_sk#264L, cd_gender#265, cd_marital_status#266, 
cd_education_status#267, cd_purchase_estimate#268L, cd_credit_rating#269]
   Condition : isnotnull(cd_demo_sk#264L)
   
   (47) BroadcastExchange
   Input [6]: [cd_demo_sk#264L, cd_gender#265, cd_marital_status#266, 
cd_education_status#267, cd_purchase_estimate#268L, cd_credit_rating#269]
   Arguments: HashedRelationBroadcastMode(List(knownfloatingpointnormalized(normalizenanandzero(cast(input[0, bigint, false] as double)))),false), [plan_id=286]
   
   (48) BroadcastHashJoin
   Left keys [1]: 
[knownfloatingpointnormalized(normalizenanandzero(c_current_cdemo_sk#204))]
   Right keys [1]: 
[knownfloatingpointnormalized(normalizenanandzero(cast(cd_demo_sk#264L as 
double)))]
   Join type: Inner
   Join condition: None
   
   (49) Project
   Output [5]: [cd_gender#265, cd_marital_status#266, cd_education_status#267, 
cd_purchase_estimate#268L, cd_credit_rating#269]
   Input [7]: [c_current_cdemo_sk#204, cd_demo_sk#264L, cd_gender#265, 
cd_marital_status#266, cd_education_status#267, cd_purchase_estimate#268L, 
cd_credit_rating#269]
   
   (50) HashAggregate
   Input [5]: [cd_gender#265, cd_marital_status#266, cd_education_status#267, 
cd_purchase_estimate#268L, cd_credit_rating#269]
   Keys [5]: [cd_gender#265, cd_marital_status#266, cd_education_status#267, 
cd_purchase_estimate#268L, cd_credit_rating#269]
   Functions [1]: [partial_count(1)]
   Aggregate Attributes [1]: [count#926L]
   Results [6]: [cd_gender#265, cd_marital_status#266, cd_education_status#267, 
cd_purchase_estimate#268L, cd_credit_rating#269, count#927L]
   
   (51) Exchange
   Input [6]: [cd_gender#265, cd_marital_status#266, cd_education_status#267, 
cd_purchase_estimate#268L, cd_credit_rating#269, count#927L]
   Arguments: hashpartitioning(cd_gender#265, cd_marital_status#266, 
cd_education_status#267, cd_purchase_estimate#268L, cd_credit_rating#269, 200), 
ENSURE_REQUIREMENTS, [plan_id=291]
   
   (52) HashAggregate
   Input [6]: [cd_gender#265, cd_marital_status#266, cd_education_status#267, 
cd_purchase_estimate#268L, cd_credit_rating#269, count#927L]
   Keys [5]: [cd_gender#265, cd_marital_status#266, cd_education_status#267, 
cd_purchase_estimate#268L, cd_credit_rating#269]
   Functions [1]: [count(1)]
   Aggregate Attributes [1]: [count(1)#856L]
   Results [8]: [cd_gender#265, cd_marital_status#266, cd_education_status#267, 
count(1)#856L AS cnt1#850L, cd_purchase_estimate#268L, count(1)#856L AS 
cnt2#851L, cd_credit_rating#269, count(1)#856L AS cnt3#852L]
   
   (53) Exchange
   Input [8]: [cd_gender#265, cd_marital_status#266, cd_education_status#267, 
cnt1#850L, cd_purchase_estimate#268L, cnt2#851L, cd_credit_rating#269, 
cnt3#852L]
   Arguments: rangepartitioning(cd_gender#265 ASC NULLS FIRST, 
cd_marital_status#266 ASC NULLS FIRST, cd_education_status#267 ASC NULLS FIRST, 
cd_purchase_estimate#268L ASC NULLS FIRST, cd_credit_rating#269 ASC NULLS 
FIRST, 200), ENSURE_REQUIREMENTS, [plan_id=294]
   
   (54) Sort
   Input [8]: [cd_gender#265, cd_marital_status#266, cd_education_status#267, 
cnt1#850L, cd_purchase_estimate#268L, cnt2#851L, cd_credit_rating#269, 
cnt3#852L]
   Arguments: [cd_gender#265 ASC NULLS FIRST, cd_marital_status#266 ASC NULLS 
FIRST, cd_education_status#267 ASC NULLS FIRST, cd_purchase_estimate#268L ASC 
NULLS FIRST, cd_credit_rating#269 ASC NULLS FIRST], true, 0
   
   (55) AdaptiveSparkPlan
   Output [8]: [cd_gender#265, cd_marital_status#266, cd_education_status#267, 
cnt1#850L, cd_purchase_estimate#268L, cnt2#851L, cd_credit_rating#269, 
cnt3#852L]
   Arguments: isFinalPlan=false
   
   ===== Subqueries =====
   
   Subquery:1 Hosting operator id = 2 Hosting Expression = Subquery 
subquery#925, [id=#222]
   AdaptiveSparkPlan (62)
   +- ObjectHashAggregate (61)
      +- Exchange (60)
         +- ObjectHashAggregate (59)
            +- Project (58)
               +- Filter (57)
                  +- Scan parquet  (56)
   
   
   (56) Scan parquet 
   Output [2]: [ca_address_sk#238L, ca_state#246]
   Batched: true
   Location: InMemoryFileIndex 
[file:/Users/ovoievodin/dev/prj/apple/ovoievodin/rust/datafusion-benchmarks/tpcds/data_parquet/customer_address.parquet]
   PushedFilters: [In(ca_state, [IN,MS,VA]), IsNotNull(ca_address_sk)]
   ReadSchema: struct<ca_address_sk:bigint,ca_state:string>
   
   (57) Filter
   Input [2]: [ca_address_sk#238L, ca_state#246]
   Condition : (ca_state#246 IN (IN,VA,MS) AND isnotnull(ca_address_sk#238L))
   
   (58) Project
   Output [1]: [ca_address_sk#238L]
   Input [2]: [ca_address_sk#238L, ca_state#246]
   
   (59) ObjectHashAggregate
   Input [1]: [ca_address_sk#238L]
   Keys: []
   Functions [1]: [partial_bloom_filter_agg(xxhash64(ca_address_sk#238L, 42), 
1000000, 8388608, 0, 0)]
   Aggregate Attributes [1]: [buf#928]
   Results [1]: [buf#929]
   
   (60) Exchange
   Input [1]: [buf#929]
   Arguments: SinglePartition, ENSURE_REQUIREMENTS, [plan_id=220]
   
   (61) ObjectHashAggregate
   Input [1]: [buf#929]
   Keys: []
   Functions [1]: [bloom_filter_agg(xxhash64(ca_address_sk#238L, 42), 1000000, 
8388608, 0, 0)]
   Aggregate Attributes [1]: [bloom_filter_agg(xxhash64(ca_address_sk#238L, 
42), 1000000, 8388608, 0, 0)#923]
   Results [1]: [bloom_filter_agg(xxhash64(ca_address_sk#238L, 42), 1000000, 
8388608, 0, 0)#923 AS bloomFilter#924]
   
   (62) AdaptiveSparkPlan
   Output [1]: [bloomFilter#924]
   Arguments: isFinalPlan=false
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

