comphead opened a new issue, #18487:
URL: https://github.com/apache/datafusion/issues/18487

   ### Is your feature request related to a problem or challenge?
   
   While investigating TPC-DS Q69 correctness issues with Comet
   (https://github.com/apache/datafusion-comet/issues/2667), I found that Q69,
   which contains multiple anti joins, performs far worse with sort-merge join
   (SMJ) than with hash join (HJ).
   
   
   Modified query
   
   ```
   select
     cd_gender,
     cd_marital_status,
     cd_education_status,
     count(*) cnt1,
     cd_purchase_estimate,
     count(*) cnt2,
     cd_credit_rating,
     count(*) cnt3
    from
     customer c,customer_address ca,customer_demographics
    where
     c.c_current_addr_sk = ca.ca_address_sk and
     ca_state in ('IN','VA','MS') and
     cd_demo_sk = c.c_current_cdemo_sk and
     exists (select *
             from store_sales,date_dim
             where c.c_customer_sk = ss_customer_sk and
                   ss_sold_date_sk = d_date_sk and
                   d_year = 2002 and
                   d_moy between 2 and 2+2)   and (not exists (select *
               from web_sales,date_dim
               where c.c_customer_sk = ws_bill_customer_sk and
                     ws_sold_date_sk = d_date_sk and
                     d_year = 2002 and
                     d_moy between 2 and 2+2))
    group by cd_gender,
             cd_marital_status,
             cd_education_status,
             cd_purchase_estimate, cd_credit_rating
    order by 1, 2, 3, 4, 5, 6, 7, 8;
   ```
   
   
   HJ Elapsed 9.879 seconds.
   SMJ Elapsed 59.865 seconds.
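   
   The issue does not say how the join strategy was switched between the two runs. As a minimal sketch, assuming plain DataFusion (for example `datafusion-cli`) with the TPC-DS tables already registered, the `datafusion.optimizer.prefer_hash_join` setting can be toggled per session. The reduced `EXPLAIN` query below is a hypothetical stand-in for the `not exists` branch, used only to check which join operator appears in the plan; it is not part of the original report.
   
   ```
   -- Assumption: the session already has the TPC-DS tables registered.
   -- Ask the optimizer to prefer sort-merge join (SMJ) over hash join (HJ).
   SET datafusion.optimizer.prefer_hash_join = false;

   -- Hypothetical reduced query: inspect the physical plan to see which
   -- join operator is chosen for the anti join before timing the full query.
   EXPLAIN
   select c.c_customer_sk
   from customer c
   where not exists (select *
                     from web_sales
                     where c.c_customer_sk = ws_bill_customer_sk);

   -- Restore the default to compare against the hash join run.
   SET datafusion.optimizer.prefer_hash_join = true;
   ```
   
   Running the modified query once under each setting should give a comparison of the same kind as the two timings above, though Comet may select join operators through its own Spark-side configuration.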
   
   This also looks like a likely cause of
   https://github.com/apache/datafusion-comet/issues/901
   
   
   ### Describe the solution you'd like
   
   _No response_
   
   ### Describe alternatives you've considered
   
   _No response_
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
