Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/17387 )
Change subject: IMPALA-10681: [WIP] Fix join cardinality if one side is scalar
......................................................................

Patch Set 1:

Thanks Aman for working on this.

I've found the following queries where the cardinality for LEFT SEMI JOIN and INNER JOIN still differs. This kind of query is especially important for us:

  explain select * from store_sales inner join
    (select max(s_store_sk) as max_store_sk from store
     union
     select min(s_store_sk) as max_store_sk from store) v
  on ss_store_sk = max_store_sk;

In the above query LHS NDV is 6 while RHS cardinality is 2, therefore join output cardinality should be LHS CARD / 3, just like LEFT SEMI JOIN calculates.

The planner also calculates wrong cardinalities for this:

  explain select * from store_sales inner join
    (select max(s_store_sk) as max_store_sk from store
     group by s_market_id limit 3) v
  on ss_store_sk = max_store_sk;

-- 
To view, visit http://gerrit.cloudera.org:8080/17387
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8aa9d3b8f3c4848b3e9414fe19ad7ad348d12ecc
Gerrit-Change-Number: 17387
Gerrit-PatchSet: 1
Gerrit-Owner: Aman Sinha <amsi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Comment-Date: Mon, 03 May 2021 10:37:16 +0000
Gerrit-HasComments: No
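
For reference, the arithmetic in the first example of the comment above (LHS NDV 6, RHS cardinality 2, so output is LHS CARD / 3) can be sketched as follows. This is only an illustration, not Impala's planner code: the function name `estimate_join_cardinality`, the uniform-distribution assumption, and the LHS row count of 3,000,000 are all hypothetical and do not come from the change under review.

```python
def estimate_join_cardinality(lhs_card: int, lhs_ndv: int, rhs_card: int) -> int:
    """Hypothetical estimate of the equi-join output cardinality.

    Assumptions (not taken from Impala's code):
      * join-key values are uniformly distributed over the LHS rows,
      * every RHS row carries a distinct key that also occurs on the LHS.
    Under these assumptions at most rhs_card of the lhs_ndv distinct keys
    can match, so the output is lhs_card * min(rhs_card, lhs_ndv) / lhs_ndv.
    """
    if lhs_ndv <= 0:
        # No usable NDV statistic: fall back to the LHS cardinality.
        return lhs_card
    matching_keys = min(rhs_card, lhs_ndv)
    # Integer arithmetic avoids floating-point rounding in the estimate.
    return lhs_card * matching_keys // lhs_ndv


# Numbers from the review comment: LHS NDV = 6, RHS cardinality = 2.
# The LHS row count is a made-up placeholder.
lhs_card = 3_000_000
print(estimate_join_cardinality(lhs_card, lhs_ndv=6, rhs_card=2))
```

With an NDV of 6 and two RHS rows, the sketch keeps one third of the LHS rows, which matches the LHS CARD / 3 figure the comment expects from both the INNER JOIN and the LEFT SEMI JOIN.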