Shant Hovsepian has posted comments on this change. ( http://gerrit.cloudera.org:8080/16622 )
Change subject: IMPALA-10252: fix invalid runtime filters for outer joins ...................................................................... Patch Set 6: Code-Review+1 > Patch Set 5: > > Updated the commit message as requested. > > Shant, I think hasNullRejectingConjucts (sp) in Analyzer.java handles at > least this case correctly - it does call isTrueWithNullSlots() on the > expression. I guess it's possible that it might handle more complex > expressions incorrectly, e.g. if the expression has slots from both sides of > the join and is false when all slots are null but true if a subset of slots > is null. > > > > [localhost.EXAMPLE.COM:21050] default> set > ENABLE_OUTER_JOIN_TO_INNER_TRANSFORMATION=1; > ENABLE_OUTER_JOIN_TO_INNER_TRANSFORMATION set to 1 > [localhost.EXAMPLE.COM:21050] default> explain select * from > functional.alltypes t1 left outer join functional.alltypestiny t2 on t1.id = > t2.id where zeroifnull(t2.int_col) = 0; > Query: explain select * from functional.alltypes t1 left outer join > functional.alltypestiny t2 on t1.id = t2.id where zeroifnull(t2.int_col) = 0 > +------------------------------------------------------------+ > | Explain String | > +------------------------------------------------------------+ > | Max Per-Host Resource Reservation: Memory=1.98MB Threads=5 | > | Per-Host Resource Estimates: Memory=163MB | > | Codegen disabled by planner | > | | > | PLAN-ROOT SINK | > | | | > | 04:EXCHANGE [UNPARTITIONED] | > | | | > | 02:HASH JOIN [LEFT OUTER JOIN, BROADCAST] | > | | hash predicates: t1.id = t2.id | > | | other predicates: zeroifnull(t2.int_col) = 0 | > | | row-size=178B cardinality=7.30K | > | | | > | |--03:EXCHANGE [BROADCAST] | > | | | | > | | 01:SCAN HDFS [functional.alltypestiny t2] | > | | HDFS partitions=4/4 files=4 size=460B | > | | row-size=89B cardinality=8 | > | | | > | 00:SCAN HDFS [functional.alltypes t1] | > | HDFS partitions=24/24 files=24 size=478.45KB | > | row-size=89B cardinality=7.30K | > +------------------------------------------------------------+ > Fetched 22 row(s) in 0.05s > [localhost.EXAMPLE.COM:21050] default> explain select * from > functional.alltypes t1 left outer join functional.alltypestiny t2 on t1.id = > t2.id where t2.int_col = 0; > Query: explain select * from functional.alltypes t1 left outer join > functional.alltypestiny t2 on t1.id = t2.id where t2.int_col = 0 > +------------------------------------------------------------+ > | Explain String | > +------------------------------------------------------------+ > | Max Per-Host Resource Reservation: Memory=2.98MB Threads=5 | > | Per-Host Resource Estimates: Memory=163MB | > | Codegen disabled by planner | > | | > | PLAN-ROOT SINK | > | | | > | 04:EXCHANGE [UNPARTITIONED] | > | | | > | 02:HASH JOIN [INNER JOIN, BROADCAST] | > | | hash predicates: t1.id = t2.id | > | | runtime filters: RF000 <- t2.id | > | | row-size=178B cardinality=4 | > | | | > | |--03:EXCHANGE [BROADCAST] | > | | | | > | | 01:SCAN HDFS [functional.alltypestiny t2] | > | | HDFS partitions=4/4 files=4 size=460B | > | | predicates: t2.int_col = 0 | > | | row-size=89B cardinality=4 | > | | | > | 00:SCAN HDFS [functional.alltypes t1] | > | HDFS partitions=24/24 files=24 size=478.45KB | > | runtime filters: RF000 -> t1.id | > | row-size=89B cardinality=7.30K | > +------------------------------------------------------------+ > Fetched 24 row(s) in 0.05s Yeah exactly I was thinking something like: t1 join t2 join t3 join t3 with a coalesce(t1, t2, t3, t4) I'll try to mock up an example, but it isn't relevant for this fix. Nice job! -- To view, visit http://gerrit.cloudera.org:8080/16622 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I507af1cc8df15bca21e0d8555019997812087261 Gerrit-Change-Number: 16622 Gerrit-PatchSet: 6 Gerrit-Owner: Tim Armstrong <[email protected]> Gerrit-Reviewer: Aman Sinha <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Qifan Chen <[email protected]> Gerrit-Reviewer: Shant Hovsepian <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Comment-Date: Mon, 07 Dec 2020 19:48:11 +0000 Gerrit-HasComments: No
