Wenzhe Zhou created IMPALA-9789:
-----------------------------------
Summary: Disable ineffective bloom filters for Kudu scan
Key: IMPALA-9789
URL: https://issues.apache.org/jira/browse/IMPALA-9789
Project: IMPALA
Issue Type: Improvement
Components: Backend, Frontend
Affects Versions: Impala 3.4.0
Reporter: Wenzhe Zhou
Assignee: Wenzhe Zhou
Fix For: Impala 4.0
In the bloom-filter benchmark for Kudu, query TPCH-Q9 shows a performance
regression. The query profile shows that 5 bloom filters are generated by the
hash joins, and some of those filters are not useful for filtering rows. When
all bloom filters are pushed to Kudu, evaluating them adds extra cost to the
Kudu scan, which causes the performance regression.
The regression on Q9 looks a lot like
https://issues.apache.org/jira/browse/IMPALA-9302, where Q9 initially regressed
a lot with multithreading because ineffective filters weren't being disabled.
This query is a bit special in that many filters are pushed to scan 2, and most
of them are not useful. Based on our experience there, we need to add a way to
disable ineffective filters for Kudu scans.
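
To make the idea concrete, here is a minimal sketch of one possible disabling
heuristic: count how many rows a filter has seen and how many it rejected, and
turn the filter off once enough rows have been observed and the reject ratio is
still low. This is not the Impala or Kudu implementation. FilterStats,
record_batch, min_rows and min_reject_ratio are hypothetical names, and the
default thresholds are illustrative values only. Where these counters would
come from for a filter evaluated inside Kudu is left open here.

```python
from dataclasses import dataclass


@dataclass
class FilterStats:
    """Running counters for one bloom filter (hypothetical structure)."""
    considered: int = 0   # rows the filter was evaluated against
    rejected: int = 0     # rows the filter eliminated
    enabled: bool = True  # False once the filter is judged ineffective


def record_batch(stats, considered, rejected,
                 min_rows=16_384, min_reject_ratio=0.1):
    """Fold one batch into the counters and return whether the filter stays on.

    The thresholds are assumed values for illustration, not Impala defaults.
    """
    if not stats.enabled:
        # Once disabled, the filter is not re-enabled in this sketch.
        return False
    stats.considered += considered
    stats.rejected += rejected
    # Only judge the filter after it has seen a reasonable sample of rows.
    if stats.considered >= min_rows:
        if stats.rejected / stats.considered < min_reject_ratio:
            stats.enabled = False
    return stats.enabled
```

With these assumed thresholds, a filter that rejects about 1% of 20000 rows
would be disabled, while one that rejects 75% of the same number of rows would
stay enabled.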