Ankit Singhal created PHOENIX-6710:
--------------------------------------
Summary: Revert PHOENIX-3842 Turn on back default bloomFilter for
Phoenix Tables
Key: PHOENIX-6710
URL: https://issues.apache.org/jira/browse/PHOENIX-6710
Project: Phoenix
Issue Type: Bug
Components: core
Affects Versions: 4.11.0
Reporter: Ankit Singhal
Assignee: Ankit Singhal
It looks like PHOENIX-3842 was done to work around PHOENIX-3797 in order to
unblock a release, and it was assumed that Phoenix is not used for GETs.
At one of our users, we saw heavy GETs issued from their custom coprocessor
to check whether a key is already present. About 99% of the time the key is
not expected to be present, because this is an initial load and the keys are
expected to be random, but there is still a chance that about 1% of the keys
are duplicates. In the absence of a bloom filter, HBase has to seek into the
HFile to confirm that a key is not present, which results in a performance
regression of about 2x.
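To illustrate why the missing filter hurts mostly-absent lookups, here is a
toy Python simulation. It is only a sketch, not HBase's implementation:
ToyBloomFilter, count_file_seeks and run_demo are hypothetical names, and the
5000 stored keys, 2000 lookups, 1% hit rate and 10-bits-per-key sizing are
assumptions chosen for the demo.

```python
import hashlib
import random


class ToyBloomFilter:
    """A minimal Bloom filter for illustration; not HBase's implementation."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for simplicity

    def _positions(self, key):
        # Derive num_hashes bit positions by salting a SHA-256 hash.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(key))


def count_file_seeks(lookup_keys, bloom=None):
    """Count lookups that would have to seek into the (toy) store file."""
    seeks = 0
    for key in lookup_keys:
        if bloom is not None and not bloom.might_contain(key):
            continue  # filter rules the key out, so no seek is needed
        seeks += 1  # the file must be read to confirm presence or absence
    return seeks


def run_demo(seed=42):
    """Return (seeks without a filter, seeks with a filter)."""
    rng = random.Random(seed)
    stored = sorted({f"row-{rng.getrandbits(64):016x}" for _ in range(5000)})
    bloom = ToyBloomFilter(num_bits=10 * len(stored), num_hashes=7)
    for key in stored:
        bloom.add(key)

    # Roughly 1% of lookups target an existing key; the rest are random.
    lookups = []
    for _ in range(2000):
        if rng.random() < 0.01:
            lookups.append(rng.choice(stored))
        else:
            lookups.append(f"row-{rng.getrandbits(64):016x}")

    return count_file_seeks(lookups), count_file_seeks(lookups, bloom)


if __name__ == "__main__":
    without_bloom, with_bloom = run_demo()
    print(f"seeks without bloom filter: {without_bloom}")
    print(f"seeks with bloom filter:    {with_bloom}")
```

Without a filter every lookup costs a seek. With one, only the keys that are
really present plus a small number of false positives do. The exact counts
depend on the assumed sizes above.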
Use cases like index maintenance and "ON DUPLICATE KEY" queries will also be
impacted without bloom filters.
Phoenix is still used for GETs by users, and we also have constructs that
intrinsically do GETs, like index maintenance and others. So I believe it is
better to have the bloom filter "ON" by default, as I don't see any negative
implication of enabling it even if it is not used.
--
This message was sent by Atlassian Jira
(v8.20.7#820007)