[
https://issues.apache.org/jira/browse/HBASE-2794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12888142#action_12888142
]
Nicolas Spiegelberg commented on HBASE-2794:
--------------------------------------------
Talked with Kris about setting proper exit conditions:
#1 : Exit if our error.rate > 10%. This is an arbitrary number. We could easily
make it configurable if someone needs it.
#2 : Exit if it would take > 1ms to run the bloom check. This ensures that
blooms are beneficial for performance even if they aren't needed 90% of the time.
I wonder if it would be good to give the user an option of not running a bloom
check if there is only 1 HFile in the StoreFile, but that's for another JIRA.
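
To make the two exit conditions concrete, here is a rough sketch of one way they could be evaluated before probing the bloom for several columns. It is my reading of the comment, not the patch itself: the class name, method names, and the idea of compounding a per-key false-positive rate across independent probes are all assumptions, and the per-probe cost would have to come from a real measurement.

```java
// Hypothetical helper, not part of HBase. It decides whether to skip the
// bloom check entirely and fall back to seeking in the file.
final class BloomExitConditions {
    // Thresholds taken from the comment above; the constant names are invented.
    static final double MAX_ERROR_RATE = 0.10;       // exit condition #1
    static final long MAX_CHECK_NANOS = 1_000_000L;  // exit condition #2 (1 ms)

    // Assumption: probes are independent, so the chance that at least one of
    // numKeys probes is a false positive is 1 - (1 - p)^numKeys.
    static double compoundErrorRate(double perKeyErrorRate, int numKeys) {
        return 1.0 - Math.pow(1.0 - perKeyErrorRate, numKeys);
    }

    // Returns true when the bloom check should be skipped.
    static boolean shouldSkipBloom(double perKeyErrorRate, int numKeys, long nanosPerProbe) {
        if (compoundErrorRate(perKeyErrorRate, numKeys) > MAX_ERROR_RATE) {
            return true; // #1: the combined answer is too unreliable to help
        }
        // #2: the estimated time to run all probes exceeds the budget
        return numKeys * nanosPerProbe > MAX_CHECK_NANOS;
    }
}
```

With a 1% per-key error rate, for example, 5 columns stay under the 10% limit while 20 columns exceed it, so the second request would skip the bloom.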
> ROWCOL bloom filter not used if multiple columns within same family are
> requested in a Get
> ------------------------------------------------------------------------------------------
>
> Key: HBASE-2794
> URL: https://issues.apache.org/jira/browse/HBASE-2794
> Project: HBase
> Issue Type: Improvement
> Reporter: Kannan Muthukkaruppan
>
> Noticed the following snippet in StoreFile.java:Scanner:shouldSeek():
> {code}
> switch(bloomFilterType) {
>   case ROW:
>     key = row;
>     break;
>   case ROWCOL:
>     if (columns.size() == 1) {
>       byte[] col = columns.first();
>       key = Bytes.add(row, col);
>       break;
>     }
>     //$FALL-THROUGH$
>   default:
>     return true;
> }
> {code}
> If columns.size() > 1, we currently don't take advantage of the bloom
> filter. We should optimize this to check the bloom for each of the columns
> and, if none of the columns are present in the bloom, avoid opening the file.
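
The proposed optimization might look roughly like the sketch below. It is self-contained rather than written against the real StoreFile classes: the `Predicate<byte[]>` stands in for the bloom filter's membership test, `concat` stands in for `Bytes.add`, and the class and method signature are hypothetical.

```java
import java.util.Collection;
import java.util.function.Predicate;

// Hypothetical stand-alone version of the ROWCOL branch of shouldSeek().
final class RowColBloomCheck {
    // Stand-in for Bytes.add(row, col): row bytes followed by column bytes.
    static byte[] concat(byte[] row, byte[] col) {
        byte[] key = new byte[row.length + col.length];
        System.arraycopy(row, 0, key, 0, row.length);
        System.arraycopy(col, 0, key, row.length, col.length);
        return key;
    }

    // Returns true if the file must be read, false if the bloom rules out
    // every requested column for this row.
    static boolean shouldSeek(byte[] row, Collection<byte[]> columns,
                              Predicate<byte[]> mightContain) {
        if (columns == null || columns.isEmpty()) {
            return true; // no explicit columns, so no row+col key to probe
        }
        for (byte[] col : columns) {
            if (mightContain.test(concat(row, col))) {
                return true; // at least one column may be present
            }
        }
        return false; // a bloom has no false negatives, so the file can be skipped
    }
}
```

Returning on the first hit keeps the common case cheap. The cost only grows with the number of columns when every probe misses, which is the case where skipping the file pays off.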