[ 
https://issues.apache.org/jira/browse/HBASE-2794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12888142#action_12888142
 ] 

Nicolas Spiegelberg commented on HBASE-2794:
--------------------------------------------

Talked with Kris about setting proper exit conditions.

#1 : Exit if our error.rate > 10%.  This is an arbitrary number; we could 
easily make it configurable if someone needs it.
#2 : Exit if it would take > 1ms to run the bloom check.  This ensures that 
blooms still benefit performance even if they aren't needed 90% of the time.
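
To make the two exit conditions concrete, here is a rough Java sketch. It is not 
the patch itself. The class, method, and parameter names are hypothetical, and 
it assumes that the error rate in #1 is the compounded false-positive rate over 
all requested columns (treating the per-key checks as independent) and that the 
cost in #2 can be estimated as columns times a per-check time.

```java
// Hypothetical sketch of the two exit conditions; not HBase code.
final class BloomExitConditions {
    // Thresholds taken from the comment: 10% error rate, 1 ms check time.
    static final double MAX_ERROR_RATE = 0.10;
    static final long MAX_CHECK_NANOS = 1_000_000L;

    // Assumption: per-key false positives are independent, so the chance that
    // at least one of numColumns checks is a false positive is 1 - (1 - p)^n.
    static double compoundErrorRate(double perKeyErrorRate, int numColumns) {
        return 1.0 - Math.pow(1.0 - perKeyErrorRate, numColumns);
    }

    // Returns true when the bloom check should be skipped (the "exit" case).
    static boolean shouldSkipBloom(double perKeyErrorRate, int numColumns,
                                   long nanosPerCheck) {
        boolean tooInaccurate =
            compoundErrorRate(perKeyErrorRate, numColumns) > MAX_ERROR_RATE;
        boolean tooSlow = (long) numColumns * nanosPerCheck > MAX_CHECK_NANOS;
        return tooInaccurate || tooSlow;
    }
}
```

Under these assumptions, a 1% per-key error rate stays below the 10% limit for 
5 columns but exceeds it for 20.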

I wonder if it would be good to give the user the option of skipping the bloom 
check when there is only 1 HFile in the StoreFile, but that's for another JIRA.

> ROWCOL bloom filter not used if multiple columns within same family are 
> requested in a Get
> ------------------------------------------------------------------------------------------
>
>                 Key: HBASE-2794
>                 URL: https://issues.apache.org/jira/browse/HBASE-2794
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Kannan Muthukkaruppan
>
> Noticed the following snippet in StoreFile.java:Scanner:shouldSeek():
> {code}
>         switch(bloomFilterType) {
>           case ROW:
>             key = row;
>             break;
>           case ROWCOL:
>             if (columns.size() == 1) {
>               byte[] col = columns.first();
>               key = Bytes.add(row, col);
>               break;
>             }
>             //$FALL-THROUGH$
>           default:
>             return true;
>         }
> {code}
> If columns.size() > 1, then we currently don't take advantage of the bloom 
> filter.  We should optimize this to check the bloom for each of the columns 
> and, if none of the columns are present in the bloom, avoid opening the file.
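
A minimal Java sketch of the per-column check proposed in the description above. 
This is an illustration, not the actual StoreFile code: the class name, the 
concat helper, and the use of a Predicate as a stand-in for the bloom filter are 
all assumptions. As in the quoted snippet, an empty column set falls back to 
returning true so the file is still read.

```java
import java.util.Collection;
import java.util.function.Predicate;

// Hypothetical sketch; the bloom filter is modeled as a Predicate over keys.
final class RowColBloomSketch {
    // Builds the row+column bloom key (same idea as Bytes.add(row, col)).
    static byte[] concat(byte[] row, byte[] col) {
        byte[] key = new byte[row.length + col.length];
        System.arraycopy(row, 0, key, 0, row.length);
        System.arraycopy(col, 0, key, row.length, col.length);
        return key;
    }

    // Returns false only if the bloom rules out every requested column.
    static boolean shouldSeek(byte[] row, Collection<byte[]> columns,
                              Predicate<byte[]> bloom) {
        if (columns == null || columns.isEmpty()) {
            return true; // no explicit columns: cannot use the ROWCOL bloom
        }
        for (byte[] col : columns) {
            if (bloom.test(concat(row, col))) {
                return true; // at least one column may be in this file
            }
        }
        return false; // none of the columns are present: skip the file
    }
}
```

The loop stops at the first column the bloom cannot rule out, so the file is 
skipped only when every requested row+column key is absent.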
