[ https://issues.apache.org/jira/browse/HBASE-2794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12888142#action_12888142 ]
Nicolas Spiegelberg commented on HBASE-2794:
--------------------------------------------

Talked with Kris about setting proper exit conditions.

#1 : Exit if our error.rate > 10%. This is an arbitrary number. We could easily make it configurable if someone needs it.
#2 : Exit if it would take > 1ms to run the bloom check. This ensures that blooms are beneficial for performance even if they aren't needed 90% of the time.

I wonder if it would be good to give the user an option of not running a bloom check if there is only 1 HFile in the StoreFile, but that's for another JIRA.

> ROWCOL bloom filter not used if multiple columns within same family are requested in a Get
> ------------------------------------------------------------------------------------------
>
>                 Key: HBASE-2794
>                 URL: https://issues.apache.org/jira/browse/HBASE-2794
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Kannan Muthukkaruppan
>
> Noticed the following snippet in StoreFile.java:Scanner:shouldSeek():
> {code}
>         switch(bloomFilterType) {
>           case ROW:
>             key = row;
>             break;
>           case ROWCOL:
>             if (columns.size() == 1) {
>               byte[] col = columns.first();
>               key = Bytes.add(row, col);
>               break;
>             }
>             //$FALL-THROUGH$
>           default:
>             return true;
>         }
> {code}
> If columns.size > 1, then we currently don't take advantage of the bloom filter. We should optimize this to check the bloom for each of the columns and, if none of the columns is present in the bloom, avoid opening the file.
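
For illustration, the per-column check proposed in the issue description might look roughly like the sketch below. This is not the committed patch: the class name RowColBloomSketch, the concat() helper (standing in for Bytes.add(row, col)) and the Predicate-based mightContain lookup are hypothetical stand-ins for the real StoreFile internals, and the exit conditions discussed in the comment (error rate, 1ms budget) are not modeled.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class RowColBloomSketch {
    public enum BloomType { NONE, ROW, ROWCOL }

    // Hypothetical stand-in for Bytes.add(row, col): row bytes followed by column bytes.
    static byte[] concat(byte[] row, byte[] col) {
        byte[] key = Arrays.copyOf(row, row.length + col.length);
        System.arraycopy(col, 0, key, row.length, col.length);
        return key;
    }

    // Returns false only when the bloom rules out every requested key,
    // i.e. when the file can safely be skipped.
    public static boolean shouldSeek(BloomType type, byte[] row,
                                     List<byte[]> columns,
                                     Predicate<byte[]> mightContain) {
        switch (type) {
            case ROW:
                return mightContain.test(row);
            case ROWCOL:
                if (columns == null || columns.isEmpty()) {
                    // No explicit columns, so no row+col keys can be built.
                    return true;
                }
                for (byte[] col : columns) {
                    // One possible hit is enough to require opening the file.
                    if (mightContain.test(concat(row, col))) {
                        return true;
                    }
                }
                return false;
            default:
                // No usable bloom filter: always seek.
                return true;
        }
    }

    public static void main(String[] args) {
        // Assumed toy "bloom" for a file that holds only row r1, column c2.
        Predicate<byte[]> bloom =
            key -> new String(key, StandardCharsets.UTF_8).equals("r1c2");
        byte[] r1 = "r1".getBytes(StandardCharsets.UTF_8);
        byte[] r2 = "r2".getBytes(StandardCharsets.UTF_8);
        List<byte[]> cols = Arrays.asList(
            "c1".getBytes(StandardCharsets.UTF_8),
            "c2".getBytes(StandardCharsets.UTF_8));
        System.out.println(shouldSeek(BloomType.ROWCOL, r1, cols, bloom));
        System.out.println(shouldSeek(BloomType.ROWCOL, r2, cols, bloom));
    }
}
```

Each extra column adds one more bloom probe, and each probe carries its own false-positive chance, which is presumably why exit conditions like the ones in the comment matter once the column list grows.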