[ https://issues.apache.org/jira/browse/HBASE-14397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mikhail Antonov resolved HBASE-14397.
-------------------------------------
    Resolution: Fixed

> PrefixFilter doesn't filter all remaining rows if the prefix is longer than rowkey being compared
> -------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-14397
>                 URL: https://issues.apache.org/jira/browse/HBASE-14397
>             Project: HBase
>          Issue Type: Improvement
>          Components: Filters
>    Affects Versions: 2.0.0
>            Reporter: Jianwei Cui
>            Assignee: Jianwei Cui
>            Priority: Minor
>             Fix For: 2.0.0, 1.3.0
>
>         Attachments: HBASE-14397-trunk-v1.patch
>
>
> The PrefixFilter filters rowkeys as follows:
> {code}
> public boolean filterRowKey(Cell firstRowCell) {
>   ...
>   int length = firstRowCell.getRowLength();
>   if (length < prefix.length) return true; // ===> return directly if the prefix is longer
>   ....
>   if ((!isReversed() && cmp > 0) || (isReversed() && cmp < 0)) {
>     passedPrefix = true;
>   }
>   filterRow = (cmp != 0);
>   return filterRow;
> }
> {code}
> If the prefix is longer than the current rowkey, PrefixFilter#filterRowKey
> filters the rowkey directly without comparing, so the 'passedPrefix' flag is
> never set, even when the current row is larger than the prefix.
> For example, if there are three rows 'a', 'b' and 'c' in the table, and we
> issue a scan request as:
> {code}
> hbase(main):001:0> scan 'test_table', {STARTROW => 'a', FILTER => "(PrefixFilter ('aa'))"}
> {code}
> The region server will check all three rows before returning, even though
> 'b' and 'c' already sort after the prefix 'aa'. In our production cluster, a
> user issued a scan with a PrefixFilter whose prefix was longer than the
> rowkeys of the following millions of rows, so the region server kept checking
> rows until it hit a rowkey longer than the prefix. This easily made the client
> time out. To fix this case, it seems we need to compare the prefix with the
> rowkey every several rows even when the prefix is longer.
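
For illustration only, here is a minimal sketch of the behavior the report asks for: compare the rowkey with the prefix even when the rowkey is shorter, so that the 'passedPrefix' flag can be set as soon as a row sorts after the prefix. It is not the attached patch and does not use the HBase API. The class PrefixCheckSketch, its constructor, and the helpers compareToPrefix and rowsChecked are hypothetical names, and rows are modeled as plain byte arrays rather than Cell objects.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the row-key check; not the HBase PrefixFilter class.
final class PrefixCheckSketch {
    private final byte[] prefix;
    private final boolean reversed;
    private boolean passedPrefix = false;

    PrefixCheckSketch(String prefix, boolean reversed) {
        this.prefix = prefix.getBytes(StandardCharsets.UTF_8);
        this.reversed = reversed;
    }

    // Compares the row with the prefix over at most prefix.length bytes.
    // Returns 0 only when the row starts with the prefix.
    private int compareToPrefix(byte[] row) {
        int n = Math.min(row.length, prefix.length);
        for (int i = 0; i < n; i++) {
            int diff = (row[i] & 0xff) - (prefix[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        // Common bytes match: a shorter row sorts before the prefix.
        return row.length < prefix.length ? -1 : 0;
    }

    // Returns true when the row must be excluded from the result.
    boolean filterRowKey(byte[] row) {
        int cmp = compareToPrefix(row);
        // Assumption: the flag is updated for short rows too, unlike the quoted code.
        if ((!reversed && cmp > 0) || (reversed && cmp < 0)) {
            passedPrefix = true;
        }
        return cmp != 0;
    }

    boolean filterAllRemaining() {
        return passedPrefix;
    }

    // Counts how many sorted rows a forward scan examines before it can stop.
    static int rowsChecked(String[] sortedRows, String prefix) {
        PrefixCheckSketch filter = new PrefixCheckSketch(prefix, false);
        int checked = 0;
        for (String row : sortedRows) {
            if (filter.filterAllRemaining()) {
                break;
            }
            filter.filterRowKey(row.getBytes(StandardCharsets.UTF_8));
            checked++;
        }
        return checked;
    }

    public static void main(String[] args) {
        // Rows 'a', 'b', 'c' with prefix 'aa', as in the example from the report.
        System.out.println(rowsChecked(new String[] {"a", "b", "c"}, "aa")); // prints 2
    }
}
```

With the rows from the report, the scan in this sketch stops after 'b' instead of visiting 'c', because 'b' already compares greater than 'aa' on the first byte. The sketch compares every row; the report's suggestion of comparing only every several rows would trade some of that early termination for fewer comparisons.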