[ https://issues.apache.org/jira/browse/HADOOP-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Izaak Rubin updated HADOOP-1606:
--------------------------------
    Attachment: HADOOP-1606.patch

A patch containing updates to RowFilterSet, RowFilterInterface, and the tests for RowFilterSet (as specified). Included in the patch are various minor modifications to other filters such as PageRowFilter, RegExpRowFilter, and their respective tests. There is also a minor modification to the scanner portion of HRegion: the call to RowFilterInterface.acceptedRow(Text) has been replaced by a call to rowProcessed(boolean, Text). All tests have compiled successfully on my computer.

> Updated Implementation of RowFilterSet, RowFilterInterface
> ----------------------------------------------------------
>
>                 Key: HADOOP-1606
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1606
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: contrib/hbase
>            Reporter: Izaak Rubin
>            Priority: Minor
>         Attachments: HADOOP-1606.patch
>
>
> Unit tests on RowFilterSet revealed a problem with its handling of nested
> state-maintaining filters. RowFilterSet returned as soon as possible from its
> implementations of filter and filterNotNull. This came at the cost of not
> always calling every one of its filters. Skipping these filters was
> problematic, particularly when a filter changes its state when called to
> filter. As a result, later calls to filterAllRemaining() were
> non-deterministic (with an unordered set) or dependent on set ordering at
> best.
> With much input from Michael Stack and James Kennedy, the problem has been
> resolved as follows: the RowFilterInterface has been updated to contain a
> boolean processAlways() method that states whether or not this filter MUST be
> called in any call to the filter hierarchy. Filters that require their state
> to be updated immediately upon every filter call (via a call to their filter
> methods), such as WhileMatchRowFilter (see HADOOP-1579), will return true for
> processAlways(). RowFilterSet will ensure that these filters always have
> their filtering methods called, whether or not they affect the final decision.
> The patch proposed by this issue will make the necessary changes to
> RowFilterSet and RowFilterInterface, in addition to adding the tests for
> RowFilterSet.
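
For readers following the description above, here is a minimal sketch of the processAlways() idea. It is not the code in HADOOP-1606.patch: SketchFilter, PrefixFilter, WhileMatchSketch, and FilterSetSketch are hypothetical, simplified stand-ins that use String row keys and must-pass-all semantics only. The point it illustrates is that the set may decide early, but still calls any remaining filter that reports processAlways() so that filter's state stays current.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for RowFilterInterface (not the HBase API).
interface SketchFilter {
    boolean filter(String rowKey);   // true means "filter this row out"
    boolean processAlways();         // true means "call me on every filter call"
    boolean filterAllRemaining();    // true means "no later row can pass"
}

// Stateless filter: rejects rows whose key does not start with the prefix.
class PrefixFilter implements SketchFilter {
    private final String prefix;
    PrefixFilter(String prefix) { this.prefix = prefix; }
    public boolean filter(String rowKey) { return !rowKey.startsWith(prefix); }
    public boolean processAlways() { return false; }
    public boolean filterAllRemaining() { return false; }
}

// Stateful filter in the spirit of WhileMatchRowFilter: once the wrapped
// filter rejects a row, this filter rejects every row after it.
class WhileMatchSketch implements SketchFilter {
    private final SketchFilter inner;
    private boolean done = false;
    WhileMatchSketch(SketchFilter inner) { this.inner = inner; }
    public boolean filter(String rowKey) {
        if (inner.filter(rowKey)) {
            done = true;
        }
        return done;
    }
    public boolean processAlways() { return true; }
    public boolean filterAllRemaining() { return done; }
}

// Must-pass-all set: a row is filtered out as soon as one member rejects it,
// but members that report processAlways() are still called afterwards.
class FilterSetSketch implements SketchFilter {
    private final List<SketchFilter> filters = new ArrayList<>();

    FilterSetSketch add(SketchFilter f) {
        filters.add(f);
        return this;
    }

    public boolean filter(String rowKey) {
        boolean rejected = false;
        for (SketchFilter f : filters) {
            if (!rejected) {
                rejected = f.filter(rowKey);
            } else if (f.processAlways()) {
                f.filter(rowKey); // result ignored; called only to update state
            }
        }
        return rejected;
    }

    public boolean processAlways() {
        for (SketchFilter f : filters) {
            if (f.processAlways()) {
                return true;
            }
        }
        return false;
    }

    public boolean filterAllRemaining() {
        for (SketchFilter f : filters) {
            if (f.filterAllRemaining()) {
                return true;
            }
        }
        return false;
    }
}
```

With a set containing PrefixFilter("a") followed by WhileMatchSketch(PrefixFilter("ab")), the row "zzz" is rejected by the first member, yet the while-match member is still called and flips its state, so filterAllRemaining() on the set then returns true regardless of member order. Without the processAlways() branch, that result would depend on which member happened to run first.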