[ https://issues.apache.org/jira/browse/HBASE-2198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ferdy updated HBASE-2198:
-------------------------

    Status: Patch Available  (was: Open)

All right. I guess you're right about it requiring a new sort of filter. Nevertheless, it could be accomplished by adding a method to SingleColumnValueFilter, something like skipTestedColumn(boolean), that defaults to 'false'.

I will attach a patch for suggestion A. (I think this should cover it; feel free to modify it as you see fit.)

> SingleColumnValueFilter should be able to find the column value even when
> it's not specifically added as input on the scan.
> -------------------------------------------------------------------------
>
>                 Key: HBASE-2198
>                 URL: https://issues.apache.org/jira/browse/HBASE-2198
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: filters
>    Affects Versions: 0.20.3
>            Reporter: Ferdy
>         Attachments: HBASE-2198.patch
>
>
> When a SingleColumnValueFilter is applied to a Scan that has specific
> columns as its input (but not the column to be checked by the Filter), the
> Filter is unable to find the value that it should be checking.
> For example, let's say we want to do a scan, but we only need the COLUMN_2
> columns. Furthermore, we only want rows that have a specific value for
> COLUMN_1. The following code won't do the trick:
>     Scan scan = new Scan();
>     scan.addColumn(FAMILY, COLUMN_2);
>     SingleColumnValueFilter filter = new SingleColumnValueFilter(FAMILY,
>         COLUMN_1, CompareOp.EQUAL, TEST_VALUE);
>     filter.setFilterIfMissing(true);
>     scan.setFilter(filter);
> However, we can make it work by also explicitly adding the tested column
> as an input column:
>     scan.addColumn(FAMILY, COLUMN_1);
> Is this by design? Personally, I think that adding a filter with column tests
> should not require the user to check that the tested column is also on the
> input. It is prone to bugs.
> I suggest one of three solutions:
> A) Update the Javadoc of Filter / SingleColumnValueFilter / possibly other
> affected Filters to document this behaviour.
> B) Fix the problem client-side (i.e. before a Scan object is used, check
> that the inputs required by its filters are set, but only if the user
> has configured specific input columns in the first place). This is perhaps
> inefficient performance-wise, because unnecessary input columns are returned
> to the user (inputs that are only needed for filtering).
> C) Fix the problem server-side. This would be the most efficient, because the
> input column would only be read for filtering at the regionserver.
> What do you think?
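
To make the reported behaviour and suggestion B concrete, here is a minimal, self-contained Java sketch. It is only a toy model and does not use any HBase classes: plain maps stand in for rows, and the names ScanFilterSketch, scan and scanWithClientFix are hypothetical, invented for illustration. It assumes, as the report describes, that the filter only sees the columns requested as scan inputs.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: plain maps stand in for rows; none of this is HBase API.
final class ScanFilterSketch {

    // Models the reported behaviour: the column selection is applied first,
    // so the value test can only see columns that were requested as inputs.
    // An empty 'requested' set means "all columns", like a Scan with no
    // columns added.
    static List<Map<String, String>> scan(Map<String, Map<String, String>> table,
                                          Set<String> requested,
                                          String testedColumn,
                                          String expected) {
        List<Map<String, String>> out = new ArrayList<>();
        for (Map<String, String> row : table.values()) {
            Map<String, String> visible = new LinkedHashMap<>(row);
            if (!requested.isEmpty()) {
                visible.keySet().retainAll(requested);
            }
            // Counterpart of setFilterIfMissing(true): a row whose tested
            // column is not visible is rejected.
            if (expected.equals(visible.get(testedColumn))) {
                out.add(visible);
            }
        }
        return out;
    }

    // Suggestion B: before scanning, add the tested column to the inputs,
    // but only if the caller configured specific input columns at all.
    // The tested column is then returned to the caller as well, which is
    // the inefficiency mentioned in the report.
    static List<Map<String, String>> scanWithClientFix(Map<String, Map<String, String>> table,
                                                       Set<String> requested,
                                                       String testedColumn,
                                                       String expected) {
        Set<String> widened = new LinkedHashSet<>(requested);
        if (!requested.isEmpty()) {
            widened.add(testedColumn);
        }
        return scan(table, widened, testedColumn, expected);
    }
}
```

With only COLUMN_2 requested and a test on COLUMN_1, scan rejects every row in this model, while scanWithClientFix returns the matching rows with both columns present.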