[ https://issues.apache.org/jira/browse/HBASE-2198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831599#action_12831599 ]
stack commented on HBASE-2198:
------------------------------

It's by design. At a high level, you say in the Scan object what you are interested in, and the filter then works against the Scan specification.

I like your suggestion A above. Can you make a patch for that? I think you need a new filter, one that allows filtering against one set of columns but, if a row passes the filter, returns only columns from a different set.

> SingleColumnValueFilter should be able to find the column value even when
> it's not specifically added as input on the scan.
> ---------------------------------------------------------------------------
>
>                 Key: HBASE-2198
>                 URL: https://issues.apache.org/jira/browse/HBASE-2198
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: filters
>    Affects Versions: 0.20.3
>            Reporter: Ferdy
>
> When a SingleColumnValueFilter is applied to a Scan that has specific
> columns as its input (but not the column to be checked in the Filter), the
> Filter won't be able to find the value that it should be checking.
> For example, let's say we want to do a scan, but we only need COLUMN_2
> columns. Furthermore, we only want rows that have a specific value for
> COLUMN_1. Using the following code won't do the trick:
>
>   Scan scan = new Scan();
>   scan.addColumn(FAMILY, COLUMN_2);
>   SingleColumnValueFilter filter = new SingleColumnValueFilter(FAMILY,
>       COLUMN_1, CompareOp.EQUAL, TEST_VALUE);
>   filter.setFilterIfMissing(true);
>   scan.setFilter(filter);
>
> However, we can make it work by also explicitly adding the tested column
> as an input column:
>
>   scan.addColumn(FAMILY, COLUMN_1);
>
> Is this by design? Personally I think that adding a filter with column tests
> should not require the user to check that the column is also on the input.
> It is prone to bugs.
> I suggest one of three solutions:
> A) Update the Javadoc of Filter / SingleColumnValueFilter / possibly other
> affected Filters to document this behaviour.
> B) Fix the problem client-side (i.e. before a Scan object is used, it should
> check that the input columns required by its filters are set, but only if
> the user has configured specific input columns in the first place). This is
> perhaps inefficient performance-wise, because unnecessary input columns
> (inputs needed only for filtering) are returned to the user.
> C) Fix the problem server-side. This would be the most efficient, because the
> input column would only be read to do filtering at the regionserver.
> What do you think?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
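
For readers following the thread, the behaviour under discussion can be sketched in plain Java. This is only a simplified model, not HBase code: it assumes, as the report describes, that the filter sees only the columns selected on the scan. The class ScanFilterSketch and all of its methods are hypothetical names introduced for illustration. The second method roughly corresponds to suggestion B: widen the selection with the filter column, then remove that column from the returned rows.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Plain-Java model of the behaviour described in HBASE-2198.
// Nothing here is HBase API; every name is hypothetical.
class ScanFilterSketch {

    // Models a scan: project each row onto the selected columns first, then
    // apply an equality filter to the projected row. A row whose filter
    // column is missing is dropped (like setFilterIfMissing(true)).
    // An empty selection means "all columns".
    static List<Map<String, String>> scan(List<Map<String, String>> rows,
                                          Set<String> selected,
                                          String filterColumn,
                                          String expected) {
        List<Map<String, String>> out = new ArrayList<>();
        for (Map<String, String> row : rows) {
            Map<String, String> projected = new LinkedHashMap<>();
            for (Map.Entry<String, String> e : row.entrySet()) {
                if (selected.isEmpty() || selected.contains(e.getKey())) {
                    projected.put(e.getKey(), e.getValue());
                }
            }
            // The filter only sees what survived the projection.
            if (expected.equals(projected.get(filterColumn))) {
                out.add(projected);
            }
        }
        return out;
    }

    // Rough model of suggestion B: add the filter column to the selection,
    // scan, then strip that column again if the caller did not ask for it.
    static List<Map<String, String>> scanIncludingFilterColumn(
            List<Map<String, String>> rows,
            Set<String> selected,
            String filterColumn,
            String expected) {
        if (selected.isEmpty() || selected.contains(filterColumn)) {
            return scan(rows, selected, filterColumn, expected);
        }
        Set<String> widened = new HashSet<>(selected);
        widened.add(filterColumn);
        List<Map<String, String>> out = scan(rows, widened, filterColumn, expected);
        for (Map<String, String> row : out) {
            row.remove(filterColumn);
        }
        return out;
    }

    // Builds one row with the two columns used in the report.
    static Map<String, String> row(String column1, String column2) {
        Map<String, String> r = new LinkedHashMap<>();
        r.put("COLUMN_1", column1);
        r.put("COLUMN_2", column2);
        return r;
    }

    // Two sample rows; only the first has the tested value in COLUMN_1.
    static List<Map<String, String>> sampleRows() {
        List<Map<String, String>> rows = new ArrayList<>();
        rows.add(row("TEST_VALUE", "a"));
        rows.add(row("other", "b"));
        return rows;
    }

    public static void main(String[] args) {
        Set<String> onlyColumn2 = Collections.singleton("COLUMN_2");
        // Filter column not selected: no row passes. Prints []
        System.out.println(scan(sampleRows(), onlyColumn2, "COLUMN_1", "TEST_VALUE"));
        // Filter column added, then stripped. Prints [{COLUMN_2=a}]
        System.out.println(scanIncludingFilterColumn(
                sampleRows(), onlyColumn2, "COLUMN_1", "TEST_VALUE"));
    }
}
```

In this model the second call still reads COLUMN_1 for every row, which is the cost the report attributes to a client-side fix. A server-side fix (suggestion C) or a dedicated filter would avoid returning that column at all.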