[ https://issues.apache.org/jira/browse/HBASE-3477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrew Purtell resolved HBASE-3477.
-----------------------------------
Resolution: Cannot Reproduce
Reopen or file a new issue if this is still relevant with modern HBase versions.
> Filter for deprecated mapred APIs doesn't work when the table has few rows
> --------------------------------------------------------------------------
>
> Key: HBASE-3477
> URL: https://issues.apache.org/jira/browse/HBASE-3477
> Project: HBase
> Issue Type: Bug
> Components: Filters
> Affects Versions: 0.90.0
> Environment: Linux (Debian), master 1, slaves 2
> Reporter: Yifeng Jiang
>
> It seems that filters are not invoked when the table contains only a few
> rows.
> I added some logging to org.apache.hadoop.hbase.filter.PrefixFilter, and I
> have a MyInputFormat that extends hbase.mapred.TableInputFormat, one of the
> deprecated mapred APIs.
> The logging added to PrefixFilter:
> {noformat}
> public boolean filterRowKey(byte[] buffer, int offset, int length) {
> log.info("TODO: filterRowKey invoked");
> if (buffer == null || this.prefix == null) {
> log.info("TODO: #1 of filter");
> return true;
> }
> if (length < prefix.length) {
> ...
> }
> {noformat}
> This is the code in my InputFormat's configure method.
> {noformat}
> byte[] prefix = Bytes.toBytes("001");
> Filter filter = new PrefixFilter(prefix);
> setRowFilter(filter);
> {noformat}
> And the job setup code.
> {noformat}
> job.setInputFormat(MyInputFormat.class);
> FileInputFormat.addInputPaths(job, "my_table_in_hbase");
> job.set(TableInputFormat.COLUMN_LIST, "data:");
> {noformat}
> When I put many rows (> 500,000) in the table, the filter works well, but
> when I put only a few rows (< 100) in the table, the filter does not seem to
> be invoked, and the logging in the filter produces no output either.
> This is the log output when the table contains many rows:
> {noformat}
> 2011-01-25 16:43:59,568 INFO org.apache.hadoop.hbase.filter.PrefixFilter: TODO: default constructor
> 2011-01-25 16:44:01,728 INFO org.apache.hadoop.hbase.filter.PrefixFilter: TODO: filterRowKey invoked
> 2011-01-25 16:44:01,728 INFO org.apache.hadoop.hbase.filter.PrefixFilter: TODO: #3 of filter
> 2011-01-25 16:44:01,728 INFO org.apache.hadoop.hbase.filter.PrefixFilter: TODO: filterAllRemaining invoked
> 2011-01-25 16:44:01,729 INFO org.apache.hadoop.hbase.filter.PrefixFilter: TODO: filterAllRemaining invoked
> 2011-01-25 16:44:01,729 INFO org.apache.hadoop.hbase.filter.PrefixFilter: TODO: filterAllRemaining invoked
> {noformat}
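
For readers who want to check which row keys a "001" prefix should keep, independent of any cluster, here is a minimal standalone sketch. It is not the HBase PrefixFilter source: the class name PrefixCheckSketch and the extra prefix parameter are hypothetical, and the behavior for keys shorter than the prefix is an assumption, since the report elides that branch. It follows the HBase filter convention that filterRowKey returns true when the row should be excluded.

```java
import java.nio.charset.StandardCharsets;

public class PrefixCheckSketch {
    // Hypothetical helper, not HBase code. Returns true when the row should be
    // filtered OUT, false when the row key starts with the given prefix.
    public static boolean filterRowKey(byte[] buffer, int offset, int length, byte[] prefix) {
        if (buffer == null || prefix == null) {
            return true; // same guard as in the report's snippet
        }
        if (length < prefix.length) {
            return true; // assumption: a key shorter than the prefix cannot match
        }
        // Compare the first prefix.length bytes of the key with the prefix.
        for (int i = 0; i < prefix.length; i++) {
            if (buffer[offset + i] != prefix[i]) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        byte[] prefix = "001".getBytes(StandardCharsets.UTF_8);
        String[] rows = {"001-a", "002-b", "00"}; // made-up row keys for illustration
        for (String row : rows) {
            byte[] key = row.getBytes(StandardCharsets.UTF_8);
            boolean filtered = filterRowKey(key, 0, key.length, prefix);
            System.out.println(row + " filtered out: " + filtered);
        }
    }
}
```

Running the same small set of keys through a real scan with the filter applied, and comparing the rows returned against this check, may help tell a filter that is never invoked apart from one that is invoked but matches nothing.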