rowFilter is added to a FilterList that contains no other filters. Maybe the snippet doesn't include all the code in your class?
On Thu, Jul 13, 2017 at 5:26 PM, S L <[email protected]> wrote:

> I don't understand why my regex doesn't work when scanning hbase.
> Everything looks good to me but for some reason, it's returning all keys
> when it should just return the ones I'm requesting
>
> Scan scan = new Scan();
> scan.addColumn(Bytes.toBytes("raw_data"), Bytes.toBytes(fileType));
> scan.setCaching(limit);
> scan.setCacheBlocks(false);
> scan.setTimeRange(start, end);
> FilterList filters = new FilterList();
> Filter rowFilter = new RowFilter(CompareFilter.CompareOp.EQUAL, new
> RegexStringComparator("100_.*_\\d{10}"));
> filters.addFilter(rowFilter);
> scan.setFilter(filters);
>
> TableMapReduceUtil.initTableMapperJob(tableName, scan, MTTRMapper.class,
> Text.class, IntWritable.class, job);
>
> The rowkey is stored as a string in hbase. The rowkey is in the format of
> hash_servername_timestamp, e.g.
>
> 0_myserver.mydomain.com_1234567890
>
> The hash can be any number from 0-199. In the above filter, I just want to
> get all elements with hash = 100 but for some reason, the scan job appears
> to return other rowkeys in addition to the ones with hash = 100.
>
> I've tried this with jar versions 1.0.1 and 1.2.0-cdh5.7.2. What am I
> doing wrong that's making the regex not work?
>
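One thing that can be checked without a cluster is the pattern itself. The sketch below uses only java.util.regex, not HBase, and assumes that RegexStringComparator applies the pattern as an unanchored search (find) rather than a full match; please verify that against the HBase version you run. The class name RowKeyRegexCheck and the sample rowkeys other than the one in your mail are made up for illustration. If the assumption holds, an unanchored "100_" can match in the middle of a rowkey. That would explain a few extra rows, though not every row being returned.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of HBase: tests a rowkey pattern in isolation.
public class RowKeyRegexCheck {

    // Unanchored search: true if the regex matches anywhere in the rowkey.
    // Assumption: this mirrors how the comparator evaluates the pattern.
    static boolean findMatch(String regex, String rowKey) {
        return Pattern.compile(regex).matcher(rowKey).find();
    }

    public static void main(String[] args) {
        String original = "100_.*_\\d{10}";   // pattern from the scan above
        String anchored = "^100_.*_\\d{10}$"; // same pattern pinned to both ends

        String[] sampleKeys = {
            "100_myserver.mydomain.com_1234567890", // hash = 100, wanted
            "0_myserver.mydomain.com_1234567890",   // hash = 0, not wanted
            "5_host100_a_1234567890"                // made-up key with "100_" inside
        };

        for (String key : sampleKeys) {
            System.out.println(key
                + " original=" + findMatch(original, key)
                + " anchored=" + findMatch(anchored, key));
        }
    }
}
```

If the anchored form behaves differently on your real rowkeys, passing "^100_.*_\\d{10}$" to RegexStringComparator would be worth a try. If every rowkey still comes back, the pattern is probably not the cause and the rest of the job setup needs a look.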
