Hi,

> If i understand your question correctly, if your interested in getting the
> value FOO you should change your value filter to as below

Not quite. I want to get no results at all. I would expect the ValueFilter
scan with 'binaryprefix:foo' to not return any cells, since the cell it found
was not within the last N versions (where N=1 in my example).

As it is now, I can't really trust that the result of a scan with a
ValueFilter is representative of the state of the table, so I would need to
verify that none of the returned cells have a more recent version with a
different value. This is the problem I would expect VERSIONS => 1 to get
around.

And yes, a major compaction will clean up the old cells, but that still gives
me a (large) window where I'm getting junk back from a scan. I'm wondering if
there's a way around this, so I can avoid filtering on the client side. I'd
much rather let HBase do that, with its parallelization.

On 23 January 2018 at 01:00, naresh Goud <[email protected]> wrote:
> Hi,
>
> If i understand your question correctly, if your interested in getting the
> value FOO you should change your value filter to as below
>
> scan 't1', { COLUMNS => 'f1:a', FILTER => "ValueFilter( =,
> 'binaryprefix:FOO' )" } instead of binaryprefix:foo'
>
> If you query after before major compaction then your query with value filter
> don't return any result binaryprefix:foo'
>
> Thank you,
> Naresh
>
> On Mon, Jan 22, 2018 at 4:57 PM, Anders Ossowicki <[email protected]> wrote:
>>
>> Hi,
>>
>> When doing a scan with a ValueFilter, I get an old cell value out,
>> even with VERSIONS => 1 set for the table.
>>
>> hbase(main):003:0> create 't1', 'f1'
>> 0 row(s) in 1.8020 seconds
>> hbase(main):005:0> put 't1', 'foo', 'f1:a', 'foo'
>> 0 row(s) in 0.1260 seconds
>> hbase(main):006:0> put 't1', 'foo', 'f1:a', 'FOO'
>> 0 row(s) in 0.0070 seconds
>> hbase(main):001:0> scan 't1'
>> ROW COLUMN+CELL
>> foo column=f1:a,
>> timestamp=1516659855024, value=FOO
>> 1 row(s) in 0.2260 seconds
>> hbase(main):002:0> scan 't1', { COLUMNS => 'f1:a', FILTER =>
>> "ValueFilter( =, 'binaryprefix:foo' )" }
>> ROW COLUMN+CELL
>> foo column=f1:a,
>> timestamp=1516659851593, value=foo
>> 1 row(s) in 0.0600 seconds
>> hbase(main):003:0> describe 't1'
>> Table t1 is ENABLED
>> t1
>> COLUMN FAMILIES DESCRIPTION
>> {NAME => 'f1', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY =>
>> 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE',
>> TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0',
>> BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
>>
>> This is on HBase 1.1.2 as shipped by HortonWorks.
>>
>> My understanding is that this will happen as long as there hasn't been
>> a major compaction to clean up old cell versions.
>>
>> I'm wondering if I'm missing an obvious way to get what I want (only
>> cells that would survive a major compaction), possibly one that would
>> just work when VERSIONS => 1, or if I'll just have to do the scan
>> without a valuefilter, and filter the data clientside.
>>
>> --
>> Anders Ossowicki
>
>

--
Anders Ossowicki
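
For reference, the client-side check described above (only accept a cell if no newer version exists for the same row and column) could look roughly like the sketch below. It is a plain Python simulation, not HBase client code: the `Cell` tuple and both function names are made up for illustration, and it assumes the unfiltered scan hands back every stored version of each cell.

```python
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Cell(NamedTuple):
    """Hypothetical stand-in for one cell version returned by an unfiltered scan."""
    row: bytes
    column: bytes
    timestamp: int
    value: bytes


def latest_cells(cells: Iterable[Cell]) -> List[Cell]:
    """Keep only the newest version of each (row, column) pair."""
    newest: Dict[Tuple[bytes, bytes], Cell] = {}
    for cell in cells:
        key = (cell.row, cell.column)
        if key not in newest or cell.timestamp > newest[key].timestamp:
            newest[key] = cell
    return sorted(newest.values(), key=lambda c: (c.row, c.column))


def scan_latest_with_prefix(cells: Iterable[Cell], prefix: bytes) -> List[Cell]:
    """Apply the value-prefix test only after older versions are discarded."""
    return [c for c in latest_cells(cells) if c.value.startswith(prefix)]


# The two puts from the shell session in the quoted message.
cells = [
    Cell(b"foo", b"f1:a", 1516659851593, b"foo"),
    Cell(b"foo", b"f1:a", 1516659855024, b"FOO"),
]

stale_match = scan_latest_with_prefix(cells, b"foo")
current_match = scan_latest_with_prefix(cells, b"FOO")
```

Here `stale_match` comes back empty, which is the behaviour I was hoping to get from the server-side filter, while `current_match` holds the single live cell.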
