I'm running CDH4.6.0 with HBase 0.94.15-cdh4.6.0. I was wondering, does the key need to be serialized? Currently my keys are strings, not raw bytes.
thanks,
liam

On Thu, Apr 24, 2014 at 6:28 PM, Ted Yu <[email protected]> wrote:

> Which HBase version are you using ?
>
> Cheers
>
>
> On Thu, Apr 24, 2014 at 6:24 PM, Liam Slusser <[email protected]> wrote:
>
> > Hey All -
> >
> > I'm having some strange results using FuzzyRowFilter. I'm programming in
> > jython for that extra bit of adventure.
> >
> > My hbase key looks something like [random 10bytes][servicetype
> > 12bytes][timestamp 10bytes] = 32 bytes total. For an example key
> > e23d4ac4b90002000100011398388474
> >
> > So the following code will find the above key:
> >
> > filter = FuzzyRowFilter([ Pair(array('b',
> > "e\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00"),
> > array('b',
> > [0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1]))])
> >
> > But I'm only able to match at the beginning of the key, never the middle or
> > at the end.
> >
> > This will not find the above key:
> >
> > filter = FuzzyRowFilter([ Pair(array('b',
> > "e\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x004"),
> > array('b',
> > [0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0]))])
> >
> > Am I doing something wrong? Is there a better way to search for keys?
> > Really I'm going to want to search on the 12-byte service-type.
> >
> > Here is the full jython code:
> >
> > from array import array
> > from org.apache.hadoop.hbase.util import Pair
> > from org.apache.hadoop.hbase import HBaseConfiguration
> > from org.apache.hadoop.hbase.client import HBaseAdmin, HTable, Scan
> > from org.apache.hadoop.hbase.filter import FuzzyRowFilter
> >
> > conf = HBaseConfiguration()
> > filter = FuzzyRowFilter([ Pair(array('b',
> > "e\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00"),
> > array('b',
> > [0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1])) ])
> >
> > scan = Scan()
> > scan.setFilter(filter)
> > table = HTable(conf,'mytable')
> > s = table.getScanner(scan)
> >
> > while True:
> >     r = s.next()
> >     if not r:
> >         break
> >     else:
> >         print r
> >
>
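For reference, here is a minimal pure-Python sketch of the mask semantics in the FuzzyRowFilter documentation: a mask value of 0 means the byte at that position must equal the pattern byte, and 1 means any byte is accepted. It uses no HBase classes. It assumes the row keys were written as plain ASCII strings, so each character occupies one byte, and the helper names `build_fuzzy_pair` and `fuzzy_match` are made up for illustration. It builds a 32-byte pattern and mask that fix only the 12-byte service-type field at offset 10, then checks them against the example key from the thread.

```python
# Pure-Python illustration only; nothing here calls into HBase.
KEY_LEN = 32         # [random 10 bytes][servicetype 12 bytes][timestamp 10 bytes]
SERVICE_OFFSET = 10  # the service-type field starts right after the random prefix


def build_fuzzy_pair(fragment, offset, key_len=KEY_LEN):
    """Return (pattern, mask) as lists of ints for one fixed fragment.

    mask 0 = byte must match the pattern, mask 1 = byte may be anything.
    Assumption: the fragment is ASCII, one byte per character.
    """
    data = bytearray(fragment.encode("ascii"))
    if offset + len(data) > key_len:
        raise ValueError("fragment does not fit in the key")
    pattern = [0] * key_len   # filler bytes at positions the mask ignores
    mask = [1] * key_len      # start with every position "don't care"
    for i, b in enumerate(data):
        pattern[offset + i] = b
        mask[offset + i] = 0  # fix this position
    return pattern, mask


def fuzzy_match(row, pattern, mask):
    """Emulate the documented rule on an ASCII row key.

    Assumption: rows shorter than the pattern are treated as non-matching.
    """
    row_bytes = bytearray(row.encode("ascii"))
    if len(row_bytes) < len(pattern):
        return False
    return all(m == 1 or r == p for r, p, m in zip(row_bytes, pattern, mask))


key = "e23d4ac4b90002000100011398388474"
pattern, mask = build_fuzzy_pair("000200010001", SERVICE_OFFSET)
print(fuzzy_match(key, pattern, mask))
```

In Jython the two lists could presumably be wrapped as `array('b', pattern)` and `array('b', mask)` and handed to `Pair`, as in the code quoted above. One thing worth checking in that code: in Python, `"\\x00"` (escaped backslash) is four characters, while `"\x00"` is a single zero byte. If the doubled backslashes are in the real script rather than added by the mail archive, the pattern would be far longer than the 32-entry mask.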
