This is a different case, because it is not a prefix scan. It is an inclusive scan. In this case you do need some other technique, such as an index or a full scan with a filter. But that is not a real-time solution for any noticeable number of found keys: you can achieve about 4 ms per record, so 1000 rows, for example, take about 4 seconds.

2010/8/24 Michelan Arendse <[email protected]>:
> This works wonderfully have a look at the code cause it's still a bit slow
> and I need it to be lighting fast.
> IndexedTable table = new IndexedTable(_hbManager.getConfiguration(),
> Bytes.toBytes("Table"));
> ResultScanner scanner = table.getIndexedScanner("IndexId", null, null, null,
> filter,
> new byte[][] {Bytes.toBytes("Colum_Family:column1")});

Under the hood, IndexedTable performs a Get for each row found in the index, so you can't achieve very fast index scans. Only denormalization can help.

>
> -----Original Message-----
> From: Andrey Stepachev [mailto:[email protected]]
> Sent: 23 August 2010 09:11 PM
> To: [email protected]
> Subject: Re: Scanning half a key or value in HBase
>
> If my table is huge do i get full scan?
> I you want to get good performance on random read
> you really need start and stop keys.
> PrefixFIlters are usable in compound filters. If you want
> only one range (like 123_*), you must use start/stop keys.
>
> 2010/8/23 Samuru Jackson <[email protected]>:
>> Hi,
>>
>> I do it this way:
>>
>> The variable searchValue is my Prefix like in your case 123 would be:
>>
>> searchValue = "123";
>>
>> PrefixFilter prefixFilter = new PrefixFilter(Bytes.toBytes(searchValue));
>> Scan scan = new Scan();
>> scan.addFamily(Bytes.toBytes(this.REF_FAM));
>> scan.setFilter(prefixFilter);
>> ResultScanner resultScanner = hBaseTable.getScanner(scan);
>>
>> Now you can iterate over the resultScanner.
>>
>> Is this what you were looking for?
>>
>> /SJ
>>
>> On Mon, Aug 23, 2010 at 6:00 AM, Michelan Arendse <[email protected]> wrote:
>>> Hi,
>>>
>>> Thanks for the responses but it's still not what I am really looking for.
>>>
>>> The row id looks something like: number_string so it would be 123_foo, 123_foo2 123_foo3.
>>> So now I want to find all the foo's that are related to the first half of the key which is "123".
>>>
>>> Also I can't add start row if I do not know where 123 starts. And I can't search for the start row, as I need this to be very fast.
>>>
>>> Thanks.
>>>
>>> -----Original Message-----
>>> From: Ryan Rawson [mailto:[email protected]]
>>> Sent: 17 August 2010 09:01 PM
>>> To: [email protected]
>>> Subject: Re: Scanning half a key or value in HBase
>>>
>>> Hey,
>>>
>>> One thing to watch out for is ascii with separator variable length
>>> keys, you would think if your key structure was:
>>>
>>> foo:bar
>>>
>>> starting at 'foo' and ending at 'foo:' might give you only keys which
>>> start with 'foo:' but this doesn't work like that. You also get keys
>>> like:
>>> foo123:bar
>>>
>>> you must start the scan at 'foo:' but you can't just end it at 'foo;'
>>> (since next(:) == ';' in ascii), this has to do with the ordering of
>>> ASCII, for a reference look at http://www.asciitable.com/
>>>
>>> The bug-free solution is to start your scan at 'foo:' and use a prefix
>>> filter set to 'foo:'.
>>>
>>> If you are scanning fixed-width keys, eg: binary conversions of longs,
>>> then the [start,start+1) solution works.
>>>
>>> On Tue, Aug 17, 2010 at 5:59 AM, Andrey Stepachev <[email protected]> wrote:
>>>> Use scan where start key is <first_half_of_key> itself as bytearray, and
>>>> stop key is <first_half_of_key> with last byte in bytearray + 1.
>>>>
>>>> example
>>>> abc% should be scan(abc, abd)
>>>>
>>>> 2010/8/17 Michelan Arendse <[email protected]>:
>>>>> Hi
>>>>>
>>>>> I am not sure if this is possible in HBase. What I am trying to do is scan on a HBase table with something similar to how SQL would do it.
>>>>> e.g. SELECT *
>>>>> FROM <table>
>>>>> WHERE <primary key> LIKE '<first_half_of_key>%' ;
>>>>>
>>>>> So as you can see from above I want to scan the table with only part of the row key, since the key is a combination of 2 fields in the table.
>>>>>
>>>>> Regards,
>>>>> Michelan Arendse
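
For reference, here is a minimal sketch of the "start key = prefix, stop key = prefix with the last byte + 1" rule from earlier in this thread. It is plain Java with no HBase dependency. The class name `PrefixRange`, its methods, and the in-memory list standing in for a table are all hypothetical and only for illustration. The carry over trailing 0xFF bytes and the use of an empty stop row to mean "scan to the end" are my additions, not something stated in the thread. In real code the two arrays would presumably be passed to the scan as start and stop rows.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of HBase.
final class PrefixRange {

    // Exclusive stop row for a prefix: drop trailing 0xFF bytes, then
    // increment the last remaining byte. An empty result means "no upper bound".
    public static byte[] stopRowForPrefix(byte[] prefix) {
        int end = prefix.length;
        while (end > 0 && prefix[end - 1] == (byte) 0xFF) {
            end--;
        }
        if (end == 0) {
            return new byte[0];
        }
        byte[] stop = Arrays.copyOf(prefix, end);
        stop[end - 1]++;
        return stop;
    }

    // Unsigned lexicographic comparison, the order in which row keys are sorted.
    public static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    // Simulates a range scan [prefix, stopRow) over an in-memory key list.
    public static List<String> scan(List<String> keys, String prefix) {
        byte[] start = prefix.getBytes(StandardCharsets.UTF_8);
        byte[] stop = stopRowForPrefix(start);
        List<String> hits = new ArrayList<>();
        for (String key : keys) {
            byte[] k = key.getBytes(StandardCharsets.UTF_8);
            boolean afterStart = compare(k, start) >= 0;
            boolean beforeStop = stop.length == 0 || compare(k, stop) < 0;
            if (afterStart && beforeStop) {
                hits.add(key);
            }
        }
        return hits;
    }
}
```

With keys 1234_bar, 123_foo, 123_foo2 and 124_x, scanning with the prefix "123" also returns 1234_bar, while scanning with "123_" returns only the two 123_ rows. So for number_string keys the separator should be part of the prefix.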
