Thanks for the suggestions, Michael.

On Tue, Aug 24, 2010 at 5:37 PM, Michael Segel <[email protected]> wrote:
>
> Hi,
>
> Non sequential rows?
>
> Short answer... it depends. :-)
>
> Longer answer... how 'non-sequential'?
>
> If you're using a key that is hashed (SHA-1) then your rows will be fairly
> random and 'non-sequential'.
> Here your best bet is to fetch each row via a get(). In order to do the
> get you have to know the specific key, so the fetch should be fairly quick and
> consistent regardless of the size of the database (near linear scalability).
> This works great if you know your key.
>
> If you're using some key that isn't hashed but the rows aren't sequential,
> you may want to do a range scan and then drop
> the rows that are not needed. This may be faster in some specific situations
> where all of your data is within one or two regions of a large, large table.
> (But it's so specific, I don't know of the value in terms of a generic query.)
>
> An extreme and bad example... suppose you want to find all of the shops along
> a specific street, and part of the key includes the street side but the key is
> also based on the address.
> If you did a scan, you'd end up with a list where you may want every other
> entry. So here it would be faster to do a sequential scan with a partial key
> to put a boundary on which regions to scan. (Again this is a bad example.)
> If you also write your own custom filter, you can get it to return only the
> rows you want.
>
> Again, I apologize for the bad example... it was the first thing I could
> think of before I finished my first cup of coffee in the morning.
>
> HTH
>
> -Mike
>
>
>> Date: Tue, 24 Aug 2010 09:35:26 +0600
>> Subject: Best way to get multiple non-sequential rows
>> From: [email protected]
>> To: [email protected]
>>
>> Hi,
>>
>> I am using the HBase client API to interact with HBase. I have noticed
>> that HTableInterface has operations such as put(List<Put>),
>> delete(List<Delete>), but there is no similar method for Get. Using
>> scan it is possible to load a range of rows, i.e. sequential rows. My
>> question is -
>> how would it be most efficient to load N non-sequential rows?
>>
>> Currently I am using get(Get) method N times.
>>
>> --
>> Imran M Yousuf
>> Blog: http://imyousuf-tech.blogs.smartitengineering.com/
>> Mobile: +880-1711402557
>
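
To make the two options discussed above concrete, here is a rough sketch of
"one lookup per known key" versus "bounded scan, then drop unwanted rows".
It deliberately does not use the HBase client: a java.util.TreeMap stands in
for a table sorted by row key, and every name in it (RowFetchSketch,
fetchByKeys, scanAndFilter, the "street/house-number" key layout) is
hypothetical and chosen only to mirror the street example. It illustrates the
access patterns, not how HBase implements them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.function.Predicate;

// Hypothetical stand-in: a TreeMap plays the role of a table sorted by row key.
// None of these names belong to the HBase client API.
class RowFetchSketch {

    // Option 1: one point lookup per known key (like calling get(Get) N times).
    static List<String> fetchByKeys(NavigableMap<String, String> table, List<String> keys) {
        List<String> values = new ArrayList<>();
        for (String key : keys) {
            String value = table.get(key);
            if (value != null) { // skip keys that have no row
                values.add(value);
            }
        }
        return values;
    }

    // Option 2: scan the range [startKey, stopKey) and drop rows the filter rejects.
    static List<String> scanAndFilter(NavigableMap<String, String> table,
                                      String startKey, String stopKey,
                                      Predicate<String> keep) {
        List<String> values = new ArrayList<>();
        for (Map.Entry<String, String> row
                : table.subMap(startKey, true, stopKey, false).entrySet()) {
            if (keep.test(row.getKey())) {
                values.add(row.getValue());
            }
        }
        return values;
    }

    // Assumed key layout: "<street>/<zero-padded house number>".
    static NavigableMap<String, String> sampleTable() {
        NavigableMap<String, String> table = new TreeMap<>();
        table.put("main-st/0101", "Bakery");
        table.put("main-st/0102", "Bank");
        table.put("main-st/0103", "Cafe");
        table.put("main-st/0104", "Deli");
        table.put("oak-ave/0007", "Florist");
        return table;
    }

    // Filter for one side of the street: keep odd house numbers only.
    static boolean isOddSide(String rowKey) {
        int houseNumber = Integer.parseInt(rowKey.substring(rowKey.indexOf('/') + 1));
        return houseNumber % 2 == 1;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> table = sampleTable();

        // Scattered keys that are known up front: point lookups.
        System.out.println(fetchByKeys(table,
                Arrays.asList("oak-ave/0007", "main-st/0102", "missing/0000")));

        // Rows clustered under one prefix: bound the scan with a partial key.
        // "main-st0" is the exclusive stop key because '0' sorts right after '/'.
        System.out.println(scanAndFilter(table, "main-st/", "main-st0",
                RowFetchSketch::isOddSide));
    }
}
```

The point lookups cost one round trip per key no matter where the keys fall,
while the bounded scan only pays off when the wanted rows sit close together
under a shared key prefix, which is the situation the street example describes.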
--
Imran M Yousuf
Entrepreneur & CEO
Smart IT Engineering Ltd.
Dhaka, Bangladesh
Email: [email protected]
Blog: http://imyousuf-tech.blogs.smartitengineering.com/
Mobile: +880-1711402557
