Yes, something like: List<Result> multiGet(List<Get> gets, int maxThreads)
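
For illustration, a minimal sketch of that shape using only the JDK. This is not the HBase API: the generic key type, the `locate` function (standing in for a region server lookup) and the `fetch` function (standing in for a Get issued on a table handle owned by the worker thread) are hypothetical placeholders. It only shows the bucketing and the bounded thread pool, and it assumes `fetch` is safe to call from several threads at once.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

/** Hypothetical sketch; not part of the HBase client API. */
final class MultiGetSketch {

    /**
     * Fetches every key, running one task per server bucket on at most
     * maxThreads threads. Results come back in the same order as keys.
     *
     * locate: key -> server name (placeholder for a region lookup)
     * fetch:  key -> result      (placeholder for a single Get)
     */
    static <K, R> List<R> multiGet(List<K> keys, int maxThreads,
                                   Function<K, String> locate,
                                   Function<K, R> fetch) {
        // Bucket the positions of the keys by the server hosting each key.
        Map<String, List<Integer>> buckets = new LinkedHashMap<>();
        for (int i = 0; i < keys.size(); i++) {
            buckets.computeIfAbsent(locate.apply(keys.get(i)),
                                    s -> new ArrayList<>()).add(i);
        }
        if (buckets.isEmpty()) {
            return new ArrayList<>();
        }

        // Never start more threads than there are buckets to work on.
        int threads = Math.max(1, Math.min(maxThreads, buckets.size()));
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // One task per bucket; each task returns position -> result.
            List<Future<Map<Integer, R>>> futures = new ArrayList<>();
            for (List<Integer> bucket : buckets.values()) {
                futures.add(pool.submit(() -> {
                    Map<Integer, R> partial = new HashMap<>();
                    for (int idx : bucket) {
                        partial.put(idx, fetch.apply(keys.get(idx)));
                    }
                    return partial;
                }));
            }

            // Merge the partial results back into the caller's key order.
            List<R> results =
                new ArrayList<>(Collections.<R>nCopies(keys.size(), null));
            for (Future<Map<Integer, R>> future : futures) {
                for (Map.Entry<Integer, R> e : future.get().entrySet()) {
                    results.set(e.getKey(), e.getValue());
                }
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while fetching", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a fetch task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

In a real client, each worker would create and use its own table handle inside the task rather than sharing one across threads.
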
In general, you should assume that HTable instances are not thread-safe. Behind the scenes, HTables are sharing TCP connections to RS, but from client POV you should have one HTable per thread per table.

> -----Original Message-----
> From: Michael Segel [mailto:[email protected]]
> Sent: Wednesday, August 25, 2010 3:54 AM
> To: [email protected]
> Subject: RE: Best way to get multiple non-sequential rows
>
> Jonathan,
>
> Ok, that makes some sense...
> So you would have some method mget(fetchKeyList,numthreads) returning
> resultList[].
>
> So what's thread safe these days?
>
> -Mike
>
> > From: [email protected]
> > To: [email protected]
> > Subject: RE: Best way to get multiple non-sequential rows
> > Date: Wed, 25 Aug 2010 03:52:38 +0000
> >
> > Michael,
> >
> > MultiGet is about performing a set of Get operations in parallel from
> > the client. So it buys you potential performance benefits from the
> > concurrency/distribution of your operations.
> >
> > Roughly, you would bucket the gets according to their region and
> > regionserver. Then spawn a thread for each RS and fire off the Gets
> > concurrently.
> >
> > If I have 100 Gets to perform on a random set of keys, assuming each
> > get takes 10ms, doing them sequentially will take 1 second. Other
> > factors and RS concurrency aside, with MultiGet on a 10 node cluster,
> > the total time would be reduced to 100ms. With 50 nodes, 20ms.
> >
> > JG
> >
> > > -----Original Message-----
> > > From: Michael Segel [mailto:[email protected]]
> > > Sent: Tuesday, August 24, 2010 7:53 PM
> > > To: [email protected]
> > > Subject: RE: Best way to get multiple non-sequential rows
> > >
> > > Igor,
> > >
> > > What does this really buy you?
> > >
> > > I'm trying to figure out a use case that would show a benefit from just
> > > fetching the rows individually. Since the rows are not contiguous, the
> > > odds of the next row you want being in cache are going to slight to
> > > most likely not. ;-)
> > >
> > > Can you give a use case where having a 'multi-get' will make life
> > > easier?
> > >
> > > Thx
> > >
> > > -Mike
> > >
> > > > Date: Wed, 25 Aug 2010 07:17:13 +0600
> > > > Subject: Re: Best way to get multiple non-sequential rows
> > > > From: [email protected]
> > > > To: [email protected]
> > > >
> > > > Thanks Igor, I will have a look at it.
> > > >
> > > > /Imran
> > > >
> > > > On Tue, Aug 24, 2010 at 10:36 PM, Igor Ranitovic <[email protected]> wrote:
> > > > > Take a look at
> > > > > https://issues.apache.org/jira/browse/HBASE-1845
> > > > >
> > > > > As an HBase user, multi gets is something that I have been looking forward
> > > > > to for some time now. If there is enough interest it would be great if this
> > > > > becomes part of 0.90.
> > > > >
> > > > > Take care,
> > > > > i.
> > > > >
> > > > > Imran M Yousuf wrote:
> > > > >>
> > > > >> Hi,
> > > > >>
> > > > >> I am using the HBase client API to interact with HBase. I have noticed
> > > > >> that HTableInterface has operations such as put(List<Put>),
> > > > >> delete(List<Delete>), but there is no similar method for Get. Using
> > > > >> scan it is possible to load a range of rows, i.e. sequential rows. My
> > > > >> question is -
> > > > >> how would it be most efficient to load N non-sequential rows?
> > > > >>
> > > > >> Currently I am using get(Get) method N times.
> > > > >>
> > > >
> > > > --
> > > > Imran M Yousuf
> > > > Blog: http://imyousuf-tech.blogs.smartitengineering.com/
> > > > Mobile: +880-1711402557
