Yes, something like:

List<Result> multiGet(List<Get> gets, int maxThreads)

In general, you should assume that HTable instances are not thread-safe.
Behind the scenes, HTables share TCP connections to the RegionServers (RS), but
from the client's point of view you should have one HTable per thread per table.
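
To make the shape of that concrete, here is a rough sketch of the idea, not actual HBase code. It uses only the JDK: keys are bucketed by the server that hosts them, one task runs per bucket, and each task creates its own handle instead of sharing one. The names MultiGetSketch, RowFetcher, serverOf and fetcherFactory are all made up for illustration; in a real client the factory would presumably create an HTable for the worker, and serverOf would come from a region lookup.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;
import java.util.function.Supplier;

class MultiGetSketch {

    /** Hypothetical stand-in for a table handle that must not be shared across threads. */
    interface RowFetcher<K, V> {
        V get(K key);
    }

    /** Fetches all keys, at most maxThreads server buckets at a time, preserving request order. */
    static <K, V> List<V> multiGet(List<K> keys, int maxThreads,
                                   Function<K, String> serverOf,
                                   Supplier<RowFetcher<K, V>> fetcherFactory) {
        // Bucket the positions of the keys by the server that hosts them.
        Map<String, List<Integer>> buckets = new LinkedHashMap<>();
        for (int i = 0; i < keys.size(); i++) {
            buckets.computeIfAbsent(serverOf.apply(keys.get(i)), s -> new ArrayList<>()).add(i);
        }

        int threads = Math.max(1, Math.min(maxThreads, buckets.size()));
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // One task per server bucket; each task builds its own fetcher (assumption:
            // this mirrors "one HTable per thread", since a handle is never shared).
            Map<String, Future<List<V>>> pending = new LinkedHashMap<>();
            for (Map.Entry<String, List<Integer>> bucket : buckets.entrySet()) {
                List<Integer> positions = bucket.getValue();
                Callable<List<V>> task = () -> {
                    RowFetcher<K, V> fetcher = fetcherFactory.get();
                    List<V> values = new ArrayList<>();
                    for (int pos : positions) {
                        values.add(fetcher.get(keys.get(pos)));
                    }
                    return values;
                };
                pending.put(bucket.getKey(), pool.submit(task));
            }

            // Merge the per-bucket results back into the original request order.
            List<V> results = new ArrayList<>(Collections.<V>nCopies(keys.size(), null));
            for (Map.Entry<String, Future<List<V>>> done : pending.entrySet()) {
                List<Integer> positions = buckets.get(done.getKey());
                List<V> values = done.getValue().get();
                for (int j = 0; j < positions.size(); j++) {
                    results.set(positions.get(j), values.get(j));
                }
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for gets", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a get failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Toy data: the first character of the key decides the (made-up) server name.
        List<String> keys = Arrays.asList("a1", "b1", "a2", "c1");
        List<String> rows = MultiGetSketch.<String, String>multiGet(
                keys, 4, k -> "rs-" + k.charAt(0), () -> k -> "row:" + k);
        System.out.println(rows);
    }
}
```

Creating the fetcher inside the task is the point of the sketch: the caller passes a factory rather than a shared instance, so the "not thread-safe" assumption is never tested.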

> -----Original Message-----
> From: Michael Segel [mailto:[email protected]]
> Sent: Wednesday, August 25, 2010 3:54 AM
> To: [email protected]
> Subject: RE: Best way to get multiple non-sequential rows
> 
> 
> Jonathan,
> 
> Ok, that makes some sense...
> So you would have some method mget(fetchKeyList,numthreads) returning
> resultList[].
> 
> So what's thread safe these days?
> 
> -Mike
> 
> > From: [email protected]
> > To: [email protected]
> > Subject: RE: Best way to get multiple non-sequential rows
> > Date: Wed, 25 Aug 2010 03:52:38 +0000
> >
> > Michael,
> >
> > MultiGet is about performing a set of Get operations in parallel from
> > the client.  So it buys you potential performance benefits from the
> > concurrency/distribution of your operations.
> >
> > Roughly, you would bucket the gets according to their region and
> > regionserver.  Then spawn a thread for each RS and fire off the Gets
> > concurrently.
> >
> > If I have 100 Gets to perform on a random set of keys, assuming each
> > get takes 10ms, doing them sequentially will take 1 second.  Other
> > factors and RS concurrency aside, with MultiGet on a 10 node cluster,
> > the total time would be reduced to 100ms. With 50 nodes, 20ms.
> >
> > JG
> >
> >
> > > -----Original Message-----
> > > From: Michael Segel [mailto:[email protected]]
> > > Sent: Tuesday, August 24, 2010 7:53 PM
> > > To: [email protected]
> > > Subject: RE: Best way to get multiple non-sequential rows
> > >
> > >
> > > Igor,
> > >
> > > What does this really buy you?
> > >
> > > > I'm trying to figure out a use case that would show a benefit from
> > > > just fetching the rows individually. Since the rows are not
> > > > contiguous, the odds of the next row you want being in cache are
> > > > going to be slight to most likely not. ;-)
> > >
> > > Can you give a use case where having a 'multi-get' will make life
> > > easier?
> > >
> > > Thx
> > >
> > > -Mike
> > >
> > >
> > > > Date: Wed, 25 Aug 2010 07:17:13 +0600
> > > > Subject: Re: Best way to get multiple non-sequential rows
> > > > From: [email protected]
> > > > To: [email protected]
> > > >
> > > > Thanks Igor, I will have a look at it.
> > > >
> > > > /Imran
> > > >
> > > > On Tue, Aug 24, 2010 at 10:36 PM, Igor Ranitovic <[email protected]>
> > > > wrote:
> > > > > Take a look at
> > > > > https://issues.apache.org/jira/browse/HBASE-1845
> > > > >
> > > > > As an HBase user, multi gets is something that I have been looking
> > > > > forward to for some time now. If there is enough interest it would be
> > > > > great if this becomes part of 0.90.
> > > > >
> > > > > Take care,
> > > > > i.
> > > > >
> > > > > Imran M Yousuf wrote:
> > > > >>
> > > > >> Hi,
> > > > >>
> > > > >> I am using the HBase client API to interact with HBase. I have noticed
> > > > >> that HTableInterface has operations such as put(List<Put>),
> > > > >> delete(List<Delete>), but there is no similar method for Get. Using
> > > > >> scan it is possible to load a range of rows, i.e. sequential rows. My
> > > > >> question is -
> > > > >> how would it be most efficient to load N non-sequential rows?
> > > > >>
> > > > >> Currently I am using get(Get) method N times.
> > > > >>
> > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Imran M Yousuf
> > > > Blog: http://imyousuf-tech.blogs.smartitengineering.com/
> > > > Mobile: +880-1711402557
> > >
> 
