Michael,

MultiGet is about performing a set of Get operations in parallel from the 
client.  So it buys you potential performance benefits from the 
concurrency/distribution of your operations.

Roughly, you would bucket the gets according to their region and regionserver.  
Then spawn a thread for each RS and fire off the Gets concurrently.
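
To make the bucketing idea concrete, here is a rough sketch in plain Java. It is not the HBASE-1845 patch or the HBase client API. The names MultiGetSketch, bucketByServer and multiGet are made up for illustration, the "locator" function stands in for a region/regionserver lookup, and "singleGet" stands in for one blocking Get.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

// Hypothetical sketch: names and signatures are illustrative, not HBase API.
final class MultiGetSketch {

    // Group row keys by the server that hosts them.
    // 'locator' is an assumed stand-in for a region/regionserver lookup.
    static Map<String, List<String>> bucketByServer(
            List<String> rowKeys, Function<String, String> locator) {
        Map<String, List<String>> buckets = new HashMap<>();
        for (String key : rowKeys) {
            buckets.computeIfAbsent(locator.apply(key), s -> new ArrayList<>()).add(key);
        }
        return buckets;
    }

    // Run one task per server concurrently; within a task the gets run in order.
    // 'singleGet' is an assumed stand-in for one blocking Get call.
    static Map<String, String> multiGet(
            List<String> rowKeys,
            Function<String, String> locator,
            Function<String, String> singleGet) {
        Map<String, List<String>> buckets = bucketByServer(rowKeys, locator);
        Map<String, String> results = new HashMap<>();
        if (buckets.isEmpty()) {
            return results;
        }
        // One thread per server bucket.
        ExecutorService pool = Executors.newFixedThreadPool(buckets.size());
        try {
            List<CompletableFuture<Map<String, String>>> futures = new ArrayList<>();
            for (List<String> bucket : buckets.values()) {
                futures.add(CompletableFuture.supplyAsync(() -> {
                    Map<String, String> partial = new HashMap<>();
                    for (String key : bucket) {
                        partial.put(key, singleGet.apply(key));
                    }
                    return partial;
                }, pool));
            }
            // Wait for every bucket and merge the partial results.
            for (CompletableFuture<Map<String, String>> f : futures) {
                results.putAll(f.join());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

A caller would pass its own lookup and fetch functions, e.g. multiGet(keys, key -> serverFor(key), key -> fetch(key)), where serverFor and fetch are whatever the application already has.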

If I have 100 Gets to perform on a random set of keys, and each Get takes 
10ms, doing them sequentially will take 1 second.  Other factors and RS 
concurrency aside, and assuming the keys spread evenly across the servers, 
MultiGet on a 10-node cluster would reduce the total time to about 100ms. 
With 50 nodes, about 20ms.
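
The arithmetic behind those numbers, as a small Java sketch. It assumes the idealized case only: keys split evenly across servers, one thread per server, and a fixed per-Get latency. Real random keys will not split perfectly evenly, so treat it as a best case. The class and method names are made up for illustration.

```java
// Hypothetical back-of-the-envelope estimate; not a benchmark.
final class MultiGetEstimate {

    // Total time when every Get is issued one after another.
    static long sequentialMillis(int gets, long perGetMillis) {
        return gets * perGetMillis;
    }

    // Total time when the Gets are split evenly across servers and each
    // server's share runs sequentially on its own thread (assumed ideal case).
    static long parallelMillis(int gets, long perGetMillis, int servers) {
        int getsPerServer = (gets + servers - 1) / servers; // ceiling division
        return getsPerServer * perGetMillis;
    }
}
```

With 100 Gets at 10ms each, sequentialMillis gives 1000, parallelMillis with 10 servers gives 100, and with 50 servers gives 20.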

JG


> -----Original Message-----
> From: Michael Segel [mailto:[email protected]]
> Sent: Tuesday, August 24, 2010 7:53 PM
> To: [email protected]
> Subject: RE: Best way to get multiple non-sequential rows
> 
> 
> Igor,
> 
> What does this really buy you?
> 
> I'm trying to figure out a use case that would show a benefit over just
> fetching the rows individually. Since the rows are not contiguous, the
> odds of the next row you want being in cache are going to be slight to
> most likely not. ;-)
> 
> Can you give a use case where having a 'multi-get' will make life
> easier?
> 
> Thx
> 
> -Mike
> 
> 
> > Date: Wed, 25 Aug 2010 07:17:13 +0600
> > Subject: Re: Best way to get multiple non-sequential rows
> > From: [email protected]
> > To: [email protected]
> >
> > Thanks Igor, I will have a look at it.
> >
> > /Imran
> >
> > On Tue, Aug 24, 2010 at 10:36 PM, Igor Ranitovic <[email protected]>
> wrote:
> > > Take a look at
> > > https://issues.apache.org/jira/browse/HBASE-1845
> > >
> > > As an HBase user, multi gets is something that I have been looking
> forward
> > > to for some time now. If there is enough interest it would be great
> if this
> > > becomes part of 0.90.
> > >
> > > Take care,
> > > i.
> > >
> > > Imran M Yousuf wrote:
> > >>
> > >> Hi,
> > >>
> > >> I am using the HBase client API to interact with HBase. I have
> noticed
> > >> that HTableInterface has operations such as put(List<Put>),
> > >> delete(List<Delete>), but there is no similar method for Get.
> Using
> > >> scan it is possible to load a range of rows, i.e. sequential rows.
> My
> > >> question is -
> > >> how would it be most efficient to load N non-sequential rows?
> > >>
> > >> Currently I am using get(Get) method N times.
> > >>
> > >
> > >
> >
> >
> >
> > --
> > Imran M Yousuf
> > Blog: http://imyousuf-tech.blogs.smartitengineering.com/
> > Mobile: +880-1711402557
> 
