Thanks Rita for logging the JIRA.

Do you want to provide a patch?

On Sat, Oct 29, 2011 at 7:29 AM, Rita <[email protected]> wrote:

> Opened, https://issues.apache.org/jira/browse/HBASE-4702
>
>
> Please edit to your liking.
>
>
> On Sun, Oct 9, 2011 at 9:05 PM, Himanshu Vashishtha <[email protected]> wrote:
>
> > MapReduce support in HBase inherently provides parallelism such that
> > each Region is given to one mapper.
> >
> > Himanshu
> >
> > On Sun, Oct 9, 2011 at 6:44 PM, lars hofhansl <[email protected]> wrote:
> > > Be aware that the contract for a scan is to return all rows sorted by
> > > rowkey, hence it cannot scan regions in parallel by default. I have not
> > > used HBase with MapReduce much, but if order is not important you can
> > > split the scan into multiple scans.
> > >
> > >
> > > ----- Original Message -----
> > > From: Tom Goren <[email protected]>
> > > To: [email protected]
> > > Cc:
> > > Sent: Sunday, October 9, 2011 8:07 AM
> > > Subject: Re: speeding up rowcount
> > >
> > > lol - i just ran a rowcount via mapreduce, and it took 6 hours for 7.5
> > > million rows...
> > >
> > > On Sun, Oct 9, 2011 at 7:50 AM, Rita <[email protected]> wrote:
> > >
> > >> Hi,
> > >>
> > >> I have been doing a rowcount via MapReduce and it's taking about 4-5
> > >> hours to count 500 million rows in a table. I was wondering if there
> > >> are any MapReduce tunings I can do so it will go much faster.
> > >>
> > >> I have a 10-node cluster, each node with 8 CPUs and 64 GB of memory.
> > >> Any tuning advice would be much appreciated.
> > >>
> > >>
> > >> --
> > >> --- Get your facts first, then you can distort them as you please.--
> > >>
> > >
> > >
> >
>
>
>
> --
> --- Get your facts first, then you can distort them as you please.--
>
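
The suggestion quoted above, splitting one ordered scan into several independent scans when row order does not matter, can be sketched roughly as follows. This is only an illustration under loud assumptions: the "table" is an in-memory sorted list of row keys, and `scan_count`, `parallel_rowcount`, and `split_points` are hypothetical names, not HBase client API. The split points stand in for region boundaries.

```python
from bisect import bisect_left
from concurrent.futures import ThreadPoolExecutor


def scan_count(sorted_keys, start, stop):
    """Count rows in [start, stop), walking keys in order like a bounded scan.

    stop=None means "scan to the end of the table".
    """
    i = bisect_left(sorted_keys, start)  # seek to the start row
    count = 0
    while i < len(sorted_keys) and (stop is None or sorted_keys[i] < stop):
        count += 1
        i += 1
    return count


def parallel_rowcount(sorted_keys, split_points, workers=4):
    """Run one bounded scan per key range and sum the partial counts.

    split_points (hypothetical) must be sorted; they play the role of
    region boundaries. A count does not depend on row order, so the
    ranges can be scanned independently.
    """
    starts = [""] + list(split_points)    # "" sorts before every string key
    stops = list(split_points) + [None]   # last range is open-ended
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = pool.map(
            lambda bounds: scan_count(sorted_keys, bounds[0], bounds[1]),
            zip(starts, stops),
        )
        return sum(partial)


if __name__ == "__main__":
    keys = sorted(f"row{i:05d}" for i in range(1000))
    print(parallel_rowcount(keys, ["row00250", "row00500", "row00750"]))
```

Because the ranges are half-open and adjacent, every row falls in exactly one range, so the partial counts add up to the full row count. In a real cluster each range would be a separate scan with its own start and stop row.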
