This might help you: http://phoenix.incubator.apache.org/
JM

On 2014-04-14 07:53, "Li Li" <[email protected]> wrote:
> I need to get about 20,000 rows from the table. The table is about
> 1,000,000 rows. My first version used 20,000 Gets, and I found it very
> slow, so I changed it to a scan and filter out unrelated rows in the
> client. Maybe I should write a coprocessor. BTW, is there any filter
> available for me? Something like a SQL statement: where rowkey in
> ('abc', 'abd', ...) -- a very long IN statement.
>
> On Mon, Apr 14, 2014 at 7:46 PM, Jean-Marc Spaggiari
> <[email protected]> wrote:
> > Hi Li Li,
> >
> > If you have more than one region, it might be useful. MR will scan all
> > the regions in parallel. If you do a full scan from the client API
> > with no parallelism, then the MR job might be faster. But it will take
> > more resources on the cluster and might impact the SLA of the other
> > clients, if any.
> >
> > JM
> >
> > 2014-04-14 2:42 GMT-04:00 Mohammad Tariq <[email protected]>:
> >
> >> Well, it depends. Could you please provide some more details? It will
> >> help us give a proper answer.
> >>
> >> Warm Regards,
> >> Tariq
> >> cloudfront.blogspot.com
> >>
> >> On Mon, Apr 14, 2014 at 11:38 AM, Li Li <[email protected]> wrote:
> >>
> >> > I have a full table scan which costs about 10 minutes. It seems to
> >> > be a bottleneck for our application. If I rewrite it with
> >> > MapReduce, will it be faster?
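[Editor's note on the batched-Get point raised above: rather than issuing 20,000 individual Gets, the HBase client API accepts a list of Gets in a single call via `HTable.get(List<Get>)`; the client groups them by region server, which is typically far cheaper than one RPC per row. A minimal sketch, assuming a running cluster and the 0.94/0.98-era client API -- the table name and row keys here are hypothetical placeholders:]

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;

public class BatchGetSketch {
    public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "mytable"); // hypothetical table name

        // Build one Get per wanted row key instead of issuing them one by one.
        List<Get> gets = new ArrayList<Get>();
        for (String key : new String[] {"abc", "abd"}) { // placeholder keys
            gets.add(new Get(key.getBytes("UTF-8")));
        }

        // One batched call: the client sends the Gets grouped by region
        // server, avoiding a separate round trip per row.
        Result[] results = table.get(gets);
        for (Result r : results) {
            if (!r.isEmpty()) {
                // process the row here
            }
        }
        table.close();
    }
}
```

[This addresses the "20,000 Gets are slow" problem without a coprocessor or a client-side-filtered full scan; whether it beats the scan depends on how the wanted keys are distributed across regions.]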
