Re: ATTENTION: Feedback requested on the PARALLEL keyword

Dan Harvey Wed, 07 Jul 2010 06:32:54 -0700

This is fine for our use at Mendeley too. We currently set PARALLEL
manually most of the time, and it would be great to have it set
automatically in the cases where we do not. This will be of most benefit to
users who do not fully understand how MapReduce works but still use
Pig for various tasks.


On the way of setting it: is using the input size the best
option? Or might the number of reducers available in the cluster be worth
looking at too? That can also affect how a job performs on a given dataset
and cluster.
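
To make the question concrete, here is a rough Python sketch of what I
mean. It is not what PIG-1249 implements: the function name, the 1 GB per
reducer figure, the upper limit of 999 and the cluster_reduce_slots
parameter are all assumptions for illustration. It estimates the reducer
count from the input size and then, optionally, caps it at the number of
reduce slots the cluster has.

```python
# Hypothetical defaults, chosen only for illustration.
BYTES_PER_REDUCER = 1_000_000_000  # assume roughly 1 GB of input per reducer
MAX_REDUCERS = 999                 # assume a hard upper limit on reducers


def estimate_reducers(input_bytes,
                      cluster_reduce_slots=None,
                      bytes_per_reducer=BYTES_PER_REDUCER,
                      max_reducers=MAX_REDUCERS):
    """Estimate a reducer count when PARALLEL is not specified.

    input_bytes: total size of the job's input, in bytes.
    cluster_reduce_slots: reduce slots available in the cluster, or None
        to use the input size alone.
    """
    # Ceiling division on integers; always use at least one reducer.
    by_size = max(1, (input_bytes + bytes_per_reducer - 1) // bytes_per_reducer)
    estimate = min(by_size, max_reducers)
    if cluster_reduce_slots is not None:
        # Asking for more reducers than the cluster can run at once
        # means the extra ones wait for a second wave.
        estimate = min(estimate, max(1, cluster_reduce_slots))
    return estimate


if __name__ == "__main__":
    fifty_gb = 50 * 10**9
    print(estimate_reducers(fifty_gb))                           # size only
    print(estimate_reducers(fifty_gb, cluster_reduce_slots=20))  # capped by cluster
```

With the input size alone, a 50 GB input would ask for 50 reducers; on a
cluster with only 20 reduce slots the second call settles on 20 instead.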

Thanks,

On 2 July 2010 00:57, Dmitriy Ryaboy <[email protected]> wrote:

> We (Twitter) are fine with this change.
>
> On Thu, Jul 1, 2010 at 4:45 PM, Aravind Srinivasan <[email protected]> wrote:
>
> > Dear Pig Users,
> >
> > My name is Aravind Srinivasan and I am the Product Manager for Pig at Yahoo.
> > The Pig team would love to get your feedback on the proposal below.
> > Basically, we are trying to figure out whether this enhancement would break
> > backwards compatibility for your system and, if so, what your thoughts are on
> > the trade-off between the cost and the benefit. Please drop me an e-mail
> > ([email protected]) if you have an opinion on this.
> >
> > Summary:
> > Currently, if PARALLEL is not specified, the default value is 1, which most
> > of the time is not what users want and has caused some problems in the
> > clusters in the past. The proposal is to use a very basic heuristic based
> > on the input size to set a better value. This can be an issue for users who
> > expect just a single part file in the output.
> >
> > Jira for your reference:
> > https://issues.apache.org/jira/browse/PIG-1249
> >
> > Thanks,
> > Aravind
> >
> >
> >
> >
>

-- 
Dan Harvey | Datamining Engineer
www.mendeley.com/profiles/dan-harvey

Mendeley Limited | London, UK | www.mendeley.com
Registered in England and Wales | Company Number 6419015
