[ 
https://issues.apache.org/jira/browse/PIG-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13224953#comment-13224953
 ] 

Travis Crawford commented on PIG-2574:
--------------------------------------

This is somewhat related to PIG-2573. In that bug I'm interested in making the 
default size-based estimator work with HCatalog. In this bug Bill is interested 
in using additional data to estimate the number of reducers.

For example, if you want to produce counts for some field in a large data set, 
a single reducer may be enough. However, the size-based estimator scales the 
reducer count with input size, so a large input gets a large number of reducers 
anyway. A plugin could look at historical stats for this job and choose an 
appropriate number of reducers.
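
To make the idea concrete, here is a rough sketch of what such a plugin could 
look like. Everything in it is hypothetical: the ReducerEstimator interface, 
its estimate(jobKey, inputBytes) signature, and both implementations are 
illustrative names only, not Pig's API, and the sketch deliberately avoids 
Configuration and POLoad so it stands alone. It assumes the size-based 
approach amounts to one reducer per fixed number of input bytes, up to a cap.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical plugin contract (an assumption, not Pig's actual interface). */
interface ReducerEstimator {
    /** Returns the number of reducers to use for the given job and input size. */
    int estimate(String jobKey, long inputBytes);
}

/** Size-based estimate: one reducer per bytesPerReducer of input, capped at maxReducers. */
final class InputSizeEstimator implements ReducerEstimator {
    private final long bytesPerReducer;
    private final int maxReducers;

    InputSizeEstimator(long bytesPerReducer, int maxReducers) {
        this.bytesPerReducer = bytesPerReducer;
        this.maxReducers = maxReducers;
    }

    @Override
    public int estimate(String jobKey, long inputBytes) {
        // Ceiling division, then clamp the result to [1, maxReducers].
        long needed = (inputBytes + bytesPerReducer - 1) / bytesPerReducer;
        return (int) Math.max(1, Math.min(maxReducers, needed));
    }
}

/** Prefers a reducer count recorded for this job; otherwise defers to a fallback. */
final class HistoricalEstimator implements ReducerEstimator {
    private final Map<String, Integer> history = new HashMap<>();
    private final ReducerEstimator fallback;

    HistoricalEstimator(ReducerEstimator fallback) {
        this.fallback = fallback;
    }

    /** Records the reducer count that worked well for a job (hypothetical stats source). */
    void record(String jobKey, int reducers) {
        history.put(jobKey, reducers);
    }

    @Override
    public int estimate(String jobKey, long inputBytes) {
        Integer known = history.get(jobKey);
        return known != null ? known : fallback.estimate(jobKey, inputBytes);
    }
}
```

With 1 GB per reducer, a 50 GB input gets 50 reducers from InputSizeEstimator, 
while a HistoricalEstimator that has recorded 1 for the same job returns 1 and 
only falls back to the size-based value for jobs it has never seen.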
                
> Make reducer estimator pluggable
> --------------------------------
>
>                 Key: PIG-2574
>                 URL: https://issues.apache.org/jira/browse/PIG-2574
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Bill Graham
>            Assignee: Bill Graham
>
> I'd like to refactor the logic contained in this method into a pluggable 
> interface:
> {noformat}
> static int JobControlCompiler.estimateNumberOfReducers(Configuration conf, 
> List<POLoad> lds);
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
