[ 
https://issues.apache.org/jira/browse/PIG-3346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheolsoo Park updated PIG-3346:
-------------------------------

    Attachment: PIG-3346.patch

The attached patch includes the following changes:
* Adds a new property {{pig.maxCombinedSplitNum}}. By default, it is set to 
Long.MAX_VALUE.
* Updates the logic of {{MapRedUtil.getCombinePigSplits()}} to take the number 
of combined splits into account.
* Adds a new test case to {{TestSplitCombine}}.
* Updates the documentation to describe the new property.
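
To illustrate the idea behind the second change, here is a minimal sketch of combining splits under both a size cap and a count cap. It is not the code in the patch: the class {{SplitCombineSketch}}, the method {{combine}}, and its parameters are hypothetical names, and the real {{MapRedUtil.getCombinePigSplits()}} also deals with concerns (such as data locality) that are ignored here. The sketch assumes the new property limits how many original splits may be packed into one combined split.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the patch: greedy packing of splits with two caps.
public class SplitCombineSketch {

    /**
     * Packs split sizes (in bytes) into combined groups, in input order.
     * A group is closed when adding the next split would exceed
     * maxCombinedSize, or when the group already holds maxCombinedNum splits.
     * A single split larger than maxCombinedSize still gets its own group.
     */
    static List<List<Long>> combine(List<Long> splitSizes,
                                    long maxCombinedSize,
                                    long maxCombinedNum) {
        List<List<Long>> combined = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        long currentSize = 0;
        for (long size : splitSizes) {
            boolean sizeExceeded = currentSize + size > maxCombinedSize;
            boolean countReached = current.size() >= maxCombinedNum;
            if (!current.isEmpty() && (sizeExceeded || countReached)) {
                combined.add(current);
                current = new ArrayList<>();
                currentSize = 0;
            }
            current.add(size);
            currentSize += size;
        }
        if (!current.isEmpty()) {
            combined.add(current);
        }
        return combined;
    }

    public static void main(String[] args) {
        List<Long> sizes = List.of(10L, 10L, 10L, 10L, 10L);
        // With the count cap at Long.MAX_VALUE, only the size cap applies.
        System.out.println(combine(sizes, 100, Long.MAX_VALUE).size());
        // With a count cap of 2, no group holds more than two splits.
        System.out.println(combine(sizes, 100, 2).size());
    }
}
```

With a default of Long.MAX_VALUE for the count cap, the sketch behaves as if only the size cap existed, which matches the intent of keeping existing behavior unchanged unless the property is set.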

Tests run:
* ant test-commit
* ant test -Dtestcase=TestSplitCombine

Thanks!
                
> New property that controls the number of combined splits
> --------------------------------------------------------
>
>                 Key: PIG-3346
>                 URL: https://issues.apache.org/jira/browse/PIG-3346
>             Project: Pig
>          Issue Type: Improvement
>          Components: impl
>            Reporter: Cheolsoo Park
>            Assignee: Cheolsoo Park
>             Fix For: 0.12
>
>         Attachments: PIG-3346.patch
>
>
> Currently, the size of combined splits can be configured by the 
> {{pig.maxCombinedSplitSize}} property.
> Although this works fine most of the time, it can lead to an undesired 
> situation where a single mapper ends up loading a lot of combined splits. 
> This is particularly bad if Pig uploads them from S3.
> So it would be useful if the max number of combined splits could be 
> configured via a property such as {{pig.maxCombinedSplitNum}}.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
