[ 
https://issues.apache.org/jira/browse/PIG-3170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Clément Stenac updated PIG-3170:
--------------------------------

    Attachment: pig-staticreferences-to-context.diff
    
> Pig keeps static references to Hadoop's Context after end of task
> -----------------------------------------------------------------
>
>                 Key: PIG-3170
>                 URL: https://issues.apache.org/jira/browse/PIG-3170
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.10.0
>            Reporter: Clément Stenac
>            Priority: Minor
>         Attachments: pig-staticreferences-to-context.diff
>
>
> Through the PigStatusReporter, and the ProgressableReporter, when a Pig MR 
> task is done, static references are kept to Hadoop's Context object.
> Additionally, the PigCombiner also keeps a static reference, apparently 
> without using it.
> When the JVM is reused between MR tasks, it can cause large memory 
> overconsumption, with a peak during the creation of the next task, because 
> while MR is creating the next task (in MapTask.<init> for example), we have 
> both contexts (with  their associated buffers) allocated at once.
> This problem is especially important when using a Combiner, because the 
> ReduceContext of a Combiner contains references to large sort buffers.
> The specifics of our case were:
> * 20 GB input data, divided in 85 map tasks
> * Very simple Pig script: LOAD A, FILTER A, GROUP A, FOREACH group generate 
> MAX(field), STORE  
> * MapR distribution, which automatically computes Xmx for mappers at 800MB
> * At the end of the first task, the ReduceContext contains more than 400MB of 
> byte[]
> * Systematic OOM in MapTask.<init> on subsequent VM reuse
> * At least -Xmx1200m was required to get the job to complete
> * With attached patch, -Xmx600m is enough
> While a workaround by increasing Xmx is possible, I think the large 
> overconsumption and the complexity of debugging the issue (because the OOM 
> actually happens at the very beginning of the task, before the first byte of 
> data has been processed) warrants fixing it.
> The attached patch makes sure that PigStatusReporter and ProgressableReporter 
> drop their reference to the Context in the cleanup phases of the task.
> No new test is included because I don't really think it's possible to write a 
> unit test, the issue being not "binary"

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to