[ 
https://issues.apache.org/jira/browse/PIG-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15213708#comment-15213708
 ] 

Rohini Palaniswamy commented on PIG-4844:
-----------------------------------------

Changes done:
   - pig.pigContext takes up a lot of space in the payload because it contains 
the entire config. Ship only what is necessary (local mode and log4j properties).
   - Automatically increase AM memory if the number of vertices is > 30 or the 
number of outputs per vertex is > 10.
   - Force inputs to be fetched before outputs are started, so that more space 
can be allocated for buffers by setting 
tez.task.scale.memory.input-output-concurrent=false, which is a new option in 
Tez.
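
For illustration, the AM memory condition in the second item could be sketched 
as below. This is a minimal sketch, not the code in the patch: the class and 
method names are hypothetical, and only the two thresholds (30 vertices, 10 
outputs per vertex) come from the comment above. How much extra memory is 
requested is not stated here, so the sketch only decides whether an increase 
applies.

```java
// Sketch only: AmMemoryHeuristic and needsMoreAmMemory are hypothetical names,
// not taken from the Pig source or the PIG-4844 patch.
final class AmMemoryHeuristic {
    // Thresholds as stated in the comment on PIG-4844.
    static final int MAX_VERTICES = 30;
    static final int MAX_OUTPUTS_PER_VERTEX = 10;

    /**
     * Returns true when the DAG has more than 30 vertices or when some
     * vertex has more than 10 outputs.
     */
    static boolean needsMoreAmMemory(int numVertices, int maxOutputsPerVertex) {
        return numVertices > MAX_VERTICES
                || maxOutputsPerVertex > MAX_OUTPUTS_PER_VERTEX;
    }

    public static void main(String[] args) {
        // A small DAG in which one vertex has many outputs (the multi-query case).
        System.out.println(needsMoreAmMemory(12, 14));
        // Exactly at both thresholds: neither condition is exceeded.
        System.out.println(needsMoreAmMemory(30, 10));
    }
}
```

Both comparisons are strict, matching the "> 30" and "> 10" wording, so a DAG 
sitting exactly at the thresholds is left unchanged.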

> Tez AM runs out of memory when vertex has high number of outputs
> ----------------------------------------------------------------
>
>                 Key: PIG-4844
>                 URL: https://issues.apache.org/jira/browse/PIG-4844
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Rohini Palaniswamy
>             Fix For: 0.16.0
>
>
>   The AM runs out of memory when responding to getTask() calls from 
> containers for a vertex with a large number of outputs (usually the case with 
> multi-query when you group by on multiple dimensions). The problem is the 
> size of the payload config associated with PigProcessor, Input and Output. 
> When there are more than 10 outputs, the payload size increases considerably, 
> causing memory pressure.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
