[ https://issues.apache.org/jira/browse/PIG-4043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14047234#comment-14047234 ]
Rohini Palaniswamy commented on PIG-4043:
-----------------------------------------

In that case we can keep this property. In this patch, I only see shims/src/hadoop20 HadoopShims.java modified and not hadoop23. You seem to have missed including it in the patch. Can you also modify one of the existing MiniCluster unit tests and pass this property?

I will create a separate jira later to avoid the two arrays created by PIG-3913. The two arrays could cause more jobs to fail that were passing in 0.12, and we have to get rid of them.

> JobClient.getMap/ReduceTaskReports() causes OOM for jobs with a large number
> of tasks
> -------------------------------------------------------------------------------------
>
>                 Key: PIG-4043
>                 URL: https://issues.apache.org/jira/browse/PIG-4043
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Cheolsoo Park
>            Assignee: Cheolsoo Park
>             Fix For: 0.14.0
>
>         Attachments: PIG-4043-1.patch, PIG-4043-2.patch, heapdump.png
>
>
> With Hadoop 2.4, I often see the Pig client fail due to OOM when there are many
> tasks (~100K) with 1GB heap size.
> The heap dump (attached) shows that TaskReport[] occupies about 80% of heap
> space at the time of OOM.
> The problem is that JobClient.getMap/ReduceTaskReports() returns an array of
> TaskReport objects, which can be huge if the number of tasks is large.

--
This message was sent by Atlassian JIRA
(v6.2#6252)