[ 
https://issues.apache.org/jira/browse/PIG-5384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Noguchi updated PIG-5384:
------------------------------
    Attachment: pig-5384-v01-halfway.patch

Attaching {{pig-5384-v01-halfway.patch}} to give you an idea of the approach.  
So far I have only changed DefaultDataBag; if I take this route, I will need 
to make similar changes to the other bags.

For handling spill failures, calling System.exit() is the most reliable 
option, but I think setting mContents to null would also make the reader fail 
reliably (unless users have a custom Bag that does something very unusual).
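
To illustrate the second option, here is a minimal sketch. It is not taken from the attached patch and it is not Pig's DefaultDataBag: the class SketchBag, the SpillWriter hook, and the choice of IllegalStateException are all hypothetical, and reading spilled data back is left out. Only the field name mContents comes from the comment above. The point is just that a failed spill discards the in-memory contents, so a later read fails loudly instead of returning partial data.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration only; names below are assumptions, not Pig APIs.
final class SpillFailureSketch {

    /** Hypothetical hook so the example can simulate a failing spill. */
    interface SpillWriter {
        void write(List<String> contents) throws IOException;
    }

    /** Simplified stand-in for a spillable bag holding String "tuples". */
    static final class SketchBag implements Iterable<String> {
        // null means "a spill failed and the contents can no longer be trusted".
        private List<String> mContents = new ArrayList<>();
        private final SpillWriter writer;

        SketchBag(SpillWriter writer) {
            this.writer = writer;
        }

        void add(String tuple) {
            if (mContents == null) {
                throw new IllegalStateException("bag is unusable after a failed spill");
            }
            mContents.add(tuple);
        }

        /** Returns the number of tuples spilled, or 0 if the spill failed. */
        long spill() {
            List<String> toSpill = mContents;
            if (toSpill == null) {
                return 0;
            }
            try {
                writer.write(toSpill);
                long spilled = toSpill.size();
                // Success: release the in-memory copy and start a fresh list.
                // (Tracking the spill file for later reads is omitted here.)
                mContents = new ArrayList<>();
                return spilled;
            } catch (IOException e) {
                // Failure: poison the bag so any reader fails instead of
                // silently seeing incomplete data.
                mContents = null;
                return 0;
            }
        }

        @Override
        public Iterator<String> iterator() {
            if (mContents == null) {
                throw new IllegalStateException("bag contents lost: spill failed");
            }
            return mContents.iterator();
        }
    }
}
```

In this sketch the failure is raised explicitly in iterator(). Simply leaving the field null and letting the reader hit a NullPointerException would have a similar effect, which is closer to what the comment above describes.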

> OOM while spilling large bag 
> -----------------------------
>
>                 Key: PIG-5384
>                 URL: https://issues.apache.org/jira/browse/PIG-5384
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Koji Noguchi
>            Assignee: Koji Noguchi
>            Priority: Major
>         Attachments: pig-5384-v01-halfway.patch
>
>
> One of the common OOM issues in Pig is hitting OOM while trying to spill 
> a large bag. The current solution is to increase the heap size or tweak 
> {noformat}
> pig.spill.memory.usage.threshold.fraction
> pig.spill.collection.threshold.fraction
> pig.spill.unused.memory.threshold.size
> {noformat}
> and make sure spilling starts early enough.  These params are still critical, 
> but I am wondering whether any improvement can be made to increase the 
> chances of avoiding OOM while spilling a single large bag.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
