[
https://issues.apache.org/jira/browse/PIG-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15819428#comment-15819428
]
Rohini Palaniswamy edited comment on PIG-4260 at 1/11/17 10:47 PM:
-------------------------------------------------------------------
Ran into this when a user had the code below in a UDF.
{code}
// Arrays.asList returns java.util.Arrays$ArrayList, a fixed-size list that is
// a different class from java.util.ArrayList
List<Tuple> data = Arrays.asList(result.getSamples());
DataBag sampleBag = BagFactory.getInstance().newDefaultBag(data);
{code}
Arrays.asList returns a fixed-size list (java.util.Arrays$ArrayList) that does
not support clear(); calling it on a non-empty list throws
UnsupportedOperationException. So mContents.clear() throws when the bag is
spilled. Since we swallow that error, the next time the bag is accessed it has
twice the records: once in mContents and once in the spill file.
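The list behavior described above can be reproduced without Pig. The following is a minimal sketch using only the JDK; the class and method names (FixedSizeListDemo, clearIsUnsupported) are made up for illustration and String stands in for Tuple. It also shows one possible workaround on the UDF side, which is to copy the data into a java.util.ArrayList before handing it to the bag.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of Pig.
class FixedSizeListDemo {

    /** Returns true if the given list rejects clear(). */
    static boolean clearIsUnsupported(List<String> list) {
        try {
            list.clear();
            return false;
        } catch (UnsupportedOperationException e) {
            // The fixed-size list returned by Arrays.asList ends up here
            // when it is non-empty.
            return true;
        }
    }

    public static void main(String[] args) {
        String[] samples = {"a", "b", "c"};

        // Fixed-size view backed by the array.
        List<String> fixedSize = Arrays.asList(samples);

        // Independent, resizable copy; clear() works on this one.
        List<String> copy = new ArrayList<>(Arrays.asList(samples));

        System.out.println(fixedSize.getClass().getName());
        System.out.println("fixed-size rejects clear(): " + clearIsUnsupported(fixedSize));
        System.out.println("copy rejects clear(): " + clearIsUnsupported(copy));
    }
}
```

In the UDF above, the equivalent change would be to wrap the Arrays.asList call in new ArrayList<>(...) before calling newDefaultBag.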
With this patch, if any error occurs during a spill, we revert the spill. The
bag then either ends up working fine or the job fails with an OOM.
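To illustrate the idea of reverting a failed spill, here is a rough sketch. It is not the code from PIG-4260-1.patch and does not use Pig's classes: RevertingSpillBag, its fields, and its methods are all hypothetical, and records are plain strings. The assumption is that a spill is only registered once the in-memory list has been cleared successfully; on any error the spill file is deleted so the records exist in exactly one place.

```java
import java.io.BufferedWriter;
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for a spillable bag.
class RevertingSpillBag {
    // In-memory records (plays the role of mContents in the comment above).
    private final List<String> contents;
    // Spill files that were written AND whose records were removed from memory.
    private final List<File> spillFiles = new ArrayList<>();

    RevertingSpillBag(List<String> contents) {
        this.contents = contents;
    }

    /** Returns the number of records spilled, or 0 if the spill was reverted. */
    long spill() {
        if (contents.isEmpty()) {
            return 0;
        }
        File spillFile = null;
        try {
            spillFile = File.createTempFile("bag", ".spill");
            try (BufferedWriter out = Files.newBufferedWriter(spillFile.toPath())) {
                for (String record : contents) {
                    out.write(record);
                    out.newLine();
                }
            }
            long spilled = contents.size();
            contents.clear();          // may throw for a fixed-size list
            spillFiles.add(spillFile); // register only after memory is released
            return spilled;
        } catch (IOException | RuntimeException e) {
            // Revert: drop the spill file so the records are not counted twice.
            if (spillFile != null) {
                spillFile.delete();
            }
            return 0;
        }
    }

    /** Total number of records, in memory plus in registered spill files. */
    long size() {
        long n = contents.size();
        for (File f : spillFiles) {
            try {
                n += Files.readAllLines(f.toPath()).size();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return n;
    }
}
```

With a list from Arrays.asList, spill() in this sketch returns 0 and size() still reports the original count, so nothing is duplicated; the memory is simply not freed, which matches the "works fine or hits OOM" outcome described above.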
> SpillableMemoryManager.spill eats Exception
> -------------------------------------------
>
> Key: PIG-4260
> URL: https://issues.apache.org/jira/browse/PIG-4260
> Project: Pig
> Issue Type: Bug
> Components: impl
> Reporter: Daniel Dai
> Assignee: Rohini Palaniswamy
> Fix For: 0.17.0, 0.16.1
>
> Attachments: PIG-4260-1.patch
>
>
> Found by Rohini when working on PIG-4250.
> bq. If there is an exception during spill() called by SpillableMemoryManager,
> it is simply ignored. We do not track that there was an exception during
> spill and throw it back when the bag is next accessed.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)