Daniel Dai updated PIG-861:

    Attachment: PIG-861-1.patch

The problem is caused by a bug in BinStorage.java, which erroneously interprets 
the character \255 in the binary stream as EOF. I tested the patch on the 
original queries and it fixes the problem. No unit test is included since this 
patch does not introduce any new feature.
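
For readers unfamiliar with this class of bug, here is a minimal sketch of how 
a data byte can be mistaken for EOF. It is not the actual BinStorage code or 
the patch. It assumes \255 means the byte value 255 (0xFF), and the class and 
method names (EofByteDemo, countBytesBuggy, countBytesFixed) are hypothetical. 
InputStream.read() returns an int in the range 0..255, or -1 at end of stream. 
Narrowing that int to a byte before the EOF check makes 0xFF equal to -1.

```java
import java.io.ByteArrayInputStream;

public class EofByteDemo {

    // Buggy pattern: the result of read() is narrowed to byte before the
    // EOF check, so a data byte 0xFF becomes (byte) -1 and looks like EOF.
    static int countBytesBuggy(byte[] data) {
        ByteArrayInputStream in = new ByteArrayInputStream(data);
        int count = 0;
        while (true) {
            byte b = (byte) in.read();
            if (b == -1) {
                return count; // also triggered by a real 0xFF in the data
            }
            count++;
        }
    }

    // Safer pattern: keep the int returned by read(), so only a true
    // end of stream (-1) stops the loop and 0xFF stays 255.
    static int countBytesFixed(byte[] data) {
        ByteArrayInputStream in = new ByteArrayInputStream(data);
        int count = 0;
        while (true) {
            int b = in.read();
            if (b == -1) {
                return count;
            }
            count++;
        }
    }

    public static void main(String[] args) {
        byte[] data = {1, 2, (byte) 0xFF, 3, 4};
        System.out.println("buggy: " + countBytesBuggy(data)); // stops at 0xFF
        System.out.println("fixed: " + countBytesFixed(data)); // reads all bytes
    }
}
```

With the five-byte input above, the buggy reader stops after two bytes while 
the fixed one reads all five. This would also be consistent with the problem 
showing up only on large inputs, where a 0xFF byte at the position being 
checked is more likely to occur.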

> POJoinPackage lose tuple in large dataset
> -----------------------------------------
>                 Key: PIG-861
>                 URL: https://issues.apache.org/jira/browse/PIG-861
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.2.0
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>             Fix For: 0.4.0
>         Attachments: PIG-861-1.patch
> Some scripts using POJoinPackage lose records when processing a large 
> amount of input data. We do not see this problem with smaller inputs. We 
> can reproduce the problem; however, the dataset for the test case is too 
> big to be included here. We suspect that POJoinPackage causes the problem.

