[
https://issues.apache.org/jira/browse/PIG-2614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243921#comment-13243921
]
Russell Jurney commented on PIG-2614:
-------------------------------------
I will write unit tests, but I'm having trouble getting the patch to work in
my actual error case. There, once one record is bad, every subsequent record
comes back bad as well. It looks like the read offset gets thrown off after
the first failure.
I'll make unit tests, though!
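To illustrate the offset problem in isolation, here is a minimal sketch. It
does NOT use Avro or AvroStorage: the length-prefixed framing, the
'key=value' payload format, and the names write_records, decode, and
read_tolerant are all made up for this example. The only point is that a
reader can skip a bad record safely if it consumes the whole frame before
trying to decode it, so a decode failure leaves the stream positioned at the
next record.

```python
import io
import struct


def write_records(records):
    """Frame each payload as a 4-byte big-endian length plus the raw bytes.

    Toy framing for illustration only; this is not Avro's file format.
    """
    buf = io.BytesIO()
    for payload in records:
        buf.write(struct.pack(">I", len(payload)))
        buf.write(payload)
    return buf.getvalue()


def decode(payload):
    """Hypothetical decoder: expects UTF-8 text of the form 'key=value'."""
    text = payload.decode("utf-8")
    key, value = text.split("=", 1)  # raises ValueError if '=' is missing
    return key, value


def read_tolerant(data):
    """Return (decoded_records, bad_count), skipping undecodable payloads."""
    stream = io.BytesIO(data)
    good, bad = [], 0
    while True:
        header = stream.read(4)
        if len(header) < 4:
            break  # end of input
        (length,) = struct.unpack(">I", header)
        # Consume the entire frame before decoding, so a failure below
        # cannot leave the stream positioned in the middle of a record.
        payload = stream.read(length)
        try:
            good.append(decode(payload))
        except ValueError:  # also covers UnicodeDecodeError
            bad += 1
    return good, bad


if __name__ == "__main__":
    data = write_records([b"a=1", b"broken", b"b=2"])
    records, bad_count = read_tolerant(data)
    print(records, bad_count)
```

If the decoder instead read field by field straight from the stream and
failed partway through a record, the next read would start mid-record, which
would match the "all subsequent records are bad" symptom. Whether that is
what actually happens in my case is still a guess.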
> AvroStorage crashes on LOADING a single bad error
> -------------------------------------------------
>
> Key: PIG-2614
> URL: https://issues.apache.org/jira/browse/PIG-2614
> Project: Pig
> Issue Type: Bug
> Components: piggybank
> Affects Versions: 0.10, 0.11
> Reporter: Russell Jurney
> Labels: avro, avrostorage, bad, book, cutting, doug, for, my,
> pig, sadism
> Fix For: 0.10, 0.11
>
> Attachments: PIG-2614_0.patch, PIG-2614_1.patch
>
>
> AvroStorage dies when a single bad record exists, such as one with missing
> fields. This is very bad on 'big data,' where bad records are inevitable.
> See discussion at
> http://www.quora.com/Big-Data/In-Big-Data-ETL-how-many-records-are-an-acceptable-loss
> for more theory.