[ https://issues.apache.org/jira/browse/PIG-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12846038#action_12846038 ]
Pradeep Kamath commented on PIG-1257:
-------------------------------------
In the following case in inputData, the record will end with \r, won't it?
(Notice the \r in the middle, after the 2.)
{code}
"1\t2\r3\t4", // '\r' case - this will be split into two tuples
{code}
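To make the question concrete, here is a minimal standalone sketch of how a reader that treats a lone \r as a line terminator would handle that input. It uses only java.io.BufferedReader, which treats \n, \r, and \r\n as terminators; it is not the PigStorage or TextInputFormat code path, and the class and method names (CarriageReturnSplit, splitRecords) are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: not part of Pig or Hadoop.
public class CarriageReturnSplit {

    // Splits raw input into records, treating \n, \r, and \r\n as record
    // terminators (the documented behaviour of BufferedReader.readLine()).
    static List<String> splitRecords(String data) throws IOException {
        List<String> records = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(data))) {
            String line;
            while ((line = reader.readLine()) != null) {
                records.add(line); // terminator is consumed, not kept in the record
            }
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        // Same string as the test input quoted above.
        for (String record : splitRecords("1\t2\r3\t4")) {
            // Split each record into fields on tab, as a tab-delimited loader would.
            System.out.println(String.join(",", record.split("\t")));
        }
        // prints:
        // 1,2
        // 3,4
    }
}
```

Under that assumption the \r is consumed as a terminator, so the input yields two records and neither contains the \r. Whether the reader used in the test behaves the same way is exactly what the question above is asking.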
> PigStorage per the new load-store redesign should support splitting of bzip files
> ---------------------------------------------------------------------------------
>
> Key: PIG-1257
> URL: https://issues.apache.org/jira/browse/PIG-1257
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.7.0
> Reporter: Pradeep Kamath
> Assignee: Pradeep Kamath
> Fix For: 0.7.0
>
> Attachments: blockEndingInCR.txt.bz2,
> blockHeaderEndsAt136500.txt.bz2, PIG-1257-2.patch, PIG-1257-3.patch,
> PIG-1257.patch, recordLossblockHeaderEndsAt136500.txt.bz2
>
>
> PigStorage, as implemented per the new load-store redesign (PIG-966), is based on
> TextInputFormat for reading data. TextInputFormat supports reading bzip data
> but does not support splitting bzip files. In Pig 0.6, splitting was enabled
> for bzip files; we should attempt to enable that feature again.