[
https://issues.apache.org/jira/browse/PIG-4242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Geza Radics updated PIG-4242:
-----------------------------
Description:
XMLLoader finds the position of the first match for the required tag, but then
applies that offset to every following line as well, until the closing tag. This
causes content loss in indented XML with multiline content, such as
Wikipedia:
--- example input ---
{code:xml}
<page>You have
not missed it</page>
{code}
--- output ---
{code:xml}
<page>You have missed it</page>
{code}
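The behaviour described above can be modelled with a minimal sketch. This is not the actual XMLLoader code: the class name OffsetSketch, both method names, and the space used to join lines are hypothetical. The four-space indent before <page> in the sample input is also an assumption, chosen because it reproduces the reported output.

```java
class OffsetSketch {

    // Models the reported bug: the offset of the opening tag on the first
    // matching line is reused for every following line until the closing tag.
    public static String extractWithSharedOffset(String[] lines, String tag) {
        return extract(lines, tag, true);
    }

    // Models the expected behaviour: the offset applies to the first line only.
    public static String extractWithPerLineOffset(String[] lines, String tag) {
        return extract(lines, tag, false);
    }

    private static String extract(String[] lines, String tag, boolean reuseOffset) {
        String open = "<" + tag + ">";
        String close = "</" + tag + ">";
        StringBuilder out = new StringBuilder();
        int offset = -1; // -1 means the opening tag has not been seen yet

        for (String line : lines) {
            String piece;
            if (offset < 0) {
                offset = line.indexOf(open);
                if (offset < 0) {
                    continue; // still before the opening tag
                }
                piece = line.substring(offset);
            } else if (reuseOffset) {
                // Buggy path: drops the first `offset` characters of this line.
                piece = line.substring(Math.min(offset, line.length()));
            } else {
                piece = line;
            }

            if (out.length() > 0) {
                out.append(' '); // assumption: lines are joined with one space
            }
            int end = piece.indexOf(close);
            if (end >= 0) {
                out.append(piece, 0, end + close.length());
                return out.toString();
            }
            out.append(piece);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Assumed input: the opening tag is indented by four spaces.
        String[] lines = { "    <page>You have", "not missed it</page>" };
        System.out.println(extractWithSharedOffset(lines, "page"));
        System.out.println(extractWithPerLineOffset(lines, "page"));
    }
}
```

Under these assumptions, the shared-offset variant drops "not " from the second line, while the per-line variant keeps the full content.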
was:
XMLLoader finds the first matching position for the required tag, but applies
this offset for all following lines as well until the closing tag. This causes
content losses for indented xml formats with multiline contents such as
wikipedia:
--- example input ------
<page>You have
not missed it</page>
--- output -------------
<page>You have missed it</page>
> For indented XML with multiline content (e.g. Wikipedia) XMLLoader cuts out
> the beginning of every line
> -------------------------------------------------------------------------------------------------------
>
> Key: PIG-4242
> URL: https://issues.apache.org/jira/browse/PIG-4242
> Project: Pig
> Issue Type: Bug
> Components: piggybank
> Reporter: Geza Radics
> Attachments: XMLLoaderMissingContent.patch
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)