Geza Radics created PIG-4242:
--------------------------------

             Summary: For indented XML with multiline content (e.g. Wikipedia), 
XMLLoader cuts off the beginning of every line
                 Key: PIG-4242
                 URL: https://issues.apache.org/jira/browse/PIG-4242
             Project: Pig
          Issue Type: Bug
          Components: piggybank
            Reporter: Geza Radics


XMLLoader finds the position of the first match for the required tag, but then 
applies that same offset to every following line until the closing tag. This 
causes content loss for indented XML with multiline content, such as 
Wikipedia dumps:

--- example input ------
    <page>You have 
not missed it</page>

--- output -------------
<page>You have missed it</page>
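
The behaviour can be reproduced outside Pig with a small sketch. This is not 
the XMLLoader source; the function names extract_buggy and extract_fixed are 
hypothetical, and lines are joined without a separator only to mirror the 
output shown above. The first function reuses the start-tag offset on every 
line, as described in this report; the second applies it to the first line only.

```python
def extract_buggy(lines, tag):
    """Mimics the reported behaviour: the column of the opening tag is
    reused as a cut-off for every following line of the element."""
    start, end = f"<{tag}>", f"</{tag}>"
    out = []
    offset = None
    for line in lines:
        if offset is None:
            pos = line.find(start)
            if pos < 0:
                continue  # opening tag not seen yet
            offset = pos
        out.append(line[offset:])  # offset wrongly applied to every line
        if end in line:
            break
    return "".join(out)


def extract_fixed(lines, tag):
    """Cuts only the line that contains the opening tag; continuation
    lines are kept whole. Text after the closing tag is not trimmed
    in this simplified sketch."""
    start, end = f"<{tag}>", f"</{tag}>"
    out = []
    inside = False
    for line in lines:
        if not inside:
            pos = line.find(start)
            if pos < 0:
                continue  # opening tag not seen yet
            inside = True
            line = line[pos:]  # strip indentation before the tag once
        out.append(line)
        if end in line:
            break
    return "".join(out)


example = ["    <page>You have ", "not missed it</page>"]
print(extract_buggy(example, "page"))
print(extract_fixed(example, "page"))
```

With the example input, the first call drops the leading "not " from the 
second line, while the second call keeps it.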




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
