[
https://issues.apache.org/jira/browse/PIG-1839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vivek Padmanabhan updated PIG-1839:
-----------------------------------
Attachment: PIG-1839-1.patch
Attaching the initial patch.
Please note that I have modified the existing test case to assert the
correct number of tuples.
> piggybank: XMLLoader will always add an extra empty tuple even if no tags are
> matched
> -------------------------------------------------------------------------------------
>
> Key: PIG-1839
> URL: https://issues.apache.org/jira/browse/PIG-1839
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.7.0, 0.8.0, 0.9.0
> Reporter: Vivek Padmanabhan
> Assignee: Vivek Padmanabhan
> Attachments: PIG-1839-1.patch
>
>
> The XMLLoader in piggybank always adds an extra empty tuple, which has to be
> filtered out every time. The loader could drop it itself instead.
> Consider the script below:
> a = load 'a.xml' using org.apache.pig.piggybank.storage.XMLLoader('name');
> dump a;
> b = filter a by $0 is not null;
> dump b;
> The output of the first dump is:
> (<name> foobar </name>)
> (<name> foo </name>)
> (<name> justname </name>)
> ()
> The output of the second dump is:
> (<name> foobar </name>)
> (<name> foo </name>)
> (<name> justname </name>)
> Another case: if the input has no matching tag at all, the loader still
> generates the empty tuple.
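
The behaviour described above can be illustrated with a minimal sketch. This is not the XMLLoader code and not the attached patch; it is a small Python model that assumes the extra tuple comes from the reader emitting a record when its scan for the next tag finds nothing. The names next_element, load_buggy and load_fixed are hypothetical and exist only for this illustration.

```python
import re
from typing import Iterator, Optional, Tuple


def next_element(text: str, tag: str, pos: int) -> Optional[Tuple[str, int]]:
    """Return (element, end_offset) for the next <tag>...</tag> at or after pos.

    Returns None when no further matching element exists.
    """
    pattern = re.compile(
        r"<{0}(?:\s[^>]*)?>.*?</{0}>".format(re.escape(tag)), re.DOTALL
    )
    match = pattern.search(text, pos)
    if match is None:
        return None
    return match.group(0), match.end()


def load_buggy(text: str, tag: str) -> Iterator[tuple]:
    """Model of the reported behaviour: an empty tuple is emitted at end of scan."""
    pos = 0
    while True:
        found = next_element(text, tag, pos)
        if found is None:
            yield ()  # spurious empty tuple, even when nothing matched at all
            return
        element, pos = found
        yield (element,)


def load_fixed(text: str, tag: str) -> Iterator[tuple]:
    """Model of the desired behaviour: stop without emitting anything extra."""
    pos = 0
    while True:
        found = next_element(text, tag, pos)
        if found is None:
            return  # end of input: no tuple is produced
        element, pos = found
        yield (element,)
```

With this model, an input that contains no matching tag yields a single empty tuple from load_buggy and nothing from load_fixed, which is the difference the modified test case would need to assert.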
--
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira