[ https://issues.apache.org/jira/browse/PIG-4397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-4397:
----------------------------
    Attachment: PIG-4397-1.patch

> CSVExcelStorage incorrect output if last field value is null
> ------------------------------------------------------------
>
>                 Key: PIG-4397
>                 URL: https://issues.apache.org/jira/browse/PIG-4397
>             Project: Pig
>          Issue Type: Bug
>         Environment: Running the Pig version bundled with HDP 2.1.2: 0.12.1.2.1.2.0-402
>            Reporter: Niels Basjes
>            Priority: Critical
>             Fix For: 0.15.0
>
>         Attachments: PIG-4397-1.patch
>
>
> I have the following input:
> {code}
> one two
> three
>  four
> {code}
> I run this code:
> {code}
> Lines =
>     LOAD 'test.log' USING PigStorage(' ') 
>     AS ( First:chararray , Second:chararray );
> DUMP Lines;
> STORE Lines INTO 'Lines'
>     USING org.apache.pig.piggybank.storage.CSVExcelStorage(
>         '\t', 'NO_MULTILINE', 'WINDOWS', 'WRITE_OUTPUT_HEADER');
> {code}
> The output from the DUMP is correct:
> {code}
> (one,two)
> (three,)
> (,four)
> {code}
> The output from the CSVExcelStorage is incorrect:
> {code}
> First   Second
> one     two
> three   three
>         four
> {code}
> The problem is that when the last field of a record is null, the value of 
> the preceding field is written in its place (in this case 'three').
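
For comparison, the output the report expects can be sketched as below. This is only an illustration in Python, not the CSVExcelStorage implementation: the helper name to_csv_lines is hypothetical, None stands in for Pig's null, and it assumes that a null field should be written as an empty string and that 'WINDOWS' selects CRLF line endings.

```python
def to_csv_lines(header, tuples, delimiter="\t", eol="\r\n"):
    """Serialize rows as the report expects: a null (None) becomes an empty field."""
    rows = [header] + list(tuples)
    return [
        delimiter.join("" if field is None else str(field) for field in row) + eol
        for row in rows
    ]


# Tuples as shown by DUMP; None stands in for Pig's null (an assumption).
lines = to_csv_lines(
    header=("First", "Second"),
    tuples=[("one", "two"), ("three", None), (None, "four")],
)

for line in lines:
    print(repr(line))
# 'First\tSecond\r\n'
# 'one\ttwo\r\n'
# 'three\t\r\n'
# '\tfour\r\n'
```

The third line is the one that differs from the reported output: the second column should be empty rather than a repeat of 'three'.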



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
