Niels Basjes created PIG-4397:
---------------------------------
Summary: CSVExcelStorage incorrect output if last field value is null
Key: PIG-4397
URL: https://issues.apache.org/jira/browse/PIG-4397
Project: Pig
Issue Type: Bug
Environment: Running the Pig version bundled with HDP 2.1.2: 0.12.1.2.1.2.0-402
Reporter: Niels Basjes
Priority: Critical
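I have the following input (fields are separated by a single space; the last line begins with a space, so its first field is empty):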
{code}
one two
three
four
{code}
I run this code:
{code}
Lines =
    LOAD 'test.log' USING PigStorage(' ')
    AS (First:chararray, Second:chararray);

DUMP Lines;

STORE Lines INTO 'Lines'
    USING org.apache.pig.piggybank.storage.CSVExcelStorage('\t', 'NO_MULTILINE', 'WINDOWS', 'WRITE_OUTPUT_HEADER');
{code}
The output from the DUMP is correct:
{code}
(one,two)
(three,)
(,four)
{code}
The output from the CSVExcelStorage is incorrect:
{code}
First Second
one two
three three
four
{code}
The problem is that when the last field of a record is null, the value of the preceding field in the same record is written in its place (in this case 'three' appears twice).
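A possible workaround until this is fixed might be to replace a null in the last field with an empty string before storing. This is only an untested sketch: it assumes the bug is triggered by a null value and not by an empty string, and the alias Cleaned is made up for illustration.
{code}
Lines =
    LOAD 'test.log' USING PigStorage(' ')
    AS (First:chararray, Second:chararray);

-- Assumption: an empty string in the last field is written correctly,
-- so substitute '' wherever Second is null.
Cleaned =
    FOREACH Lines
    GENERATE First, (Second IS NULL ? '' : Second) AS Second:chararray;

STORE Cleaned INTO 'Lines'
    USING org.apache.pig.piggybank.storage.CSVExcelStorage('\t', 'NO_MULTILINE', 'WINDOWS', 'WRITE_OUTPUT_HEADER');
{code}
In the stored file a null and an empty string should both come out as an empty field, so the substitution should not change what correct output looks like.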