[
https://issues.apache.org/jira/browse/PIG-4513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14516953#comment-14516953
]
Madhan Sundararajan Devaki commented on PIG-4513:
-------------------------------------------------
Please find the updated script below.
input = load '/data/file1' using PigStorage('|') ;
cleanedData = foreach input generate "(chararray) (TRIM($0) == '\\\\N' ? NULL :
TRIM($0)) as $col1, (int) (TRIM($1) == '\\\\N' ? NULL : TRIM($1)) as $col2,
(chararray) (TRIM($2) == '\\\\N' ? NULL : TRIM($2)) as $col3";
STORE cleanedData INTO '/output/out1' USING
org.apache.pig.piggybank.storage.avro.AvroStorage();
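For readers less familiar with Pig, the per-field cleaning that the script is meant to perform can be roughly sketched in Python. This is only an illustration, not the Pig implementation: the function names are hypothetical, the null marker is assumed to be a literal \N once the escaping is resolved, and mapping an empty field to None is also an assumption. The point is that a line with no data in the first column should still yield a record.

```python
# Illustrative sketch only; names below are hypothetical, not part of Pig.
NULL_MARKER = r"\N"  # assumption: the marker the script compares against


def clean_field(raw, cast=str):
    """Trim a field; return None for the null marker or an empty value."""
    value = raw.strip()
    if value == NULL_MARKER or value == "":  # empty -> None is an assumption
        return None
    return cast(value)


def clean_line(line, delimiter="|"):
    """Split one delimited line into (col1, col2, col3) with typed values."""
    fields = line.rstrip("\n").split(delimiter)
    return (
        clean_field(fields[0]),       # chararray
        clean_field(fields[1], int),  # int
        clean_field(fields[2]),       # chararray
    )
```

Under these assumptions, a line such as "|5|abc" would be kept with a null first column instead of being dropped.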
> Lines dropped in delimited text when they begin with null/no-data
> -----------------------------------------------------------------
>
> Key: PIG-4513
> URL: https://issues.apache.org/jira/browse/PIG-4513
> Project: Pig
> Issue Type: Bug
> Components: parser, piggybank
> Affects Versions: 0.12.0
> Environment: CDH5.2.x, CDH5.3.x
> Reporter: Madhan Sundararajan Devaki
> Priority: Blocker
> Fix For: 0.15.0
>
>
> When Pig (0.12) is used to process delimited text files (| delimited), lines
> that do not contain data in the first column are dropped.