[ 
https://issues.apache.org/jira/browse/PIG-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12859453#action_12859453
 ] 

Alan Gates commented on PIG-1341:
---------------------------------

I agree with Ashutosh.  We do not want BinStorage tracking data lineage.  In 
the case where Pig is using BinStorage (or whatever) for moving data between MR 
jobs then Pig can figure out the correct cast function to use and apply it.  
For cases such as the one here, where users store data using BinStorage 
and then read it in a separate Pig Latin script (thus losing the type 
information), it is the user's responsibility to cast the data correctly before 
storing it in BinStorage.  As a general rule, I do not think we can expect load 
and store functions to track data lineage across Pig Latin scripts.
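
As a rough sketch of what casting before storing could look like (the input 
path 'rawdata', the output path 'sampledata_typed', and the assumption that the 
source is read with the default loader are all hypothetical and not taken from 
the reporter's setup):

{code}
-- Hypothetical first script: cast the map value while the original loader is still known
raw = load 'rawdata' as (col1:map[], col2, col3);
A = filter raw by col1#'bcookie' is not null;
-- the cast is applied here, using the loader that originally read the bytes
B = foreach A generate (chararray)col1#'bcookie' as bcookie;
store B into 'sampledata_typed' using BinStorage();
{code}

A later script that reads 'sampledata_typed' would then be reading values that 
were already written as chararray, so it should not need to cast them from 
bytearray again.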

I propose we close this as won't fix.

> BinStorage cannot convert DataByteArray to Chararray and results in 
> FIELD_DISCARDED_TYPE_CONVERSION_FAILED
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-1341
>                 URL: https://issues.apache.org/jira/browse/PIG-1341
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.6.0
>            Reporter: Viraj Bhat
>            Assignee: Richard Ding
>         Attachments: PIG-1341.patch
>
>
> Script reads in BinStorage data and tries to convert a column which is in 
> DataByteArray to Chararray. 
> {code}
> raw = load 'sampledata' using BinStorage() as (col1,col2, col3);
> --filter out null columns
> A = filter raw by col1#'bcookie' is not null;
> B = foreach A generate col1#'bcookie'  as reqcolumn;
> describe B;
> --B: {reqcolumn: bytearray}
> X = limit B 5;
> dump X;
> B = foreach A generate (chararray)col1#'bcookie'  as convertedcol;
> describe B;
> --B: {convertedcol: chararray}
> X = limit B 5;
> dump X;
> {code}
> The first dump produces:
> (36co9b55onr8s)
> (36co9b55onr8s)
> (36hilul5oo1q1)
> (36hilul5oo1q1)
> (36l4cj15ooa8a)
> The second dump produces:
> ()
> ()
> ()
> ()
> ()
> It also throws an error message: FIELD_DISCARDED_TYPE_CONVERSION_FAILED 5 
> time(s).
> Viraj

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
