[ https://issues.apache.org/jira/browse/PIG-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852515#action_12852515 ]
Daniel Dai commented on PIG-1341:
---------------------------------

I had a discussion with Alan and Richard, and we felt that a caster for BinStorage does not make sense. We don't know how to cast the bytearray datatype for BinStorage. In the intermediate storage case, we find the original loader and use the lineage for that loader to convert the bytearray. But if the user uses BinStorage directly, we have no idea what the bytearray means. So the suggestion is that we don't give BinStorage a caster. The implication is that if a user wants to use BinStorage as a temporary store, it will fail in some cases. Here is a sample pair of scripts that will break if we make this change:

script 1:
{code}
a = load '1.txt';
b = order a by $0;
store b into 'temp.out' using BinStorage(); -- store in BinStorage format with the datatype bytearray
{code}

script 2:
{code}
a = load 'temp.out' using BinStorage();
b = foreach a generate $0+$1; -- here we will need a caster, but BinStorage does not have one, so this will fail
{code}

> BinStorage cannot convert DataByteArray to Chararray and results in FIELD_DISCARDED_TYPE_CONVERSION_FAILED
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-1341
>                 URL: https://issues.apache.org/jira/browse/PIG-1341
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.6.0
>            Reporter: Viraj Bhat
>            Assignee: Richard Ding
>             Fix For: 0.7.0
>
>         Attachments: PIG-1341.patch
>
>
> Script reads in BinStorage data and tries to convert a column which is in DataByteArray to Chararray.
> {code}
> raw = load 'sampledata' using BinStorage() as (col1, col2, col3);
> --filter out null columns
> A = filter raw by col1#'bcookie' is not null;
> B = foreach A generate col1#'bcookie' as reqcolumn;
> describe B;
> --B: {reqcolumn: bytearray}
> X = limit B 5;
> dump X;
> B = foreach A generate (chararray)col1#'bcookie' as convertedcol;
> describe B;
> --B: {convertedcol: chararray}
> X = limit B 5;
> dump X;
> {code}
> The first dump produces:
> (36co9b55onr8s)
> (36co9b55onr8s)
> (36hilul5oo1q1)
> (36hilul5oo1q1)
> (36l4cj15ooa8a)
> The second dump produces:
> ()
> ()
> ()
> ()
> ()
> It also throws an error message: FIELD_DISCARDED_TYPE_CONVERSION_FAILED 5 time(s).
> Viraj