[ 
https://issues.apache.org/jira/browse/PIG-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13017717#comment-13017717
 ] 

Olga Natkovich commented on PIG-1745:
-------------------------------------

Just to clarify what is happening (we are also adding this information to the 
0.9 documentation):

To get the previous behavior, specify a converter for BinStorage to use for 
the casts:

a = load 'g/part*' using BinStorage('Utf8StorageConverter') as (id, d:bag{t:(v, s)});
b = foreach a generate (double)id, flatten(d);
dump b;

The Utf8StorageConverter is provided by default in Pig.

We require users to specify the converter explicitly so that wrong results 
are not returned when the data is not in a format that Utf8StorageConverter 
can understand.
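
To illustrate why an implicit cast can silently return wrong results, here is a minimal sketch in Python. It is not Pig's implementation; the function names are hypothetical and exist only for this example. It reads the same eight raw bytes twice: once as UTF-8 text holding a decimal number, and once as a big-endian binary long. Both readings succeed, but they yield different values, so the reader has to be told which encoding was used.

```python
import struct


def as_utf8_long(raw: bytes) -> int:
    """Hypothetical helper: treat the bytes as UTF-8 text holding a decimal number."""
    return int(raw.decode("utf-8"))


def as_binary_long(raw: bytes) -> int:
    """Hypothetical helper: treat the same bytes as an 8-byte big-endian signed integer."""
    return struct.unpack(">q", raw)[0]


# Exactly 8 bytes, so the value is valid under both interpretations.
raw = b"12345678"

text_value = as_utf8_long(raw)
binary_value = as_binary_long(raw)

# Neither call fails, yet the two results disagree.
print("read as UTF-8 text:  ", text_value)
print("read as binary long: ", binary_value)
```

Because neither interpretation raises an error here, a wrong guess would go unnoticed, which is the situation the explicit converter argument is meant to prevent.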



> Disable converting bytes loading from BinStorage
> ------------------------------------------------
>
>                 Key: PIG-1745
>                 URL: https://issues.apache.org/jira/browse/PIG-1745
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.8.0
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>             Fix For: 0.9.0
>
>         Attachments: PIG-1745-1.patch, PIG-1745-2.patch, PIG-1745-3.patch
>
>
> If we load bytes from BinStorage, we don't actually know how those bytes 
> were originally produced, so we have no way to cast them. Ideally we 
> would encode the caster into the BinStorage data file, but we are not there 
> yet. Currently the bytesToXXX methods for BinStorage are wrong and result in 
> unexpected errors. E.g.:
> {code}
> a = load '1.txt' as (a0, a1, a2);
> store a into '1.bin' using BinStorage();
> a = load '1.bin' using BinStorage as (a0, a1, a2);
> b = foreach a generate (long)a0;
> dump b;
> {code}
> The code will run but produce wrong data. It is less confusing if we throw an 
> exception in this case.
> Release Notes:
> Pig will throw an exception when a script tries to convert bytes loaded from 
> BinStorage.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
