[
https://issues.apache.org/jira/browse/PIG-312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alan Gates updated PIG-312:
---------------------------
Resolution: Fixed
Status: Resolved (was: Patch Available)
intcast.patch checked in.
> Casting a byte array that contains a double value to an int results in a null
> pointer
> -------------------------------------------------------------------------------------
>
> Key: PIG-312
> URL: https://issues.apache.org/jira/browse/PIG-312
> Project: Pig
> Issue Type: Bug
> Affects Versions: types_branch
> Reporter: Alan Gates
> Assignee: Alan Gates
> Fix For: types_branch
>
> Attachments: intcast.patch
>
>
> {code}
> a = load 'myfile' as (name, age, gpa);
>
> c = foreach a generate age * 10, (int)gpa * 2;
>
>
> store c into 'outfile';
> {code}
> The values in gpa are doubles. The issue is that they are read as byte
> arrays and then when the user tries to cast them to an int, the system does a
> direct cast from byte array to int, which results in a null. First of all,
> it should result in a zero, not a null (unless the underlying value is null).
> Second, we have to clarify semantics here. gpa was never officially
> declared to be a double, so trying to do a cast directly from bytearray to
> int is a reasonable thing to do. But users may not see it that way. Do we
> want to first cast numbers to double and then to anything subsequent to avoid
> this? Or should we force users to write this as (int)(double)gpa * 2 so we
> know to first cast to double and then int? In the interest of speed
> (especially considering the rarity of doubles in most data) I'd vote for the
> latter.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.