[
https://issues.apache.org/jira/browse/PIG-2611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13264609#comment-13264609
]
Dmitriy V. Ryaboy commented on PIG-2611:
----------------------------------------
David,
What Bill said stands -- saying an int is a long does not make it a long.
Conversion to a different type in Pig is done as follows:
{code}
with_longs = foreach with_ints generate (long) intfield as longfield;
{code}
PigStorage just doesn't care what you pass into it, since it calls toString()
on pretty much anything it gets. HBase actually looks at the bits and treats
objects differently depending on their type, hence the error. Any other use of
a field that you have declared, in this fashion, to be a type it is not will
also blow up (try passing it to a UDF that takes a Long).
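To make the difference concrete, here is a minimal Java sketch of the two behaviours. It is not the actual PigStorage or HBaseStorage code; the class and method names (DeclaredTypeDemo, storeAsText, storeAsInt) are made up for illustration. It only assumes that one storer stringifies whatever it receives while the other trusts the declared type and casts to it.

```java
// Hypothetical stand-ins for illustration only; not Pig's real classes.
class DeclaredTypeDemo {

    // Text-style storer: stringifies whatever object it receives,
    // so a mismatch between declared and actual type goes unnoticed.
    static String storeAsText(Object field) {
        return String.valueOf(field);
    }

    // Type-aware storer: trusts the declared type (int) and casts to it.
    // Throws ClassCastException if the runtime object is really a Long.
    static int storeAsInt(Object field) {
        return (Integer) field;
    }

    public static void main(String[] args) {
        // The field is declared as int, but the runtime value is a Long
        // (as it would be coming out of COUNT).
        Object count = Long.valueOf(3L);

        System.out.println("as text: " + storeAsText(count));

        try {
            storeAsInt(count);
        } catch (ClassCastException e) {
            System.out.println("typed store failed with ClassCastException");
        }

        // After a real conversion the typed path works.
        Object converted = Integer.valueOf(((Long) count).intValue());
        System.out.println("as int after conversion: " + storeAsInt(converted));
    }
}
```

The last step in main plays the role of the explicit (int) cast in the Pig script: it changes the value itself, not just the label on it.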
> HBaseStorage not casting correctly
> ----------------------------------
>
> Key: PIG-2611
> URL: https://issues.apache.org/jira/browse/PIG-2611
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.9.2
> Environment: Ubuntu 11.10, Hadoop 0.20.2, HBase 0.92.0
> Reporter: David Arthur
> Priority: Minor
> Labels: cast, hbase
>
> When loading data into HBase with HBaseStorage, there is unexpected behavior
> regarding record schema and casting.
> Here is the relevant code snippet:
> {code}
> B = group A by (time_tuple, some_scalar);
> C = foreach B {
> -- UDF to generate id (bytearray)
> generate id, flatten(group.$0), COUNT(A);
> }
> {code}
> At this point the schema for C is unknown, so I declare a schema with a
> foreach statement
> {code}
> D = foreach C generate $0 as id:bytearray, $1 as year:int, $2 as month:int,
> $3 as date:int, $4 as count:int;
> {code}
> Even though I've declared C.$4 as an int, it is still a long (from the
> COUNT). When I go to insert into HBase I get a ClassCastException since the
> schema (int) does not match the actual tuple value (long). I can fix this by
> explicitly casting when I declare the schema.
> {code}
> D = foreach C generate $0 as id:bytearray, $1 as year:int, $2 as month:int,
> $3 as date:int, (int)$4 as count:int;
> {code}
> Is this expected behavior? If not, is this an HBaseStorage issue - not
> honoring the schema before going off casting things?
> Cheers,
> David