[
https://issues.apache.org/jira/browse/PIG-5067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677987#comment-15677987
]
Koji Noguchi commented on PIG-5067:
-----------------------------------
bq. any particular issue did you see?
The one I looked at came from a user unknowingly mixing two incompatible types
in union-onschema, after which order-by showed completely off results.
Also, I prefer to avoid mixed types inside a bytearray when possible.
The following result is counter-intuitive to me (although I understand why it's
like that).
{code:title=test.pig}
A = load '1.txt' as (a1:int);
B = load '2.txt' as (a1:chararray);
C = UNION onschema A, B;
describe C;
D = GROUP C by a1;
dump D;
{code}
{noformat}
% cat 1.txt
1
% cat 2.txt
1
% pig test.pig
C: {a1: bytearray}
(1,{(1)})
(1,{(1)})
{noformat}
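One way to sidestep the mixed-type bytearray is to cast explicitly before the union so that both branches share one type. This is a sketch only, assuming that treating both inputs as chararray is acceptable for the data. The alias A2 and the script name are made up for illustration, and the script is untested against the versions discussed here.
{code:title=test_cast.pig}
A = load '1.txt' as (a1:int);
B = load '2.txt' as (a1:chararray);
-- Hypothetical workaround: cast the int column so both sides are chararray.
A2 = foreach A generate (chararray)a1 as a1;
C = UNION onschema A2, B;
describe C;
D = GROUP C by a1;
dump D;
{code}
With both inputs typed as chararray, C should no longer need the bytearray fallback, and the two rows would be expected to land in a single group.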
> Revisit union on numeric type and chararray to bytearray
> --------------------------------------------------------
>
> Key: PIG-5067
> URL: https://issues.apache.org/jira/browse/PIG-5067
> Project: Pig
> Issue Type: Bug
> Reporter: Koji Noguchi
>
> In PIG-2071, we changed the behavior of union on numeric and chararray to
> bytearray.
> This itself was always failing at runtime until we changed to skip the
> bytearray typecast for union-onschema in PIG-3270.
> (For plain union, it still fails with a typecast-to-bytearray error.)
> Now we are seeing users get inconsistent results due to this union-ed
> bytearray.