[
https://issues.apache.org/jira/browse/PIG-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12917050#action_12917050
]
Richard Ding commented on PIG-1656:
---
We need to make it clear how the output schema of TOBAG is generated. For
example, in the first case, the type is preserved in the inner schema:
{code}
grunt> a = load 'input' as (a0:int, a1:int);
grunt> b = foreach a generate TOBAG(a0, a1);
grunt> describe b;
b: {{int}}
{code}
but not in the second case:
{code}
grunt> a = load 'input' as (a0:int, a1:int);
grunt> c = group a by a0;
grunt> b = foreach c generate TOBAG(a.a0, a.a1);
grunt> describe b;
b: {{NULL}}
{code}
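To make the question concrete, here is a minimal sketch of one possible rule for deriving the inner schema: keep the input type when every argument has the same known type, otherwise fall back to NULL. This is an assumption for discussion, not the actual TOBAG implementation. The names `tobag_inner_type` and `describe` are hypothetical, and types are modeled as plain strings. It does not attempt to explain why the grouped case above yields NULL.

```python
from typing import Optional, Sequence


def tobag_inner_type(arg_types: Sequence[Optional[str]]) -> Optional[str]:
    """Return the shared type name, or None if types are unknown or differ.

    Hypothetical model only: None stands for an unknown (null) type.
    """
    if not arg_types:
        return None
    first = arg_types[0]
    if first is None:
        return None
    # Keep the type only when every argument agrees with the first one.
    return first if all(t == first for t in arg_types) else None


def describe(alias: str, arg_types: Sequence[Optional[str]]) -> str:
    """Render the result in the style of the 'describe' output above."""
    inner = tobag_inner_type(arg_types)
    return "%s: {{%s}}" % (alias, inner or "NULL")


if __name__ == "__main__":
    print(describe("b", ["int", "int"]))        # same type for all arguments
    print(describe("b", ["int", "chararray"]))  # mixed types
```

Under this rule, TOBAG(a0, a1) with two int columns keeps int in the inner schema, and any mix of types falls back to NULL. Whether bag-typed arguments such as a.a0 should be compared by their inner schemas as well is exactly the part that needs to be spelled out.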
> TOBAG udf ignores columns with null value; it does not use input type to
> determine output schema
> ---
>
> Key: PIG-1656
> URL: https://issues.apache.org/jira/browse/PIG-1656
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.8.0
> Reporter: Thejas M Nair
> Assignee: Thejas M Nair
> Fix For: 0.8.0
>
> Attachments: PIG-1656.1.patch
>
>
> TOBAG udf ignores columns with null value
> {code}
> R4 = foreach B generate $0, TOBAG( id, null, id,null );
> grunt> dump R4;
> 1000{(1),(1)}
> 1000{(2),(2)}
> 1000{(3),(3)}
> 1000{(4),(4)}
> {code}
> TOBAG does not use input type to determine output schema
> {code}
> grunt> B1 = foreach B generate TOBAG( 1, 2, 3);
> grunt> describe B1;
> B1: {{null}}
> {code}
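For the null-handling part of the report, the difference between the observed output and what would presumably be expected can be modeled as below. This is only an illustrative sketch, not Pig code: `to_bag` and `to_bag_dropping_nulls` are hypothetical names, a bag is modeled as a list of one-field tuples, and `None` stands in for a Pig null. The assumption is that each argument should produce its own tuple, null or not.

```python
def to_bag(*values):
    """Assumed intended behavior: one single-field tuple per argument,
    with nulls (None) kept in place."""
    return [(v,) for v in values]


def to_bag_dropping_nulls(*values):
    """Behavior matching the reported output: null arguments are skipped."""
    return [(v,) for v in values if v is not None]


if __name__ == "__main__":
    print(to_bag(1, None, 1, None))
    print(to_bag_dropping_nulls(1, None, 1, None))
```

With arguments (id, null, id, null) and id = 1, the second function yields two tuples, which matches the {(1),(1)} in the dump above, while the first keeps all four.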