[ https://issues.apache.org/jira/browse/PIG-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12917050#action_12917050 ]

Richard Ding commented on PIG-1656:
-----------------------------------


We need to make it clear how TOBAG generates its output schema. For 
example, in the first case, the input type is preserved in the inner schema:

{code}
grunt> a = load 'input' as (a0:int, a1:int);
grunt> b = foreach a generate TOBAG(a0, a1);
grunt> describe b;
b: {{int}}
{code}

but not in the second case, where the arguments are columns projected from a 
grouped bag:

{code}
grunt> a = load 'input' as (a0:int, a1:int);
grunt> c = group a by a0 ;
grunt> b = foreach c generate TOBAG(a.a0, a.a1);
grunt> describe b;
b: {{NULL}}
{code}
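
To make the question concrete, here is a minimal sketch of one possible rule that would be consistent with the first case: if every argument has the same type, keep that type in the inner schema; otherwise leave it unknown. This is only an assumption for discussion, not what the attached patch does. The FieldType enum and the innerType method are hypothetical stand-ins written in plain Java; they are not Pig's DataType or Schema classes. The sketch also covers only scalar arguments. In the second case the arguments are themselves bags, and what inner schema should result there is exactly what needs to be specified.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not Pig code: models one candidate rule for the
// type of the tuple field inside the bag that TOBAG returns.
class ToBagSchemaSketch {

    // Stand-in for Pig field types (assumption; not Pig's DataType constants).
    enum FieldType { INT, LONG, CHARARRAY, UNKNOWN }

    // Returns the shared type when all arguments agree, UNKNOWN otherwise.
    static FieldType innerType(List<FieldType> argTypes) {
        if (argTypes.isEmpty()) {
            return FieldType.UNKNOWN;
        }
        FieldType first = argTypes.get(0);
        for (FieldType t : argTypes) {
            if (t != first) {
                return FieldType.UNKNOWN; // mixed types: no single inner type
            }
        }
        return first;
    }

    public static void main(String[] args) {
        // TOBAG(a0, a1) with a0:int, a1:int
        System.out.println(innerType(Arrays.asList(FieldType.INT, FieldType.INT)));
        // A mixed call such as TOBAG(a0, name) with name:chararray
        System.out.println(innerType(Arrays.asList(FieldType.INT, FieldType.CHARARRAY)));
    }
}
```

Under this rule a mixed-type call falls back to an unknown inner type instead of picking one of the argument types, which avoids promising a type that some tuples in the bag would not have.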

> TOBAG udf ignores columns with null value; it does not use input type to 
> determine output schema
> ---------------------------------------------------------------------------------------------------
>
>                 Key: PIG-1656
>                 URL: https://issues.apache.org/jira/browse/PIG-1656
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.8.0
>            Reporter: Thejas M Nair
>            Assignee: Thejas M Nair
>             Fix For: 0.8.0
>
>         Attachments: PIG-1656.1.patch
>
>
> TOBAG udf ignores columns with null value:
> {code}
> R4= foreach B generate $0,  TOBAG( id, null, id,null );
> grunt> dump R4;
> 1000    {(1),(1)}
> 1000    {(2),(2)}
> 1000    {(3),(3)}
> 1000    {(4),(4)}
> {code}
> TOBAG does not use input type to determine output schema:
> {code}
> grunt> B1 = foreach B generate TOBAG( 1, 2, 3);         
> grunt> describe B1;
> B1: {{null}}
> {code}
