[ https://issues.apache.org/jira/browse/PIG-2937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481841#comment-13481841 ]

Jonathan Coveney commented on PIG-2937:
---------------------------------------

This is definitely important and useful.

To my eye, this should work as follows: wherever a schema is missing (in this 
case, for generated_field inside of the GENERATE), we should do our best to 
fill it in. For a binary conditional and similar expressions we know the 
return type, which gives us the type, and the variable name (i.e. 
generated_field) gives us the field name.

I don't think this is a deep change, but it is a tricky one: getting Pig to 
thread through Schema information that isn't currently propagated can be 
difficult.
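
A minimal sketch of the fill-in rule described above, written as a toy Java 
model rather than against Pig's actual schema classes. The Field class and the 
bincondType and fillIn methods are hypothetical names introduced only for 
illustration, and the chararray result type assumes that someUDF returns a 
chararray, which the report does not state.

```java
public class NestedAliasSketch {

    /** Toy field schema: alias is null when no name is known yet. */
    static final class Field {
        final String alias;
        final String type;

        Field(String alias, String type) {
            this.alias = alias;
            this.type = type;
        }

        @Override
        public String toString() {
            return (alias == null ? "" : alias + ": ") + type;
        }
    }

    /**
     * Result type of a binary conditional (cond ? a : b).
     * This toy model assumes both branches already have the same type.
     */
    static String bincondType(String thenType, String elseType) {
        if (!thenType.equals(elseType)) {
            throw new IllegalArgumentException(
                "branch types differ: " + thenType + " vs " + elseType);
        }
        return thenType;
    }

    /**
     * Fill in a missing alias from the nested-foreach variable name.
     * A field that already has an alias is returned unchanged.
     */
    static Field fillIn(Field inferred, String variableName) {
        if (inferred.alias != null) {
            return inferred;
        }
        return new Field(variableName, inferred.type);
    }

    public static void main(String[] args) {
        // generated_field = (field_a is null ? '-' : someUDF(field_b));
        // Assumption: someUDF returns chararray, matching the '-' literal.
        Field inferred = new Field(null, bincondType("chararray", "chararray"));

        // The expression gives the type; the variable name gives the alias.
        Field named = fillIn(inferred, "generated_field");
        System.out.println(named);
    }
}
```

The point of the sketch is only that the type and the name come from two 
different places: the expression supplies the type, the assignment supplies 
the alias. A field that already carries an alias, such as field_c, passes 
through unchanged.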
                
> generated field in nested foreach does not inherit the variable name as the 
> field name
> --------------------------------------------------------------------------------------
>
>                 Key: PIG-2937
>                 URL: https://issues.apache.org/jira/browse/PIG-2937
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Feng Peng
>
> {code}
> raw_data = load 'xyz' using Loader() as (field_a, field_b, field_c);
> records = foreach raw_data {
>   generated_field = (field_a is null ? '-' : someUDF(field_b)); 
>   GENERATE
>     field_c,
>     generated_field
>   ;
> }
> describe records;
> {code}
> One would expect generated_field to have a field name, just as field_c 
> from the original relation does. However, Pig currently doesn't assign a 
> field name by default. It would be nice if we could assign the variable 
> name as the default field name. 

