[ https://issues.apache.org/jira/browse/PIG-2937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481841#comment-13481841 ]
Jonathan Coveney commented on PIG-2937:
---------------------------------------
This is definitely important and useful.
To my eye, in any case where we don't have a schema (here, generated_field
inside of the GENERATE), we should do our best to fill it in. For a binary
conditional and similar expressions we know the return type, so that gives us
the type, and the variable name (i.e. generated_field) would give us the
field name.
I don't think this is a deep change, but it is a tricky one, because it
requires Pig to thread through schema information that it doesn't currently
carry.
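
To make the proposed rule concrete, here is a minimal sketch in Python. It is
not Pig's implementation and does not use any Pig API: GenerateItem and
fill_in_schema are hypothetical names, and the types shown for the two fields
are assumptions chosen for illustration. The only point is the fallback order:
keep a name the schema already has, otherwise use the variable name from the
nested block, and take the type from the expression.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class GenerateItem:
    """One expression in a GENERATE clause (hypothetical model, not Pig's API)."""
    inferred_type: str            # type known from the expression itself
    schema_name: Optional[str]    # name already carried by the schema, if any
    nested_alias: Optional[str]   # variable name assigned in the nested block, if any


def fill_in_schema(items: List[GenerateItem]) -> List[Tuple[Optional[str], str]]:
    """Return (name, type) pairs, falling back to the nested alias for the name."""
    fields = []
    for item in items:
        # Prefer a name the schema already provides; otherwise use the alias.
        name = item.schema_name if item.schema_name is not None else item.nested_alias
        fields.append((name, item.inferred_type))
    return fields


# Assumed inputs mirroring the example in the issue description.
items = [
    GenerateItem("bytearray", "field_c", None),          # field from the loaded relation
    GenerateItem("chararray", None, "generated_field"),  # bincond result, type assumed
]
schema = fill_in_schema(items)
```

Under this rule an expression with neither a schema name nor an alias would
still come out unnamed, which matches what Pig does today.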
> generated field in nested foreach does not inherit the variable name as the
> field name
> --------------------------------------------------------------------------------------
>
> Key: PIG-2937
> URL: https://issues.apache.org/jira/browse/PIG-2937
> Project: Pig
> Issue Type: Bug
> Reporter: Feng Peng
>
> {code}
> raw_data = load 'xyz' using Loader() as (field_a, field_b, field_c);
> records = foreach raw_data {
>     generated_field = (field_a is null ? '-' : someUDF(field_b));
>     GENERATE
>         field_c,
>         generated_field
>     ;
> }
> describe records;
> {code}
> One would expect generated_field to have a field name, just like field_c,
> which comes from the original relation. However, Pig currently doesn't
> assign a field name by default. It would be nice if Pig assigned the
> variable name as the default field name.