[ https://issues.apache.org/jira/browse/PIG-435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Olga Natkovich resolved PIG-435.
--------------------------------
Resolution: Duplicate
This issue will be resolved as part of the fix for
https://issues.apache.org/jira/browse/PIG-1188
> wrong columns produced if incomplete definition provided during load
> --------------------------------------------------------------------
>
> Key: PIG-435
> URL: https://issues.apache.org/jira/browse/PIG-435
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.2.0
> Reporter: Olga Natkovich
> Assignee: Daniel Dai
> Priority: Minor
> Fix For: 0.9.0
>
>
> Script:
> A = load 'studenttab10k' as (name); -- note that data has more than 1 column
> B = load 'votertab10k' as (name, age, reg, contrib);
> D = COGROUP A by name, B by name;
> E = foreach D generate flatten(A), flatten(B);
> F = foreach E generate reg, contrib;
> dump F;
> The dump produces the wrong columns. Even though the script declares only one
> column for A, all columns of A are actually loaded. So anywhere A.* is used,
> explicitly or implicitly, as is the case with flatten, the results are
> wrong.
> The long-term solution is to push projections into the load. In the shorter
> term, the proposal is to detect when the script uses A.* and insert a project
> after the load. Note that this is not needed when types are declared,
> because a casting foreach will already be there.
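
The mechanism described in the issue can be illustrated with a small toy model in Python. This is only a sketch of the column arithmetic, not Pig's implementation: the function names (`load`, `cogroup_flatten`, `resolve`, `run`) and the sample rows are hypothetical, and the COGROUP followed by two flattens is modelled as a plain inner join on the first column. Field positions are resolved from the declared schemas alone, so any undeclared columns that the loader still returns shift every later field.

```python
# Toy model of the column mix-up. Hypothetical names and data throughout;
# this does not reflect how Pig is implemented internally.

STUDENT_ROWS = [("alice", 20, 3.5)]                # three columns on disk
VOTER_ROWS = [("alice", 20, "democrat", 100.0)]    # four columns on disk

SCHEMA_A = ["name"]                                # incomplete declaration
SCHEMA_B = ["name", "age", "reg", "contrib"]


def load(rows, declared, project=False):
    """Return rows as tuples; with project=True keep only declared columns."""
    if project:
        return [tuple(r[:len(declared)]) for r in rows]
    return [tuple(r) for r in rows]


def cogroup_flatten(a_rows, b_rows):
    """Model COGROUP by column 0 plus flatten(A), flatten(B) as an inner join."""
    return [a + b for a in a_rows for b in b_rows if a[0] == b[0]]


def resolve(field):
    """Position of a field as computed from the declared schemas only."""
    return (SCHEMA_A + SCHEMA_B).index(field)


def run(project):
    """Evaluate 'generate reg, contrib' with or without a project after load."""
    a = load(STUDENT_ROWS, SCHEMA_A, project=project)
    b = load(VOTER_ROWS, SCHEMA_B)
    reg, contrib = resolve("reg"), resolve("contrib")
    return [(row[reg], row[contrib]) for row in cogroup_flatten(a, b)]


if __name__ == "__main__":
    print("without project:", run(project=False))
    print("with project:   ", run(project=True))
```

Without the projection, the two undeclared student columns push B's fields two positions to the right, so the positions computed for `reg` and `contrib` land on B's `name` and `age` instead. Truncating A to its declared columns right after the load restores the expected positions, which is what the short-term proposal amounts to.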