[
https://issues.apache.org/jira/browse/PIG-3093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13530527#comment-13530527
]
Jonathan Coveney commented on PIG-3093:
---------------------------------------
One more thing to look into once the initial patch is done: verify that the
data itself is correct.
> Self join + realias results in schema errors
> --------------------------------------------
>
> Key: PIG-3093
> URL: https://issues.apache.org/jira/browse/PIG-3093
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.11, 0.12
> Reporter: Jonathan Coveney
> Assignee: Jonathan Coveney
> Priority: Critical
> Fix For: 0.12
>
>
> So this one took a while to isolate, but is pretty crazy.
> {code}
> A = load 'a' as (field1:chararray);
> B = foreach A generate *;
> C = join A by field1, B by field1;
> D = foreach C generate A::field1 as field2, B::field1;
> describe D;
> /*
> D: {
> field2: chararray,
> B::field1: chararray
> }
> */
> E = foreach D generate field2, field1;
> describe E;
> /*
> E: {
> B::field1: chararray,
> B::field1: chararray
> }
> */
> F = foreach E generate field2;
> store F into 'fail';
> -- <file cristian_simpler.pig, line 20, column 4> Invalid field projection.
> -- Projected field [field2] does not exist in schema:
> -- B::field1:chararray,B::field1:chararray.
> {code}
> If you take a look at that code snippet, that is pretty nuts! Since the two
> fields come from the same original relation, renaming one causes issues with
> both. WUT. The even weirder part is not that they both get renamed, but that
> here they both end up with the unrenamed alias (B::field1).
> Interestingly, flipping the order of the projection changes the output, so it
> looks like both fields take whatever the final reference is, i.e.
> {code}
> A = load 'a' as (field1:chararray);
> B = foreach A generate *;
> C = join A by field1, B by field1;
> D = foreach C generate B::field1, A::field1 as field2;
> describe D;
> E = foreach D generate field2, field1;
> describe E;
> F = foreach E generate field2;
> store F into 'fail';
> {code}
> results in
> {code}
> D: {
> B::field1: chararray,
> field2: chararray
> }
> E: {
> field2: chararray,
> field2: chararray
> }
> 2012-12-13 00:13:10,045 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1025:
> <file simplest.pig, line 8, column 23> Invalid field projection. Projected field [field2] does not exist in schema: field2:chararray,field2:chararray.
> {code}
> This seems to imply the solution: make copies of the Schema. I added a test
> and will hopefully have a patch soon.
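
To illustrate why copying the schema would plausibly help, here is a toy model in Python. It is a sketch only and does not use Pig's actual classes: FieldSchema, project and self_join_fields are hypothetical names, and the assumption that both sides of the self join share one mutable field object is inferred from the symptoms above, not confirmed against the Pig source.

```python
import copy
from dataclasses import dataclass


@dataclass
class FieldSchema:
    """Hypothetical stand-in for one column of a schema (not Pig's class)."""
    alias: str
    dtype: str


def project(fields, aliases, copy_fields):
    """Apply one output alias per projected field, in order."""
    out = []
    for field, alias in zip(fields, aliases):
        # Without a copy, the rename mutates the object both columns share.
        target = copy.copy(field) if copy_fields else field
        target.alias = alias
        out.append(target)
    return [f.alias for f in out]


def self_join_fields():
    """Model the self join: both sides reference the same field object."""
    shared = FieldSchema("field1", "chararray")
    return [shared, shared]


# Shared object: the last alias written wins for both columns.
aliased = project(self_join_fields(), ["field2", "B::field1"], copy_fields=False)
flipped = project(self_join_fields(), ["B::field1", "field2"], copy_fields=False)
# Copy per projected field: each column keeps its own alias.
copied = project(self_join_fields(), ["field2", "B::field1"], copy_fields=True)

print(aliased)
print(flipped)
print(copied)
```

In this model the shared-object runs reproduce both symptoms (two B::field1 columns, or two field2 columns when the projection is flipped), while the copying run keeps field2 and B::field1 distinct.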