Sergey created PIG-3292:
---------------------------

             Summary: Logical plan invalid state: duplicate uid in schema during self-join to get cross product
                 Key: PIG-3292
                 URL: https://issues.apache.org/jira/browse/PIG-3292
             Project: Pig
          Issue Type: Bug
          Components: parser
    Affects Versions: 0.10.0
         Environment: CDH 4.2
            Reporter: Sergey


Hi.
This looks similar to PIG-3020, but it is triggered in a different way.
Our Pig version is:
Apache Pig version 0.10.0-cdh4.2.0 (rexported) 
compiled Feb 15 2013, 12:20:54

According to the release notes, PIG-3020 is included in the CDH 4.2 distribution:
http://archive.cloudera.com/cdh4/cdh/4/pig-0.10.0-cdh4.2.0.CHANGES.txt

The problem:
We want to do a self-join inside a nested foreach to get a cross product:
{code}
a = load '/input' as (key, x);

a_group = group a by key;
b = foreach a_group {
  y = a.x;
  pair = cross a.x, y;
  generate flatten(pair);
}

dump b;
{code}
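
For reference, this is the result we expect from the script above, sketched in plain Python rather than Pig. It is only an illustration of the intended semantics, not of how Pig evaluates the plan: the function name per_key_cross and the sample rows are made up for this example, and Pig does not guarantee the order of tuples within a bag.

```python
from collections import defaultdict
from itertools import product


def per_key_cross(rows):
    """Return the cross product of the x values with themselves, per key.

    rows is an iterable of (key, x) tuples, mirroring the (key, x) schema
    of relation a. Only the (x, x) pairs are returned, because the script
    generates flatten(pair) without the group key.
    """
    groups = defaultdict(list)
    for key, x in rows:
        groups[key].append(x)  # same role as: group a by key

    pairs = []
    for xs in groups.values():
        # same role as: cross a.x, y  (with y = a.x)
        pairs.extend(product(xs, repeat=2))
    return pairs


# Hypothetical sample input, not taken from the real '/input' data.
sample = [("k1", 1), ("k1", 2), ("k2", 3)]
print(per_key_cross(sample))
```

A key with n values contributes n * n pairs, so the sample above yields four pairs for k1 and one for k2.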

The script fails with this error:
{code}
ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2270: Logical plan invalid state: duplicate uid in schema : 1-7::x#16:bytearray,y::x#16:bytearray
{code}

Here is a workaround: declare x as int and build y from the expression -(-x) instead of a plain projection of a.x :)
{code}
a = load '/input' as (key, x:int);

a_group = group a by key;
b = foreach a_group {
  y = foreach a generate -(-x);
  pair = cross a.x, y;
  generate flatten(pair);
}

dump b;
{code}

