Thejas M Nair commented on PIG-485:

This bug can cause confusion about where the error actually is. For example:

grunt> L = load 'students.txt' as (n:chararray,s:chararray,a:int,s:float);
grunt> LL = load 'students2.txt' as (nn:chararray,sn:chararray,an:int,sn:float);
grunt>  J = join L by a, LL by an ;
2009-04-07 14:56:37,331 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 
1007: Found duplicates in schema. LL::sn: 2 columns, L::s: 2 columns. Please 
alias the columns with unique names.

It appears as if the error is in the join, caused by column names being the 
same across the join inputs. But the problem is actually caused by duplicate 
column aliases within each load statement (s in L, sn in LL).

Pig should give an error immediately after a load statement with duplicate 
column aliases is entered, or at least warn at that point that two columns in 
the relation have the same name.
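
A minimal sketch of such an early check follows. It is written in Python for 
illustration only and is not Pig's actual implementation; the function names 
find_duplicate_aliases and check_schema are hypothetical, and the message 
text is simply modelled on the ERROR 1007 output above.

```python
from collections import Counter


def find_duplicate_aliases(aliases):
    """Return {alias: count} for every alias that occurs more than once."""
    counts = Counter(aliases)
    return {name: n for name, n in counts.items() if n > 1}


def check_schema(relation, aliases):
    """Raise ValueError as soon as a declared schema repeats an alias.

    Hypothetical helper: it would run when the load statement is entered,
    so the error points at the load rather than at a later join.
    """
    duplicates = find_duplicate_aliases(aliases)
    if duplicates:
        details = ", ".join(
            f"{relation}::{name}: {n} columns"
            for name, n in sorted(duplicates.items())
        )
        raise ValueError(
            f"Found duplicates in schema. {details}. "
            "Please alias the columns with unique names."
        )
```

With this check, calling check_schema("L", ["n", "s", "a", "s"]) would fail 
at the load of 'students.txt', before any join is written.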

> Column names in schema are not unique
> -------------------------------------
>                 Key: PIG-485
>                 URL: https://issues.apache.org/jira/browse/PIG-485
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.2.0
>            Reporter: Santhosh Srinivasan
>            Assignee: Santhosh Srinivasan
>            Priority: Minor
> Duplicate column names are allowed in relations. When subsequent statements 
> refer to the alias, it is ambiguous which column should be picked.
> {code}
> grunt> a = load 'a' as (name, age, gpa);
> grunt> b = foreach a generate name, name;
> grunt> describe b;
> b: {name: bytearray,name: bytearray}
> {code}

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
