[
https://issues.apache.org/jira/browse/PIG-379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12623972#action_12623972
]
Santhosh Srinivasan commented on PIG-379:
-----------------------------------------
The describe statement kicks off the logical plan -> type checker -> optimizer
process. During logical plan optimization, the schema of each operator is
reset. When the schema is recomputed, the computation uses the attributes of
the operator along with information about its inputs. User-defined schemas
specified with the as clause are not annotated as such in each operator. As a
result, when the schema is reset in the logical optimizer, this information is
lost, resulting in incorrect schemas.
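
To make the failure mode concrete, here is a toy model of the reset. It is only a sketch and is not Pig's code: ToyOperator, user_defined, reset_schema and describe are hypothetical names invented for illustration. It assumes that recomputation can derive only the field types from the inputs, so aliases survive only when an explicit flag says the schema came from the user.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Field = Tuple[Optional[str], str]  # (alias or None, type name)


@dataclass
class ToyOperator:
    """Hypothetical stand-in for a logical operator; not a Pig class."""
    inferred_types: List[str]   # what recomputation can derive from the inputs
    schema: List[Field]         # current schema, possibly set by an 'as' clause
    user_defined: bool = False  # hypothetical annotation marking a user schema

    def reset_schema(self, honor_annotation: bool) -> None:
        """Recompute the schema the way an optimizer pass might."""
        if honor_annotation and self.user_defined:
            return  # keep the user-specified aliases untouched
        # Recomputation sees only the inferred types, so aliases are dropped.
        self.schema = [(None, t) for t in self.inferred_types]


def describe(op: ToyOperator) -> str:
    """Render a schema in the same style as the describe output."""
    parts = [t if alias is None else f"{alias}: {t}" for alias, t in op.schema]
    return "{" + ",".join(parts) + "}"


if __name__ == "__main__":
    op = ToyOperator(
        inferred_types=["chararray", "integer"],
        schema=[("name", "chararray"), ("age", "integer")],
        user_defined=True,
    )
    print(describe(op))                       # aliases present
    op.reset_schema(honor_annotation=False)   # annotation ignored
    print(describe(op))                       # aliases gone
```

With honor_annotation=False the second describe no longer carries the aliases, which mirrors the symptom reported in the issue. With honor_annotation=True the aliases are kept.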

There are multiple items that we need to consider:
1. Annotate each relational operator and expression operator with an attribute
   to denote the presence of user-specified schemas.
2. Add checks to ensure compatibility of user-specified schemas with the
   generated/inferred schemas, i.e.,
   a. if the user specifies incorrect types, perform appropriate checks
      and type promotions;
   b. if the schemas do not match, flag it as an error.
3. For complex constants, the schema computation is more involved and includes
   type promotions, null introductions, etc.
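
The compatibility check in item 2 could look roughly like the sketch below. This is an illustration under stated assumptions, not a proposed patch: reconcile and NUMERIC_ORDER are hypothetical names, the numeric widening order is assumed, and "promotion" is taken to mean choosing the wider of the two numeric types. Complex types (item 3) are deliberately left out.

```python
from typing import List, Optional, Tuple

Field = Tuple[Optional[str], str]  # (alias or None, type name)

# Assumed numeric widening order, narrowest first; for illustration only.
NUMERIC_ORDER = ["int", "long", "float", "double"]


def reconcile(user: List[Field], inferred: List[Field]) -> List[Field]:
    """Merge a user-specified schema with an inferred one.

    Aliases always come from the user schema. Types are kept when they
    agree, promoted to the wider type when both are numeric, and
    rejected otherwise.
    """
    if len(user) != len(inferred):
        raise ValueError(
            f"schema mismatch: {len(user)} user fields vs {len(inferred)} inferred"
        )
    result: List[Field] = []
    for (alias, user_type), (_, inferred_type) in zip(user, inferred):
        if user_type == inferred_type:
            result.append((alias, user_type))
        elif user_type in NUMERIC_ORDER and inferred_type in NUMERIC_ORDER:
            # Item 2a: promote to whichever numeric type is wider.
            wider = max(user_type, inferred_type, key=NUMERIC_ORDER.index)
            result.append((alias, wider))
        else:
            # Item 2b: incompatible types are flagged as an error.
            raise ValueError(
                f"schema mismatch for {alias!r}: {user_type} vs {inferred_type}"
            )
    return result
```

For example, reconcile([("age", "int")], [(None, "long")]) yields [("age", "long")], while pairing chararray with int raises ValueError.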
> describe interferes with name resolution
> ----------------------------------------
>
> Key: PIG-379
> URL: https://issues.apache.org/jira/browse/PIG-379
> Project: Pig
> Issue Type: Bug
> Affects Versions: types_branch
> Reporter: Olga Natkovich
> Priority: Critical
> Fix For: types_branch
>
>
> If I run the following script:
> A = load 'studenttab10k' as (name: chararray, age: int, gpa: float);
> B = foreach A generate name, age;
> describe B;
> C = filter B by age > 30;
> describe C;
> D = group C by name;
> describe D;
> I get the error below. Also notice that the schema of C no longer has names:
> {name: chararray,age: integer}
> {chararray,integer}
> java.io.IOException: Invalid alias: name in {chararray,integer}
>     at org.apache.pig.PigServer.registerQuery(PigServer.java:254)
>     at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:422)
>     at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:241)
>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:82)
>     at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:64)
>     at org.apache.pig.Main.main(Main.java:302)
> Caused by: org.apache.pig.impl.logicalLayer.parser.ParseException: Invalid alias: name in {chararray,integer}
>     at org.apache.pig.impl.logicalLayer.parser.QueryParser.AliasFieldOrSpec(QueryParser.java:5179)
>     at org.apache.pig.impl.logicalLayer.parser.QueryParser.ColOrSpec(QueryParser.java:5048)
>     at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseEvalSpec(QueryParser.java:3357)
>     at org.apache.pig.impl.logicalLayer.parser.QueryParser.UnaryExpr(QueryParser.java:3254)
>     at org.apache.pig.impl.logicalLayer.parser.QueryParser.CastExpr(QueryParser.java:3208)
>     at org.apache.pig.impl.logicalLayer.parser.QueryParser.MultiplicativeExpr(QueryParser.java:3117)
>     at org.apache.pig.impl.logicalLayer.parser.QueryParser.AdditiveExpr(QueryParser.java:3043)
>     at org.apache.pig.impl.logicalLayer.parser.QueryParser.InfixExpr(QueryParser.java:3009)
>     at org.apache.pig.impl.logicalLayer.parser.QueryParser.FlattenedGenerateItem(QueryParser.java:2911)
>     at org.apache.pig.impl.logicalLayer.parser.QueryParser.GroupItem(QueryParser.java:1548)
>     at org.apache.pig.impl.logicalLayer.parser.QueryParser.CogroupClause(QueryParser.java:1468)
>     at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:751)
>     at org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:569)
>     at org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:378)
>     at org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:60)
>     at org.apache.pig.PigServer.registerQuery(PigServer.java:251)
> If I remove describe, I don't see any errors