[ https://issues.apache.org/jira/browse/PIG-5335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Koji Noguchi updated PIG-5335:
------------------------------
    Attachment: pig-5335-v02.patch

Uploading a new patch with the comment updated. Basically, with uid left at -1, {{TestColumnAliasConversion.testInvalidNestedProjection}} started failing at LogicalPlanBuilder time.

Just to summarize: for this type of script error, there are essentially two locations that catch the error.
(check1) LogicalPlanBuilder.buildForeachOp
(check2) LogicalPlan.validate

Check1 is done at script-reading time and errors out regardless of whether the corresponding relation is used. Check2 is done much later and is only performed for relations that are part of a Dump/Store.

For this jira, I've tried a couple of patterns.

(a) Move ProjectStarExpander and ProjStarInUdfExpander from check1 to check2.
This didn't work because other parts of the code depend on these visitors running within check1.

(b) Throw an exception when an invalid field is referenced in ProjectExpression.java.
This moved a bunch of negative tests that used to fail in check2 to check1. (Meaning, users may start to see their pig scripts failing because the pig compiler catches errors in unused relations.)

(c) Create the invalid field with uid=-1.
This mostly worked, but TestColumnAliasConversion.testInvalidNestedProjection
{code: title=TestColumnAliasConversion.java}
String query = "A = load 'x' as (field);" +
               "B = foreach A {" +
               " C = LIMIT invalidName 1;" +
               " generate C.foo;" +
               "};";
{code}
started to fail at check1 instead of check2 due to
{noformat}
<line 1, column 28> pig script failed to validate: org.apache.pig.impl.plan.PlanValidationException: ERROR 2271: Logical plan invalid state: invalid uid -1 in schema : invalidName#-1:bytearray
    at org.apache.pig.parser.LogicalPlanBuilder.buildForeachOp(LogicalPlanBuilder.java:1066)
    at org.apache.pig.parser.LogicalPlanGenerator.foreach_clause(LogicalPlanGenerator.java:15896)
    at org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1933)
    at org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:1102)
    at org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:560)
    at org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:421)
    at org.apache.pig.parser.ParserTestingUtils.generateLogicalPlan(ParserTestingUtils.java:76)
    at org.apache.pig.parser.TestColumnAliasConversion.validate(TestColumnAliasConversion.java:179)
    at org.apache.pig.parser.TestColumnAliasConversion.testInvalidNestedProjection(TestColumnAliasConversion.java:170)
Caused by: org.apache.pig.impl.plan.PlanValidationException: ERROR 2271: Logical plan invalid state: invalid uid -1 in schema : invalidName#-1:bytearray
    at org.apache.pig.newplan.logical.optimizer.SchemaResetter.validate(SchemaResetter.java:243)
    at org.apache.pig.newplan.logical.optimizer.SchemaResetter.visit(SchemaResetter.java:195)
    at org.apache.pig.newplan.logical.relational.LOLimit.accept(LOLimit.java:79)
    at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
    at org.apache.pig.newplan.logical.optimizer.SchemaResetter.visit(SchemaResetter.java:114)
    at org.apache.pig.parser.LogicalPlanBuilder.buildForeachOp(LogicalPlanBuilder.java:1064)
{noformat}

(d) Then I gave a new uid to this fake field so that the above test case would again fail at check2.

(d) is my patch (both v1 and v2).

> Error message from range projection completely misleading
> ---------------------------------------------------------
>
>                 Key: PIG-5335
>                 URL: https://issues.apache.org/jira/browse/PIG-5335
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Koji Noguchi
>            Assignee: Koji Noguchi
>            Priority: Major
>         Attachments: pig-5335-v01.patch, pig-5335-v02.patch
>
>
> {code}
> A = load 'input.txt' as (a0,a1,a2,a3);
> B = FOREACH A GENERATE a0, a1, a2, a3;
> store B into '/tmp/deleteme';
> C = FOREACH A GENERATE a0, b1, a2, a3;
> D = FOREACH C GENERATE a0..a2;
> (end of script, no store, nothing)
> {code}
> Error message
> {panel}
> 2018-04-10 10:22:33,360 \[main] ERROR org.apache.pig.PigServer - exception
> during parsing: Error during parsing. Invalid field projection. Projected
> field \[a0] does not exist in schema:
> a0:bytearray,a0:bytearray,a2:bytearray,a3:bytearray.
> {panel}
> At least two issues.
> # Error should be about FOREACH for C referencing non-existing field 'b1'.
> But the error message is saying something about 'a0'.
> # Script itself is not using relation C and D at all. It's confusing to see
> errors coming out of unused relations.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
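
The difference between check1 and check2 described in the comment above can be illustrated with a toy model. This is only a sketch and does not use any Pig classes: `ValidationTimingSketch`, `Relation`, `eagerErrors`, `lazyErrors` and `exampleScript` are hypothetical names made up for illustration, and the range projection `a0..a2` in D is simplified to its two endpoints. The eager pass plays the role of check1 (every relation is validated as the script is read), and the lazy pass plays the role of check2 (only the stored relation and its ancestors are validated).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only; none of these types exist in Pig.
class ValidationTimingSketch {

    /** A relation: its alias, its input alias (null for a load), and its fields. */
    public static final class Relation {
        final String name;
        final String input;
        final List<String> fields;

        Relation(String name, String input, String... fields) {
            this.name = name;
            this.input = input;
            this.fields = Arrays.asList(fields);
        }
    }

    /** Returns an error message if r projects a field its input lacks, else null. */
    static String check(Relation r, Map<String, Relation> script) {
        if (r.input == null) {
            return null; // a load declares its own schema
        }
        List<String> available = script.get(r.input).fields;
        for (String f : r.fields) {
            if (!available.contains(f)) {
                return r.name + ": field [" + f + "] does not exist in " + r.input;
            }
        }
        return null;
    }

    /** check1-like: validate every relation, used or not. */
    public static List<String> eagerErrors(Map<String, Relation> script) {
        List<String> errors = new ArrayList<>();
        for (Relation r : script.values()) {
            String e = check(r, script);
            if (e != null) {
                errors.add(e);
            }
        }
        return errors;
    }

    /** check2-like: validate only the stored relation and its ancestors. */
    public static List<String> lazyErrors(Map<String, Relation> script, String stored) {
        List<String> errors = new ArrayList<>();
        for (String n = stored; n != null; n = script.get(n).input) {
            String e = check(script.get(n), script);
            if (e != null) {
                errors.add(e);
            }
        }
        return errors;
    }

    /** The script from the issue description, with D's range reduced to a0 and a2. */
    public static Map<String, Relation> exampleScript() {
        Map<String, Relation> script = new LinkedHashMap<>();
        script.put("A", new Relation("A", null, "a0", "a1", "a2", "a3"));
        script.put("B", new Relation("B", "A", "a0", "a1", "a2", "a3"));
        script.put("C", new Relation("C", "A", "a0", "b1", "a2", "a3"));
        script.put("D", new Relation("D", "C", "a0", "a2"));
        return script;
    }

    public static void main(String[] args) {
        Map<String, Relation> script = exampleScript();
        // Eager pass reports the bad field b1 in C even though C is never stored.
        System.out.println("eager: " + eagerErrors(script));
        // Lazy pass from the stored relation B never visits C or D.
        System.out.println("lazy:  " + lazyErrors(script, "B"));
    }
}
```

In this model, pattern (b) corresponds to moving an error from the lazy pass into the eager pass, which is why scripts with mistakes only in unused relations would start to fail.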