[ 
https://issues.apache.org/jira/browse/PIG-5335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Noguchi updated PIG-5335:
------------------------------
    Attachment: pig-5335-v02.patch

Uploading a new patch with an updated comment.
 Basically, with the uid left as -1, 
{{TestColumnAliasConversion.testInvalidNestedProjection}} started failing at 
LogicalPlanBuilder time.

To summarize: for this type of script error, there are essentially two places 
where the error is caught.
 (check1) LogicalPlanBuilder.buildForeachOp
 (check2) LogicalPlan.validate

Check1 runs when the script is read and errors out regardless of whether the 
corresponding relation is used. 
Check2 runs much later and is only performed for relations that are part of 
Dump/Store. 
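
The difference between the two checks can be illustrated with a toy model. This is only a sketch under stated assumptions: {{TwoStageCheckSketch}}, {{Relation}}, {{eagerCheck}} and {{lazyCheck}} are hypothetical names for illustration, not Pig classes, and the "valid" flag stands in for whatever the real visitors detect.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: none of these names are Pig classes.
class TwoStageCheckSketch {

    /** One statement: its alias, the alias it reads from (null for a load), and whether it is valid. */
    static final class Relation {
        final String alias;
        final String input;
        final boolean valid;

        Relation(String alias, String input, boolean valid) {
            this.alias = alias;
            this.input = input;
            this.valid = valid;
        }
    }

    /** check1 analogue: looks at every relation as the script is read, used or not. */
    static List<String> eagerCheck(List<Relation> script) {
        List<String> errors = new ArrayList<>();
        for (Relation r : script) {
            if (!r.valid) {
                errors.add(r.alias);
            }
        }
        return errors;
    }

    /** check2 analogue: looks only at relations reachable from a stored/dumped alias. */
    static List<String> lazyCheck(List<Relation> script, Set<String> stored) {
        Map<String, Relation> byAlias = new HashMap<>();
        for (Relation r : script) {
            byAlias.put(r.alias, r);
        }
        List<String> errors = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        Deque<String> todo = new ArrayDeque<>(stored);
        while (!todo.isEmpty()) {
            String alias = todo.pop();
            Relation r = byAlias.get(alias);
            if (r == null || !seen.add(alias)) {
                continue; // unknown alias or already visited
            }
            if (!r.valid) {
                errors.add(alias);
            }
            if (r.input != null) {
                todo.push(r.input); // walk upstream towards the load
            }
        }
        return errors;
    }

    /** Shape of the script in the issue description: only B is stored; C references a missing field. */
    static List<Relation> issueScript() {
        return Arrays.asList(
            new Relation("A", null, true),
            new Relation("B", "A", true),
            new Relation("C", "A", false),
            new Relation("D", "C", true));
    }

    public static void main(String[] args) {
        List<Relation> script = issueScript();
        System.out.println("eager: " + eagerCheck(script));
        System.out.println("lazy:  " + lazyCheck(script, Collections.singleton("B")));
    }
}
```

In this model the eager check reports C even though nothing stores it, while the lazy check, started from the stored alias B, reports nothing.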

 For this jira, I tried several patterns.
 (a) Move ProjectStarExpander and ProjStarInUdfExpander from check1 to check2. 
 This didn't work because other parts of the code within check1 depend on 
these visitors. 
 (b) Throw an exception in ProjectExpression.java when an invalid field is 
referenced. 
 This moved a bunch of negative tests that used to fail in check2 to check1. 
(Meaning users may start to see their pig scripts fail because the pig 
compiler catches errors in unused relations.)
(c) Create the invalid field with uid=-1. 
This mostly worked, but TestColumnAliasConversion.testInvalidNestedProjection 
{code: title=TestColumnAliasConversion.java}
        String query = "A = load 'x' as (field);" +
                       "B = foreach A {" +
                       "  C = LIMIT invalidName 1;" +
                       "  generate C.foo;" +
                       "};";
{code}
started to fail at check1 instead of check2 due to 
{noformat}
<line 1, column 28> pig script failed to validate: 
org.apache.pig.impl.plan.PlanValidationException: ERROR 2271: Logical plan 
invalid state: invalid uid -1 in schema : invalidName#-1:bytearray
        at org.apache.pig.parser.LogicalPlanBuilder.buildForeachOp(LogicalPlanBuilder.java:1066)
        at org.apache.pig.parser.LogicalPlanGenerator.foreach_clause(LogicalPlanGenerator.java:15896)
        at org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1933)
        at org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:1102)
        at org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:560)
        at org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:421)
        at org.apache.pig.parser.ParserTestingUtils.generateLogicalPlan(ParserTestingUtils.java:76)
        at org.apache.pig.parser.TestColumnAliasConversion.validate(TestColumnAliasConversion.java:179)
        at org.apache.pig.parser.TestColumnAliasConversion.testInvalidNestedProjection(TestColumnAliasConversion.java:170)
Caused by: org.apache.pig.impl.plan.PlanValidationException: ERROR 2271: Logical plan invalid state: invalid uid -1 in schema : invalidName#-1:bytearray
        at org.apache.pig.newplan.logical.optimizer.SchemaResetter.validate(SchemaResetter.java:243)
        at org.apache.pig.newplan.logical.optimizer.SchemaResetter.visit(SchemaResetter.java:195)
        at org.apache.pig.newplan.logical.relational.LOLimit.accept(LOLimit.java:79)
        at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
        at org.apache.pig.newplan.logical.optimizer.SchemaResetter.visit(SchemaResetter.java:114)
        at org.apache.pig.parser.LogicalPlanBuilder.buildForeachOp(LogicalPlanBuilder.java:1064)
{noformat}

(d) Then I gave a new uid to this fake field so that the above test case would 
again fail at check2.

(d) is my patch (both v1 and v2).
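
The contrast between (c) and (d) can be sketched with a toy model. Everything here is an assumption for illustration: {{PlaceholderUidSketch}}, {{Field}}, the {{placeholder}} flag and the method names are hypothetical, and how the real patch makes check2 reject the fake field is not shown in this comment. The only detail taken from the trace above is that the early schema check rejects uid -1.

```java
import java.util.concurrent.atomic.AtomicLong;

// Toy model only: none of these names are Pig classes.
class PlaceholderUidSketch {

    /** A schema field; "placeholder" marks a field invented for an unresolved name (hypothetical flag). */
    static final class Field {
        final String alias;
        final long uid;
        final boolean placeholder;

        Field(String alias, long uid, boolean placeholder) {
            this.alias = alias;
            this.uid = uid;
            this.placeholder = placeholder;
        }
    }

    private static final AtomicLong NEXT_UID = new AtomicLong(1);

    /** Pattern (c): the fake field keeps the sentinel uid -1. */
    static Field placeholderWithSentinel(String alias) {
        return new Field(alias, -1, true);
    }

    /** Pattern (d): the fake field is given a fresh, valid-looking uid. */
    static Field placeholderWithFreshUid(String alias) {
        return new Field(alias, NEXT_UID.getAndIncrement(), true);
    }

    /** check1 analogue: a schema sanity check that rejects negative uids. */
    static void earlySchemaCheck(Field f) {
        if (f.uid < 0) {
            throw new IllegalStateException(
                "invalid uid " + f.uid + " in schema : " + f.alias);
        }
    }

    /** check2 analogue (assumed): later validation rejects any field that never resolved. */
    static void lateValidate(Field f) {
        if (f.placeholder) {
            throw new IllegalArgumentException("Invalid field projection: " + f.alias);
        }
    }

    /** Returns which stage rejects the field: "check1", "check2", or "none". */
    static String failingStage(Field f) {
        try {
            earlySchemaCheck(f);
        } catch (IllegalStateException e) {
            return "check1";
        }
        try {
            lateValidate(f);
        } catch (IllegalArgumentException e) {
            return "check2";
        }
        return "none";
    }

    public static void main(String[] args) {
        System.out.println("(c): " + failingStage(placeholderWithSentinel("invalidName")));
        System.out.println("(d): " + failingStage(placeholderWithFreshUid("invalidName")));
    }
}
```

In this model the sentinel uid trips the early check, while the fresh uid lets the field pass through to the later validation, which is the behaviour the test case expects.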


 

> Error message from range projection completely misleading
> ---------------------------------------------------------
>
>                 Key: PIG-5335
>                 URL: https://issues.apache.org/jira/browse/PIG-5335
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Koji Noguchi
>            Assignee: Koji Noguchi
>            Priority: Major
>         Attachments: pig-5335-v01.patch, pig-5335-v02.patch
>
>
> {code}
> A = load 'input.txt' as (a0,a1,a2,a3);
> B = FOREACH A GENERATE a0, a1, a2, a3;
> store B into '/tmp/deleteme';
> C = FOREACH A GENERATE a0, b1, a2, a3;
> D = FOREACH C GENERATE a0..a2;
> (end of script, no store, nothing)
> {code}
> Error message
> {panel}
> 2018-04-10 10:22:33,360 \[main] ERROR org.apache.pig.PigServer - exception 
> during parsing: Error during parsing. Invalid field projection. Projected 
> field \[a0] does not exist in schema: 
> a0:bytearray,a0:bytearray,a2:bytearray,a3:bytearray.
> {panel}
> At least two issues.
> # Error should be about FOREACH for C referencing non-existing field 'b1'.  
> But the error message is saying something about 'a0'.
> # Script itself is not using relation C and D at all.  It's confusing to see 
> errors coming out of unused relations.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
