[jira] [Comment Edited] (PIG-5375) NullPointerException for multi-level self unions with Tez UnionOptimizer
[ https://issues.apache.org/jira/browse/PIG-5375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16744494#comment-16744494 ]

Koji Noguchi edited comment on PIG-5375 at 1/16/19 10:08 PM:
-

For my reference (and maybe for the reviewer), I am attaching how the plan evolved with the vertex group created by the UnionOptimizer before my patch (using the script from the description).

https://issues.apache.org/jira/secure/attachment/12955153/pig-5375.png

Note the bottom of the vertex group output

{panel}
Tez vertex group scope-87 <- [{color:#FF}scope-56{color}, scope-53, {color:#FF}scope-56{color}, scope-54, scope-55] -> scope-81
{panel}

where scope-56 no longer exists.
What my patch does is, instead of replacing one scope-56 with scope-54, it replaces all three scope-56 with scope-54.

was (Author: knoguchi):
For my reference (and maybe for the reviewer) attaching how the plan evolved with vertexgroup by the UnionOptimizer before my patch. (using the script from the description)

!hadooppf-5375.png|thumbnail!

Note the bottom of the vertex group output

{panel}
Tez vertex group scope-87 <- [{color:#FF}scope-56{color}, scope-53, {color:#FF}scope-56{color}, scope-54, scope-55] -> scope-81
{panel}

where the scope-56 no longer exist.
What my patch does is, instead of replacing one scope-56 with scope-54, it replaces all three scope-56 with scope-54

> NullPointerException for multi-level self unions with Tez UnionOptimizer
>
>
> Key: PIG-5375
> URL: https://issues.apache.org/jira/browse/PIG-5375
> Project: Pig
> Issue Type: Bug
> Reporter: Koji Noguchi
> Assignee: Koji Noguchi
> Priority: Major
> Attachments: pig-5375-v1.patch, pig-5375.png
>
>
> {code}
> A = load 'input.txt' as (a0:int, a1: chararray, a2:int);
> B = load 'input.txt' as (a0:int, a1: chararray, a2:int);
> C = load 'input.txt' as (a0:int, a1: chararray, a2:int);
> A_and_B = UNION A, B;
> SPLIT A_and_B INTO A_and_B2 IF a0 > 10, A_and_B3 OTHERWISE;
> A_and_B_and_C = UNION ONSCHEMA C, A_and_B;
> X = UNION ONSCHEMA A_and_B_and_C, A_and_B2, A_and_B3;
> X2 = GROUP X ALL ;
> dump X2;
> {code}
> This fails _on Tez_ with
> {noformat}
> Pig Stack Trace
> ---
> ERROR 1002: Unable to store alias X2
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias X2
>     at org.apache.pig.PigServer.openIterator(PigServer.java:1024)
>     at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:790)
>     at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:383)
>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:231)
>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:206)
>     at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
>     at org.apache.pig.Main.run(Main.java:630)
>     at org.apache.pig.Main.main(Main.java:175)
> Caused by: org.apache.pig.PigException: ERROR 1002: Unable to store alias X2
>     at org.apache.pig.PigServer.storeEx(PigServer.java:1127)
>     at org.apache.pig.PigServer.store(PigServer.java:1086)
>     at org.apache.pig.PigServer.openIterator(PigServer.java:999)
>     ... 7 more
> Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 0: java.lang.NullPointerException
>     at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:296)
>     at org.apache.pig.PigServer.launchPlan(PigServer.java:1479)
>     at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1464)
>     at org.apache.pig.PigServer.storeEx(PigServer.java:1123)
>     ... 9 more
> Caused by: java.lang.NullPointerException
>     at org.apache.pig.backend.hadoop.executionengine.tez.plan.TezPOPackageAnnotator.patchPackage(TezPOPackageAnnotator.java:97)
>     at org.apache.pig.backend.hadoop.executionengine.tez.plan.TezPOPackageAnnotator.handlePackage(TezPOPackageAnnotator.java:78)
>     at org.apache.pig.backend.hadoop.executionengine.tez.plan.TezPOPackageAnnotator.visitTezOp(TezPOPackageAnnotator.java:61)
>     at org.apache.pig.backend.hadoop.executionengine.tez.plan.TezOperator.visit(TezOperator.java:265)
>     at org.apache.pig.backend.hadoop.executionengine.tez.plan.TezOperator.visit(TezOperator.java:56)
>     at org.apache.pig.impl.plan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:69)
>     at org.apache.pig.impl.plan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:71)
>     at org.apache.pig.impl.plan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:71)
>     at
[jira] [Updated] (PIG-5375) NullPointerException for multi-level self unions with Tez UnionOptimizer
[ https://issues.apache.org/jira/browse/PIG-5375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Koji Noguchi updated PIG-5375:
--
Attachment: hadooppf-5375.pdf

For my reference (and maybe for the reviewer), I am attaching how the plan evolved with the vertex group created by the UnionOptimizer before my patch (using the script from the description).

!hadooppf-5375.pdf|thumbnail!

Note the bottom of the vertex group output

{panel}
Tez vertex group scope-87 <- [{color:#FF}scope-56{color}, scope-53, {color:#FF}scope-56{color}, scope-54, scope-55] -> scope-81
{panel}

where scope-56 no longer exists.
What my patch does is, instead of replacing one scope-56 with scope-54, it replaces all three scope-56 with scope-54.

> NullPointerException for multi-level self unions with Tez UnionOptimizer
>
>
> Key: PIG-5375
> URL: https://issues.apache.org/jira/browse/PIG-5375
> Project: Pig
> Issue Type: Bug
> Reporter: Koji Noguchi
> Assignee: Koji Noguchi
> Priority: Major
> Attachments: hadooppf-5375.pdf, pig-5375-v1.patch
>
>
> {code}
> A = load 'input.txt' as (a0:int, a1: chararray, a2:int);
> B = load 'input.txt' as (a0:int, a1: chararray, a2:int);
> C = load 'input.txt' as (a0:int, a1: chararray, a2:int);
> A_and_B = UNION A, B;
> SPLIT A_and_B INTO A_and_B2 IF a0 > 10, A_and_B3 OTHERWISE;
> A_and_B_and_C = UNION ONSCHEMA C, A_and_B;
> X = UNION ONSCHEMA A_and_B_and_C, A_and_B2, A_and_B3;
> X2 = GROUP X ALL ;
> dump X2;
> {code}
> This fails _on Tez_ with
> {noformat}
> Pig Stack Trace
> ---
> ERROR 1002: Unable to store alias X2
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias X2
>     at org.apache.pig.PigServer.openIterator(PigServer.java:1024)
>     at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:790)
>     at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:383)
>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:231)
>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:206)
>     at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
>     at org.apache.pig.Main.run(Main.java:630)
>     at org.apache.pig.Main.main(Main.java:175)
> Caused by: org.apache.pig.PigException: ERROR 1002: Unable to store alias X2
>     at org.apache.pig.PigServer.storeEx(PigServer.java:1127)
>     at org.apache.pig.PigServer.store(PigServer.java:1086)
>     at org.apache.pig.PigServer.openIterator(PigServer.java:999)
>     ... 7 more
> Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 0: java.lang.NullPointerException
>     at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:296)
>     at org.apache.pig.PigServer.launchPlan(PigServer.java:1479)
>     at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1464)
>     at org.apache.pig.PigServer.storeEx(PigServer.java:1123)
>     ... 9 more
> Caused by: java.lang.NullPointerException
>     at org.apache.pig.backend.hadoop.executionengine.tez.plan.TezPOPackageAnnotator.patchPackage(TezPOPackageAnnotator.java:97)
>     at org.apache.pig.backend.hadoop.executionengine.tez.plan.TezPOPackageAnnotator.handlePackage(TezPOPackageAnnotator.java:78)
>     at org.apache.pig.backend.hadoop.executionengine.tez.plan.TezPOPackageAnnotator.visitTezOp(TezPOPackageAnnotator.java:61)
>     at org.apache.pig.backend.hadoop.executionengine.tez.plan.TezOperator.visit(TezOperator.java:265)
>     at org.apache.pig.backend.hadoop.executionengine.tez.plan.TezOperator.visit(TezOperator.java:56)
>     at org.apache.pig.impl.plan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:69)
>     at org.apache.pig.impl.plan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:71)
>     at org.apache.pig.impl.plan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:71)
>     at org.apache.pig.impl.plan.DepthFirstWalker.walk(DepthFirstWalker.java:52)
>     at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:46)
>     at org.apache.pig.backend.hadoop.executionengine.tez.TezLauncher.launchPig(TezLauncher.java:197)
>     at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:290)
>     ... 12 more
> {noformat}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
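
The comment in the message above says the patch replaces every stale scope-56 reference with scope-54 rather than only one of them. A minimal Java sketch of that difference follows. It is not the UnionOptimizer code and not the contents of pig-5375-v1.patch: the class name `VertexGroupRewrite`, both method names, and the use of a plain list of strings for the vertex group members are assumptions made purely for illustration, and the starting list is a hypothetical state with three scope-56 entries.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; the names below do not come from Pig.
public class VertexGroupRewrite {

    // Replaces only the first occurrence of staleId, leaving later
    // occurrences pointing at an operator that no longer exists.
    public static List<String> replaceFirst(List<String> members, String staleId, String newId) {
        List<String> result = new ArrayList<>(members);
        int idx = result.indexOf(staleId);
        if (idx >= 0) {
            result.set(idx, newId);
        }
        return result;
    }

    // Replaces every occurrence of staleId, so no stale reference survives.
    public static List<String> replaceEvery(List<String> members, String staleId, String newId) {
        List<String> result = new ArrayList<>(members);
        result.replaceAll(id -> id.equals(staleId) ? newId : id);
        return result;
    }

    public static void main(String[] args) {
        // Assumed starting point: three references to scope-56.
        List<String> members = Arrays.asList(
                "scope-56", "scope-53", "scope-56", "scope-56", "scope-55");

        // Two scope-56 entries remain after a single replacement.
        System.out.println(replaceFirst(members, "scope-56", "scope-54"));

        // No scope-56 entry remains after replacing all of them.
        System.out.println(replaceEvery(members, "scope-56", "scope-54"));
    }
}
```

With a single replacement, any later lookup of the leftover scope-56 entries would find nothing, which is consistent with the kind of failure in the stack trace, although the exact path to the NullPointerException in TezPOPackageAnnotator is not spelled out in these messages.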
[jira] [Updated] (PIG-5375) NullPointerException for multi-level self unions with Tez UnionOptimizer
[ https://issues.apache.org/jira/browse/PIG-5375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Koji Noguchi updated PIG-5375:
--
Attachment: hadooppf-5375.png

> NullPointerException for multi-level self unions with Tez UnionOptimizer
>
>
> Key: PIG-5375
> URL: https://issues.apache.org/jira/browse/PIG-5375
> Project: Pig
> Issue Type: Bug
> Reporter: Koji Noguchi
> Assignee: Koji Noguchi
> Priority: Major
> Attachments: hadooppf-5375.pdf, hadooppf-5375.png, pig-5375-v1.patch
[jira] [Updated] (PIG-5375) NullPointerException for multi-level self unions with Tez UnionOptimizer
[ https://issues.apache.org/jira/browse/PIG-5375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Koji Noguchi updated PIG-5375:
--
Attachment: pig-5375.png

> NullPointerException for multi-level self unions with Tez UnionOptimizer
>
>
> Key: PIG-5375
> URL: https://issues.apache.org/jira/browse/PIG-5375
> Project: Pig
> Issue Type: Bug
> Reporter: Koji Noguchi
> Assignee: Koji Noguchi
> Priority: Major
> Attachments: pig-5375-v1.patch, pig-5375.png
[jira] [Updated] (PIG-5375) NullPointerException for multi-level self unions with Tez UnionOptimizer
[ https://issues.apache.org/jira/browse/PIG-5375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Koji Noguchi updated PIG-5375:
--
Attachment: (was: hadooppf-5375.png)

> NullPointerException for multi-level self unions with Tez UnionOptimizer
>
>
> Key: PIG-5375
> URL: https://issues.apache.org/jira/browse/PIG-5375
> Project: Pig
> Issue Type: Bug
> Reporter: Koji Noguchi
> Assignee: Koji Noguchi
> Priority: Major
> Attachments: pig-5375-v1.patch
[jira] [Comment Edited] (PIG-5375) NullPointerException for multi-level self unions with Tez UnionOptimizer
[ https://issues.apache.org/jira/browse/PIG-5375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16744494#comment-16744494 ]

Koji Noguchi edited comment on PIG-5375 at 1/16/19 10:07 PM:
-

For my reference (and maybe for the reviewer), I am attaching how the plan evolved with the vertex group created by the UnionOptimizer before my patch (using the script from the description).

!hadooppf-5375.png|thumbnail!

Note the bottom of the vertex group output

{panel}
Tez vertex group scope-87 <- [{color:#FF}scope-56{color}, scope-53, {color:#FF}scope-56{color}, scope-54, scope-55] -> scope-81
{panel}

where scope-56 no longer exists.
What my patch does is, instead of replacing one scope-56 with scope-54, it replaces all three scope-56 with scope-54.

was (Author: knoguchi):
For my reference (and maybe for the reviewer) attaching how the plan evolved with vertexgroup by the UnionOptimizer before my patch. (using the script from the description)

!hadooppf-5375.pdf|thumbnail!

Note the bottom of the vertex group output

{panel}
Tez vertex group scope-87 <- [{color:#FF}scope-56{color}, scope-53, {color:#FF}scope-56{color}, scope-54, scope-55] -> scope-81
{panel}

where the scope-56 no longer exist.
What my patch does is, instead of replacing one scope-56 with scope-54, it replaces all three scope-56 with scope-54

> NullPointerException for multi-level self unions with Tez UnionOptimizer
>
>
> Key: PIG-5375
> URL: https://issues.apache.org/jira/browse/PIG-5375
> Project: Pig
> Issue Type: Bug
> Reporter: Koji Noguchi
> Assignee: Koji Noguchi
> Priority: Major
> Attachments: pig-5375-v1.patch
[jira] [Updated] (PIG-5375) NullPointerException for multi-level self unions with Tez UnionOptimizer
[ https://issues.apache.org/jira/browse/PIG-5375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Koji Noguchi updated PIG-5375:
--
Attachment: (was: hadooppf-5375.pdf)

> NullPointerException for multi-level self unions with Tez UnionOptimizer
>
>
> Key: PIG-5375
> URL: https://issues.apache.org/jira/browse/PIG-5375
> Project: Pig
> Issue Type: Bug
> Reporter: Koji Noguchi
> Assignee: Koji Noguchi
> Priority: Major
> Attachments: pig-5375-v1.patch
[jira] Subscription: PIG patch available
Issue Subscription
Filter: PIG patch available (37 issues)

Subscriber: pigdaily

Key         Summary
PIG-5377    Move supportsParallelWriteToStoreLocation from StoreFunc to StoreFuncInterfce
            https://issues.apache.org/jira/browse/PIG-5377
PIG-5369    Add llap-client dependency
            https://issues.apache.org/jira/browse/PIG-5369
PIG-5360    Pig sets working directory of input file systems causes exception thrown
            https://issues.apache.org/jira/browse/PIG-5360
PIG-5338    Prevent deep copy of DataBag into Jython List
            https://issues.apache.org/jira/browse/PIG-5338
PIG-5323    Implement LastInputStreamingOptimizer in Tez
            https://issues.apache.org/jira/browse/PIG-5323
PIG-5273    _SUCCESS file should be created at the end of the job
            https://issues.apache.org/jira/browse/PIG-5273
PIG-5267    Review of org.apache.pig.impl.io.BufferedPositionedInputStream
            https://issues.apache.org/jira/browse/PIG-5267
PIG-5256    Bytecode generation for POFilter and POForeach
            https://issues.apache.org/jira/browse/PIG-5256
PIG-5160    SchemaTupleFrontend.java is not thread safe, cause PigServer thrown NPE in multithread env
            https://issues.apache.org/jira/browse/PIG-5160
PIG-5115    Builtin AvroStorage generates incorrect avro schema when the same pig field name appears in the alias
            https://issues.apache.org/jira/browse/PIG-5115
PIG-5106    Optimize when mapreduce.input.fileinputformat.input.dir.recursive set to true
            https://issues.apache.org/jira/browse/PIG-5106
PIG-5081    Can not run pig on spark source code distribution
            https://issues.apache.org/jira/browse/PIG-5081
PIG-5080    Support store alias as spark table
            https://issues.apache.org/jira/browse/PIG-5080
PIG-5057    IndexOutOfBoundsException when pig reducer processOnePackageOutput
            https://issues.apache.org/jira/browse/PIG-5057
PIG-5029    Optimize sort case when data is skewed
            https://issues.apache.org/jira/browse/PIG-5029
PIG-4926    Modify the content of start.xml for spark mode
            https://issues.apache.org/jira/browse/PIG-4926
PIG-4913    Reduce jython function initiation during compilation
            https://issues.apache.org/jira/browse/PIG-4913
PIG-4849    pig on tez will cause tez-ui to crash,because the content from timeline server is too long.
            https://issues.apache.org/jira/browse/PIG-4849
PIG-4750    REPLACE_MULTI should compile Pattern once and reuse it
            https://issues.apache.org/jira/browse/PIG-4750
PIG-4684    Exception should be changed to warning when job diagnostics cannot be fetched
            https://issues.apache.org/jira/browse/PIG-4684
PIG-4656    Improve String serialization and comparator performance in BinInterSedes
            https://issues.apache.org/jira/browse/PIG-4656
PIG-4598    Allow user defined plan optimizer rules
            https://issues.apache.org/jira/browse/PIG-4598
PIG-4551    Partition filter is not pushed down in case of SPLIT
            https://issues.apache.org/jira/browse/PIG-4551
PIG-4539    New PigUnit
            https://issues.apache.org/jira/browse/PIG-4539
PIG-4515    org.apache.pig.builtin.Distinct throws ClassCastException
            https://issues.apache.org/jira/browse/PIG-4515
PIG-4373    Implement PIG-3861 in Tez
            https://issues.apache.org/jira/browse/PIG-4373
PIG-4323    PackageConverter hanging in Spark
            https://issues.apache.org/jira/browse/PIG-4323
PIG-4313    StackOverflowError in LIMIT operation on Spark
            https://issues.apache.org/jira/browse/PIG-4313
PIG-4251    Pig on Storm
            https://issues.apache.org/jira/browse/PIG-4251
PIG-4002    Disable combiner when map-side aggregation is used
            https://issues.apache.org/jira/browse/PIG-4002
PIG-3952    PigStorage accepts '-tagSplit' to return full split information
            https://issues.apache.org/jira/browse/PIG-3952
PIG-3911    Define unique fields with @OutputSchema
            https://issues.apache.org/jira/browse/PIG-3911
PIG-3877    Getting Geo Latitude/Longitude from Address Lines
            https://issues.apache.org/jira/browse/PIG-3877
PIG-3873    Geo distance calculation using Haversine
            https://issues.apache.org/jira/browse/PIG-3873
PIG-3668    COR built-in function when atleast one of the coefficient values is NaN
            https://issues.apache.org/jira/browse/PIG-3668
PIG-3587    add functionality for rolling over dates
            https://issues.apache.org/jira/browse/PIG-3587
PIG-1804    Alow Jython function to implement Algebraic and/or Accumulator interfaces
            https://issues.apache.org/jira/browse/PIG-1804

You may edit this subscription at:
https://issues.apache.org/jira/secure/EditSubscription!default.jspa?subId=16328=12322384