[jira] [Commented] (PIG-5386) Pig local mode with bundled Hadoop broken
[ https://issues.apache.org/jira/browse/PIG-5386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16873028#comment-16873028 ]

Nandor Kollar commented on PIG-5386:
------------------------------------

Committed to trunk, thanks Rohini for the review!

> Pig local mode with bundled Hadoop broken
> -----------------------------------------
>
>                 Key: PIG-5386
>                 URL: https://issues.apache.org/jira/browse/PIG-5386
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.17.0
>            Reporter: Nandor Kollar
>            Assignee: Nandor Kollar
>            Priority: Minor
>             Fix For: 0.18.0
>
>         Attachments: PIG-5386_1.patch
>
>
> After compiling Pig, local mode doesn't work without installing Hadoop
> (it is expected to use the bundled Hadoop), because commons-lang is not
> copied to the h2 folder (only commons-lang3 is, but the bundled mode
> requires commons-lang too). I think it was broken by PIG-5317.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
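The missing-jar failure described above surfaces at runtime as a class-resolution failure for the commons-lang 2.x packages. A minimal, hypothetical probe (not part of Pig; only the `org.apache.commons.lang` / `org.apache.commons.lang3` class names are taken from the real libraries) that reports which commons-lang generation is resolvable on the current classpath:

```java
// Hypothetical classpath probe, not Pig code: reports whether commons-lang 2.x
// (org.apache.commons.lang) and commons-lang3 (org.apache.commons.lang3) can
// be loaded. Bundled-Hadoop local mode needs both generations present.
public class LangProbe {
    static boolean present(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("lang2: " + present("org.apache.commons.lang.StringUtils"));
        System.out.println("lang3: " + present("org.apache.commons.lang3.StringUtils"));
    }
}
```

Run against the classpath that the h2 folder provides; before a fix along the lines of PIG-5386_1.patch, it would presumably report `lang2: false`, matching the missing jar the report describes.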
[jira] [Updated] (PIG-5386) Pig local mode with bundled Hadoop broken
[ https://issues.apache.org/jira/browse/PIG-5386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nandor Kollar updated PIG-5386:
-------------------------------
    Fix Version/s: 0.18.0
[jira] [Updated] (PIG-5386) Pig local mode with bundled Hadoop broken
[ https://issues.apache.org/jira/browse/PIG-5386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nandor Kollar updated PIG-5386:
-------------------------------
       Resolution: Fixed
           Status: Resolved  (was: Patch Available)
[jira] [Updated] (PIG-5387) Test failures on JRE 11
[ https://issues.apache.org/jira/browse/PIG-5387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nandor Kollar updated PIG-5387:
-------------------------------
       Resolution: Fixed
    Fix Version/s: 0.18.0
           Status: Resolved  (was: Patch Available)

> Test failures on JRE 11
> -----------------------
>
>                 Key: PIG-5387
>                 URL: https://issues.apache.org/jira/browse/PIG-5387
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.17.0
>            Reporter: Nandor Kollar
>            Assignee: Nandor Kollar
>            Priority: Major
>             Fix For: 0.18.0
>
>         Attachments: PIG-5387_1.patch, PIG-5387_2.patch, PIG-5387_3.patch
>
>
> I compiled Pig with JDK 8 and executed the tests with Java 11, and ran into
> several test failures. For example TestCommit#testCheckin2 failed with the
> following exception:
> {code}
> 2019-05-08 16:06:09,712 WARN  [Thread-108] mapred.LocalJobRunner (LocalJobRunner.java:run(590)) - job_local1000317333_0003
> java.lang.Exception: java.io.IOException: Deserialization error: null
> 	at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492)
> 	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:552)
> Caused by: java.io.IOException: Deserialization error: null
> 	at org.apache.pig.impl.util.ObjectSerializer.deserialize(ObjectSerializer.java:62)
> 	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.setup(PigGenericMapBase.java:183)
> 	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
> 	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
> 	at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
> 	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
> 	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> 	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> 	at java.base/java.lang.Thread.run(Thread.java:834)
> Caused by: java.lang.NullPointerException
> 	at org.apache.pig.impl.plan.Operator.hashCode(Operator.java:106)
> 	at java.base/java.util.HashMap.hash(HashMap.java:339)
> 	at java.base/java.util.HashMap.readObject(HashMap.java:1461)
> 	at java.base/jdk.internal.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
> 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
> 	at java.base/java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1160)
> {code}
> Deserialization of one of the map plans failed; it appears we ran into
> [JDK-8201131|https://bugs.openjdk.java.net/browse/JDK-8201131]. It seems that
> the workaround in the issue report works: adding a readObject method to
> org.apache.pig.impl.plan.Operator
> {code}
> private void readObject(ObjectInputStream in) throws ClassNotFoundException, IOException {
>     in.defaultReadObject();
> }
> {code}
> solves the problem, however I'm not sure that this is the optimal solution.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
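The defensive readObject from the description can be exercised with a plain serialization round trip. The sketch below is not Pig's Operator class; it only mirrors the pattern under discussion: a HashMap key class that declares an explicit readObject delegating to defaultReadObject(), then a write/read cycle through the standard Java serialization streams.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Minimal sketch (hypothetical class names, not Pig's Operator) of the
// JDK-8201131 workaround: an explicit readObject that simply delegates to
// defaultReadObject(), keeping default deserialization behavior.
public class ReadObjectSketch {
    static class Key implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String name;
        Key(String name) { this.name = name; }
        @Override public int hashCode() { return name.hashCode(); }
        @Override public boolean equals(Object o) {
            return o instanceof Key && ((Key) o).name.equals(name);
        }
        // The workaround pattern from the issue report.
        private void readObject(ObjectInputStream in)
                throws ClassNotFoundException, IOException {
            in.defaultReadObject();
        }
    }

    @SuppressWarnings("unchecked")
    public static void main(String[] args) throws Exception {
        Map<Key, Integer> map = new HashMap<>();
        map.put(new Key("a"), 1);

        // Serialize the map, then deserialize it from the same bytes.
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(map);
        }
        Map<Key, Integer> copy;
        try (ObjectInputStream ois = new ObjectInputStream(
                new ByteArrayInputStream(bos.toByteArray()))) {
            copy = (Map<Key, Integer>) ois.readObject();
        }
        System.out.println(copy.get(new Key("a"))); // prints 1
    }
}
```

This round trip succeeds regardless of JDK version; the Pig failure needed the more involved object graph of a serialized map plan, which this sketch does not attempt to reproduce.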
[jira] [Commented] (PIG-5387) Test failures on JRE 11
[ https://issues.apache.org/jira/browse/PIG-5387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16844757#comment-16844757 ]

Nandor Kollar commented on PIG-5387:
------------------------------------

Committed to trunk, thanks Rohini, Adam and Koji for the reviews!
[jira] [Updated] (PIG-5387) Test failures on JRE 11
[ https://issues.apache.org/jira/browse/PIG-5387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nandor Kollar updated PIG-5387:
-------------------------------
    Attachment: PIG-5387_3.patch
[jira] [Created] (PIG-5388) Upgrade to Avro 1.9.x
Nandor Kollar created PIG-5388:
----------------------------------

             Summary: Upgrade to Avro 1.9.x
                 Key: PIG-5388
                 URL: https://issues.apache.org/jira/browse/PIG-5388
             Project: Pig
          Issue Type: Improvement
            Reporter: Nandor Kollar
[jira] [Commented] (PIG-5387) Test failures on JRE 11
[ https://issues.apache.org/jira/browse/PIG-5387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16838374#comment-16838374 ]

Nandor Kollar commented on PIG-5387:
------------------------------------

Attached PIG-5387_2.patch with a comment and the removal of registerNewResource from TestPigServerLocal. It seemed to me that registering the temp folder is unnecessary in this test case; it is required only in TestPigServer.
[jira] [Updated] (PIG-5387) Test failures on JRE 11
[ https://issues.apache.org/jira/browse/PIG-5387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nandor Kollar updated PIG-5387:
-------------------------------
    Attachment: PIG-5387_2.patch
[jira] [Commented] (PIG-5387) Test failures on JRE 11
[ https://issues.apache.org/jira/browse/PIG-5387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16838353#comment-16838353 ]

Nandor Kollar commented on PIG-5387:
------------------------------------

[~knoguchi], [~rohini] the class-loading-related change is an ugly hack which I'm not proud of, but I didn't find any easy fix that is compatible with Java 11 as well as Java 8, and this one seems to work. Fortunately it is only a test problem, not a production code issue. Up to Java 8 the system class loader ([AppClassLoader|https://github.com/openjdk/jdk/blob/jdk8-b120/jdk/src/share/classes/sun/misc/Launcher.java#L259]) extends URLClassLoader, thus casting it and registering the jar at runtime works fine. This has changed in Java 11 (I think since Java 9, but I didn't verify this): the system class loader no longer extends URLClassLoader (see [here|https://github.com/openjdk/jdk11u/blob/master/src/java.base/share/classes/jdk/internal/loader/ClassLoaders.java#L151]). I tried to create a new URLClassLoader and set the thread's context class loader to it, but Pig uses [getSystemResources|https://github.com/apache/pig/blob/trunk/src/org/apache/pig/PigServer.java#L575] to locate the jars, which asks the system class loader for the locations and won't find resources registered below it. That's why I came up with this hack: on Java 8 it works as before; on Java 11 it registers the new URLClassLoader as the parent of the system class loader, and so the jars are found. I know that this is an ugly hack. I guess the correct answer to this problem would be Java 11 module loading, but to remain compatible with Java 8 we would have to either introduce several reflective invocations or introduce something like a Java 11 shim. I don't think either option pays off for a single test fix like this one.

The only incompatibility in this case is the cast of the system class loader to URLClassLoader, and since we don't do it anywhere other than these two test classes, I don't think it causes any issue with registering commands in Pig. Refactoring registerNewResource into the Utils class is a good idea, as is adding a comment which briefly describes the aforementioned situation.
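The Java 8 vs. Java 9+ branching described in the comment can be sketched roughly as follows. The class and method names here are hypothetical, not the actual registerNewResource code, and the jar path is a placeholder:

```java
import java.lang.reflect.Method;
import java.net.URL;
import java.net.URLClassLoader;

// Hedged sketch of the version-dependent class-loader handling described
// above. Names (LoaderSketch, registerJar) are hypothetical.
public class LoaderSketch {
    static ClassLoader registerJar(URL jar) throws Exception {
        ClassLoader sys = ClassLoader.getSystemClassLoader();
        if (sys instanceof URLClassLoader) {
            // Java 8 path: the system loader IS a URLClassLoader, so the jar
            // can be added to it directly (addURL is protected, hence the
            // reflective call).
            Method addUrl = URLClassLoader.class.getDeclaredMethod("addURL", URL.class);
            addUrl.setAccessible(true);
            addUrl.invoke(sys, jar);
            return sys;
        }
        // Java 9+ path: the cast would fail, so wrap the jar in a new
        // URLClassLoader instead. Resources registered only here are not
        // visible through ClassLoader.getSystemResources, which is the
        // problem the comment above describes.
        return new URLClassLoader(new URL[] { jar }, sys);
    }

    public static void main(String[] args) throws Exception {
        URL jar = new URL("file:///tmp/example.jar"); // placeholder path
        ClassLoader cl = registerJar(jar);
        System.out.println(cl != null); // prints true on both branches
    }
}
```

On Java 9+, getSystemClassLoader() returns a jdk.internal.loader.ClassLoaders$AppClassLoader, so the instanceof check fails and the wrapper loader is used; that is exactly why code relying on getSystemResources cannot see jars registered this way.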
[jira] [Updated] (PIG-5387) Test failures on JRE 11
[ https://issues.apache.org/jira/browse/PIG-5387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nandor Kollar updated PIG-5387:
-------------------------------
    Status: Patch Available  (was: Open)
[jira] [Updated] (PIG-5387) Test failures on JRE 11
[ https://issues.apache.org/jira/browse/PIG-5387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nandor Kollar updated PIG-5387:
-------------------------------
    Attachment: PIG-5387_1.patch
[jira] [Created] (PIG-5387) Test failures on JRE 11
Nandor Kollar created PIG-5387: -- Summary: Test failures on JRE 11 Key: PIG-5387 URL: https://issues.apache.org/jira/browse/PIG-5387 Project: Pig Issue Type: Bug Affects Versions: 0.17.0 Reporter: Nandor Kollar I tried to compile Pig with JDK 8 and execute the test with Java 11, and faced with several test failures. For example TestCommit#testCheckin2 failed with the following exception: {code} 2019-05-08 16:06:09,712 WARN [Thread-108] mapred.LocalJobRunner (LocalJobRunner.java:run(590)) - job_local1000317333_0003 java.lang.Exception: java.io.IOException: Deserialization error: null at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492) at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:552) Caused by: java.io.IOException: Deserialization error: null at org.apache.pig.impl.util.ObjectSerializer.deserialize(ObjectSerializer.java:62) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.setup(PigGenericMapBase.java:183) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347) at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271) at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) at java.base/java.lang.Thread.run(Thread.java:834) Caused by: java.lang.NullPointerException at org.apache.pig.impl.plan.Operator.hashCode(Operator.java:106) at java.base/java.util.HashMap.hash(HashMap.java:339) at java.base/java.util.HashMap.readObject(HashMap.java:1461) at java.base/jdk.internal.reflect.GeneratedMethodAccessor12.invoke(Unknown Source) at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:566) at java.base/java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1160) {code} Deserialization of one of the map plans failed; it appears we ran into [JDK-8201131|https://bugs.openjdk.java.net/browse/JDK-8201131]. It seems that the workaround in the issue report works: adding a readObject method to org.apache.pig.impl.plan.Operator: {code} private void readObject(ObjectInputStream in) throws ClassNotFoundException, IOException { in.defaultReadObject(); } {code} solves the problem, however I'm not sure that this is the optimal solution. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
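As a self-contained illustration of the reported workaround (the {{Op}} class below is a hypothetical stand-in for org.apache.pig.impl.plan.Operator, not Pig's actual code): a Serializable HashMap key whose hashCode() dereferences an instance field is exactly the shape that NPEs when HashMap.readObject hashes a half-deserialized key on a JRE affected by JDK-8201131, and the explicit readObject from that report is the suggested fix:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Illustrative stand-in for a class like org.apache.pig.impl.plan.Operator:
// hashCode() dereferences an instance field, which is what turns into an NPE
// when HashMap.readObject hashes a key before the key is fully deserialized.
class Op implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String name;

    Op(String name) { this.name = name; }

    @Override public int hashCode() { return name.hashCode(); }

    @Override public boolean equals(Object o) {
        return o instanceof Op && name.equals(((Op) o).name);
    }

    // The workaround from the JDK-8201131 report: an explicit readObject.
    private void readObject(ObjectInputStream in)
            throws ClassNotFoundException, IOException {
        in.defaultReadObject();
    }
}

public class RoundTrip {
    // Serialize and deserialize a map keyed by Op, as Pig does with map plans.
    static Map<Op, Integer> roundTrip(Map<Op, Integer> plan) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(plan);
        }
        try (ObjectInputStream ois =
                new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()))) {
            @SuppressWarnings("unchecked")
            Map<Op, Integer> copy = (Map<Op, Integer>) ois.readObject();
            return copy;
        }
    }

    public static void main(String[] args) throws Exception {
        Map<Op, Integer> plan = new HashMap<>();
        plan.put(new Op("load"), 1);
        System.out.println(roundTrip(plan).get(new Op("load")));
    }
}
```

On a JRE without the bug the round trip succeeds either way, so this sketch demonstrates the workaround pattern rather than reproducing the failure.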
[jira] [Updated] (PIG-5386) Pig local mode with bundled Hadoop broken
[ https://issues.apache.org/jira/browse/PIG-5386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar updated PIG-5386: --- Status: Patch Available (was: Open) > Pig local mode with bundled Hadoop broken > - > > Key: PIG-5386 > URL: https://issues.apache.org/jira/browse/PIG-5386 > Project: Pig > Issue Type: Bug >Affects Versions: 0.17.0 >Reporter: Nandor Kollar >Assignee: Nandor Kollar >Priority: Minor > Attachments: PIG-5386_1.patch > > > After compiling Pig, local mode doesn't work without installing hadoop > (expected to use the bundled hadoop), because commons-lang is not copied to > h2 folder (just commons-lang3, but bundled requires commons-lang too) > I think it was broken by PIG-5317 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (PIG-5386) Pig local mode with bundled Hadoop broken
[ https://issues.apache.org/jira/browse/PIG-5386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar updated PIG-5386: --- Attachment: PIG-5386_1.patch > Pig local mode with bundled Hadoop broken > - > > Key: PIG-5386 > URL: https://issues.apache.org/jira/browse/PIG-5386 > Project: Pig > Issue Type: Bug >Affects Versions: 0.17.0 >Reporter: Nandor Kollar >Assignee: Nandor Kollar >Priority: Minor > Attachments: PIG-5386_1.patch > > > After compiling Pig, local mode doesn't work without installing hadoop > (expected to use the bundled hadoop), because commons-lang is not copied to > h2 folder (just commons-lang3, but bundled requires commons-lang too) > I think it was broken by PIG-5317 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (PIG-5386) Pig local mode with bundled Hadoop broken
[ https://issues.apache.org/jira/browse/PIG-5386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar updated PIG-5386: --- Description: After compiling Pig, local mode doesn't work without installing hadoop (expected to use the bundled hadoop), because commons-lang is not copied to h2 folder (just commons-lang3, but bundled requires commons-lang too) I think it was broken by PIG-5317 was:After compiling Pig, local mode doesn't work without installing hadoop (expected to use the bundled hadoop), because commons-lang is not copied to h2 folder (just commons-lang3, but bundled requires commons-lang too) > Pig local mode with bundled Hadoop broken > - > > Key: PIG-5386 > URL: https://issues.apache.org/jira/browse/PIG-5386 > Project: Pig > Issue Type: Bug >Affects Versions: 0.17.0 >Reporter: Nandor Kollar >Assignee: Nandor Kollar >Priority: Minor > Attachments: PIG-5386_1.patch > > > After compiling Pig, local mode doesn't work without installing hadoop > (expected to use the bundled hadoop), because commons-lang is not copied to > h2 folder (just commons-lang3, but bundled requires commons-lang too) > I think it was broken by PIG-5317 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (PIG-5386) Pig local mode with bundled Hadoop broken
Nandor Kollar created PIG-5386: -- Summary: Pig local mode with bundled Hadoop broken Key: PIG-5386 URL: https://issues.apache.org/jira/browse/PIG-5386 Project: Pig Issue Type: Bug Affects Versions: 0.17.0 Reporter: Nandor Kollar Assignee: Nandor Kollar After compiling Pig, local mode doesn't work without installing hadoop (expected to use the bundled hadoop), because commons-lang is not copied to h2 folder (just commons-lang3, but bundled requires commons-lang too) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (PIG-5376) Upgrade Guava version
Nandor Kollar created PIG-5376: -- Summary: Upgrade Guava version Key: PIG-5376 URL: https://issues.apache.org/jira/browse/PIG-5376 Project: Pig Issue Type: Bug Reporter: Nandor Kollar Assignee: Nandor Kollar Pig uses Guava version 11.0, which on the one hand is very old, and on the other hand there are CVEs that affect this version (for example CVE-2018-10237) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PIG-5374) Use CircularFifoBuffer in InterRecordReader
[ https://issues.apache.org/jira/browse/PIG-5374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16736928#comment-16736928 ] Nandor Kollar commented on PIG-5374: +1 > Use CircularFifoBuffer in InterRecordReader > --- > > Key: PIG-5374 > URL: https://issues.apache.org/jira/browse/PIG-5374 > Project: Pig > Issue Type: Bug >Reporter: Adam Szita >Assignee: Adam Szita >Priority: Major > Attachments: PIG-5374.0.patch > > > We're currently using CircularFifoQueue in InterRecordReader, and it comes > from commons-collections4 dependency. Hadoop 2.8 installations do not have > this dependency by default, so for now we should switch to the older > CircularFifoBuffer instead (which comes from commons-collections and it's > present). > We should open a separate ticket for investigating what libraries should we > update. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PIG-5373) InterRecordReader might skip records if certain sync markers are used
[ https://issues.apache.org/jira/browse/PIG-5373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16733121#comment-16733121 ] Nandor Kollar commented on PIG-5373: +1 on PIG-5373.1.patch > InterRecordReader might skip records if certain sync markers are used > - > > Key: PIG-5373 > URL: https://issues.apache.org/jira/browse/PIG-5373 > Project: Pig > Issue Type: Bug >Reporter: Adam Szita >Assignee: Adam Szita >Priority: Major > Attachments: PIG-5373.0.patch, PIG-5373.1.patch > > > Due to a bug in InterRecordReader#skipUntilMarkerOrSplitEndOrEOF(), it can > happen that sync markers are not identified while reading the interim binary > file used to hold data between jobs. > In such files sync markers are placed upon writing, which later help during > reading the data. These are randomly generated, and it seems that in some > rare combinations of markers and data preceding them, they cannot be found. > This can result in reading through all the bytes (looking for the marker) and > reaching split end or EOF, and extracting no records at all. > This symptom is also observable from JobHistory stats, where a job that is > affected by this issue will have tasks that have HDFS_BYTES_READ or > FILE_BYTES_READ about equal to the number of bytes of the split, but at the same > time having MAP_INPUT_RECORDS=0 > One such (test) example is this: > {code:java} > marker: [-128, -128, 4] , data: [127, -1, 2, -128, -128, -128, 4, 1, 2, > 3]{code} > Due to a bug, such markers whose prefix overlaps with the last data chunk are > not seen by the reader. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
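The quoted (test) example can be checked with a small sketch (illustrative code, not Pig's actual InterRecordReader): a scan that reconsiders every start offset finds the marker even though its two-byte prefix also occurs in the data immediately before it, which is exactly the case where a reader that discards partially matched bytes resumes too far ahead and skips the marker.

```java
public class MarkerScan {
    // Return the index at which the sync marker starts in data, or -1.
    // Checking every start offset (rather than skipping past partially
    // matched bytes) is what handles markers whose prefix overlaps the
    // data that precedes them: a scanner that fails a partial match at
    // offset 3 here and resumes at offset 6 would never see the real
    // marker starting at offset 4.
    static int findMarker(byte[] data, byte[] marker) {
        outer:
        for (int i = 0; i + marker.length <= data.length; i++) {
            for (int j = 0; j < marker.length; j++) {
                if (data[i + j] != marker[j]) {
                    continue outer;
                }
            }
            return i;
        }
        return -1;
    }

    public static void main(String[] args) {
        // The (test) example from the report: the marker's prefix
        // [-128, -128] also appears just before the marker itself.
        byte[] marker = {-128, -128, 4};
        byte[] data = {127, -1, 2, -128, -128, -128, 4, 1, 2, 3};
        System.out.println(findMarker(data, marker)); // 4
    }
}
```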
[jira] [Commented] (PIG-5373) InterRecordReader might skip records if certain sync markers are used
[ https://issues.apache.org/jira/browse/PIG-5373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732007#comment-16732007 ] Nandor Kollar commented on PIG-5373: I have one observation about the patch: to be future-proof, instead of CircularFifoBuffer from commons-collections I think we should use CircularFifoQueue from commons-collections4. On one hand CircularFifoBuffer was removed from the latest commons-collections code; on the other hand CircularFifoQueue is generic, so we can eliminate iterating through Object items and casting to Integer. Be aware of one thing: the semantics of isFull have changed, as CircularFifoQueue is never full. The isFull call should be replaced with {{queue.size() == queue.maxSize()}}. > InterRecordReader might skip records if certain sync markers are used > - > > Key: PIG-5373 > URL: https://issues.apache.org/jira/browse/PIG-5373 > Project: Pig > Issue Type: Bug >Reporter: Adam Szita >Assignee: Adam Szita >Priority: Major > Attachments: PIG-5373.0.patch > > > Due to a bug in InterRecordReader#skipUntilMarkerOrSplitEndOrEOF(), it can > happen that sync markers are not identified while reading the interim binary > file used to hold data between jobs. > In such files sync markers are placed upon writing, which later help during > reading the data. These are randomly generated, and it seems that in some > rare combinations of markers and data preceding them, they cannot be found. > This can result in reading through all the bytes (looking for the marker) and > reaching split end or EOF, and extracting no records at all. 
> This symptom is also observable from JobHistory stats, where a job that is > affected by this issue will have tasks that have HDFS_BYTES_READ or > FILE_BYTES_READ about equal to the number of bytes of the split, but at the same > time having MAP_INPUT_RECORDS=0 > One such (test) example is this: > {code:java} > marker: [-128, -128, 4] , data: [127, -1, 2, -128, -128, -128, 4, 1, 2, > 3]{code} > Due to a bug, such markers whose prefix overlaps with the last data chunk are > not seen by the reader. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
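The suggested fullness check can be illustrated with a stdlib sketch. BoundedFifo below is a hypothetical stand-in for commons-collections4's CircularFifoQueue (which evicts the oldest element on overflow, so its own isFull() never reports true); fullness is detected by comparing size() against the fixed capacity, mirroring the suggested {{queue.size() == queue.maxSize()}} check:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for commons-collections4's CircularFifoQueue: adding
// to a full buffer evicts the oldest element, so isFull()-style checks must
// instead compare size() against the fixed capacity.
class BoundedFifo<E> {
    private final Deque<E> deque = new ArrayDeque<>();
    private final int maxSize;

    BoundedFifo(int maxSize) { this.maxSize = maxSize; }

    int maxSize() { return maxSize; }
    int size() { return deque.size(); }
    E peek() { return deque.peekFirst(); }

    void add(E e) {
        if (deque.size() == maxSize) {
            deque.removeFirst(); // evict oldest, CircularFifoQueue-style
        }
        deque.addLast(e);
    }

    // The suggested replacement for the old CircularFifoBuffer.isFull() call.
    boolean atCapacity() { return size() == maxSize(); }
}

public class FifoDemo {
    public static void main(String[] args) {
        BoundedFifo<Integer> q = new BoundedFifo<>(3);
        q.add(1);
        q.add(2);
        System.out.println(q.atCapacity()); // false
        q.add(3);
        System.out.println(q.atCapacity()); // true
        q.add(4); // evicts 1
        System.out.println(q.peek());       // 2
    }
}
```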
[jira] [Updated] (PIG-5317) Upgrade old dependencies: commons-lang, hsqldb, commons-logging
[ https://issues.apache.org/jira/browse/PIG-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar updated PIG-5317: --- Resolution: Fixed Status: Resolved (was: Patch Available) > Upgrade old dependencies: commons-lang, hsqldb, commons-logging > --- > > Key: PIG-5317 > URL: https://issues.apache.org/jira/browse/PIG-5317 > Project: Pig > Issue Type: Improvement >Reporter: Nandor Kollar >Assignee: Nandor Kollar >Priority: Minor > Fix For: 0.18.0 > > Attachments: PIG-5317_1.patch, PIG-5317_2.patch, > PIG-5317_amend.patch, PIG-5317_without_new_dep.patch, > PIG-5317_without_new_dep_2.patch > > > Pig depends on old version of commons-lang, hsqldb and commons-logging. It > would be nice to upgrade the version of these dependencies, for commons-lang > Pig should depend on commons-lang3 instead (which is already present in the > ivy.xml) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PIG-5317) Upgrade old dependencies: commons-lang, hsqldb, commons-logging
[ https://issues.apache.org/jira/browse/PIG-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16650021#comment-16650021 ] Nandor Kollar commented on PIG-5317: Thanks Rohini, Koji and Satish! I'm marking this Jira as resolved; in case there are still test failures due to this one, feel free to reopen it. > Upgrade old dependencies: commons-lang, hsqldb, commons-logging > --- > > Key: PIG-5317 > URL: https://issues.apache.org/jira/browse/PIG-5317 > Project: Pig > Issue Type: Improvement >Reporter: Nandor Kollar >Assignee: Nandor Kollar >Priority: Minor > Fix For: 0.18.0 > > Attachments: PIG-5317_1.patch, PIG-5317_2.patch, > PIG-5317_amend.patch, PIG-5317_without_new_dep.patch, > PIG-5317_without_new_dep_2.patch > > > Pig depends on old version of commons-lang, hsqldb and commons-logging. It > would be nice to upgrade the version of these dependencies, for commons-lang > Pig should depend on commons-lang3 instead (which is already present in the > ivy.xml) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PIG-5317) Upgrade old dependencies: commons-lang, hsqldb, commons-logging
[ https://issues.apache.org/jira/browse/PIG-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621847#comment-16621847 ] Nandor Kollar commented on PIG-5317: Thanks [~satishsaley], I think the one line change in the build.xml in PIG-5317_without_new_dep_2.patch should fix this problem. Since I can't easily create a cluster with Tez installed, could you please test on your end and tell me if it indeed fixes this failure? > Upgrade old dependencies: commons-lang, hsqldb, commons-logging > --- > > Key: PIG-5317 > URL: https://issues.apache.org/jira/browse/PIG-5317 > Project: Pig > Issue Type: Improvement >Reporter: Nandor Kollar >Assignee: Nandor Kollar >Priority: Minor > Fix For: 0.18.0 > > Attachments: PIG-5317_1.patch, PIG-5317_2.patch, > PIG-5317_amend.patch, PIG-5317_without_new_dep.patch, > PIG-5317_without_new_dep_2.patch > > > Pig depends on old version of commons-lang, hsqldb and commons-logging. It > would be nice to upgrade the version of these dependencies, for commons-lang > Pig should depend on commons-lang3 instead (which is already present in the > ivy.xml) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (PIG-5317) Upgrade old dependencies: commons-lang, hsqldb, commons-logging
[ https://issues.apache.org/jira/browse/PIG-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar updated PIG-5317: --- Attachment: PIG-5317_without_new_dep_2.patch > Upgrade old dependencies: commons-lang, hsqldb, commons-logging > --- > > Key: PIG-5317 > URL: https://issues.apache.org/jira/browse/PIG-5317 > Project: Pig > Issue Type: Improvement >Reporter: Nandor Kollar >Assignee: Nandor Kollar >Priority: Minor > Fix For: 0.18.0 > > Attachments: PIG-5317_1.patch, PIG-5317_2.patch, > PIG-5317_amend.patch, PIG-5317_without_new_dep.patch, > PIG-5317_without_new_dep_2.patch > > > Pig depends on old version of commons-lang, hsqldb and commons-logging. It > would be nice to upgrade the version of these dependencies, for commons-lang > Pig should depend on commons-lang3 instead (which is already present in the > ivy.xml) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PIG-5358) Remove hive-contrib jar from lib directory
[ https://issues.apache.org/jira/browse/PIG-5358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614713#comment-16614713 ] Nandor Kollar commented on PIG-5358: +1 > Remove hive-contrib jar from lib directory > -- > > Key: PIG-5358 > URL: https://issues.apache.org/jira/browse/PIG-5358 > Project: Pig > Issue Type: Improvement > Components: build >Reporter: Adam Szita >Assignee: Adam Szita >Priority: Minor > Attachments: PIG-5358.0.patch > > > As per HIVE-20020 hive-contrib jar is moved out of under Hive's lib. We > 'export' some of our Hive dependencies into our lib folder too, and that > includes hive-contrib.jar so in order to be synced with Hive we should remove > it too. > We don't depend on this jar runtime so there's no use of it being in Pig's > lib dir anyway. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PIG-5191) Pig HBase 2.0.0 support
[ https://issues.apache.org/jira/browse/PIG-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596467#comment-16596467 ] Nandor Kollar commented on PIG-5191: Thanks Adam, Rohini and Daniel for your reviews! > Pig HBase 2.0.0 support > --- > > Key: PIG-5191 > URL: https://issues.apache.org/jira/browse/PIG-5191 > Project: Pig > Issue Type: Improvement >Reporter: Nandor Kollar >Assignee: Nandor Kollar >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5191_1.patch, PIG-5191_2.patch > > > Pig doesn't support HBase 2.0.0. Since the new HBase API introduces several > API changes, we should find a way to support both 1.x and 2.x HBase API. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (PIG-5351) How to execute pig script from java
[ https://issues.apache.org/jira/browse/PIG-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar resolved PIG-5351. Resolution: Not A Problem > How to execute pig script from java > --- > > Key: PIG-5351 > URL: https://issues.apache.org/jira/browse/PIG-5351 > Project: Pig > Issue Type: Wish >Reporter: Atul Raut >Priority: Minor > > How to execute pig script from java class. > I need to submit Pig script from my java application to the Pig server (This > may be on any remote location) and that Pig server will execute that script > and return the result to my java application. > > Following is my source, > public static void main(String[] args) throws Exception { > System.setProperty("hadoop.home.dir", > "/Pig/hadoop-common-2.2.0-bin-master/"); > > Properties props = new Properties(); > props.setProperty("fs.default.name", "hdfs://192.168.102.179:8020"); > props.setProperty("mapred.job.tracker", "192.168.102.179:8021"); > props.setProperty("pig.use.overriden.hadoop.configs", "true"); > PigServer pig = new PigServer(ExecType.MAPREDUCE, props); > pig.debugOn(); > pig.registerScript("A = LOAD '/apps/employee/sample.txt' USING > PigStorage();"); > } > > Thank you in advanced for your support. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PIG-5351) How to execute pig script from java
[ https://issues.apache.org/jira/browse/PIG-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559673#comment-16559673 ] Nandor Kollar commented on PIG-5351: [~atul.raut] could you please use the users (u...@pig.apache.org) mailing list to ask questions about Pig? Jira should be used only to raise bug reports/real feature requests for Pig. As for your question, you interact with Pig via its CLI; it is not a service (like Hive), and as far as I know there's no way to run a Pig "server" remotely, as it doesn't expose any remote interface where you can submit scripts against it. The code you mention starts Pig locally, and interacts with remote HDFS and MapReduce. > How to execute pig script from java > --- > > Key: PIG-5351 > URL: https://issues.apache.org/jira/browse/PIG-5351 > Project: Pig > Issue Type: Wish >Reporter: Atul Raut >Priority: Minor > > How to execute pig script from java class. > I need to submit Pig script from my java application to the Pig server (This > may be on any remote location) and that Pig server will execute that script > and return the result to my java application. > > Following is my source, > public static void main(String[] args) throws Exception { > System.setProperty("hadoop.home.dir", > "/Pig/hadoop-common-2.2.0-bin-master/"); > > Properties props = new Properties(); > props.setProperty("fs.default.name", "hdfs://192.168.102.179:8020"); > props.setProperty("mapred.job.tracker", "192.168.102.179:8021"); > props.setProperty("pig.use.overriden.hadoop.configs", "true"); > PigServer pig = new PigServer(ExecType.MAPREDUCE, props); > pig.debugOn(); > pig.registerScript("A = LOAD '/apps/employee/sample.txt' USING > PigStorage();"); > } > > Thank you in advanced for your support. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PIG-5348) Tests that need a MiniCluster fail
[ https://issues.apache.org/jira/browse/PIG-5348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538735#comment-16538735 ] Nandor Kollar commented on PIG-5348: [~nielsbasjes] are the ORC related tests the only ones that failed? Those failures are caused by PIG-5317, and there's a patch which is supposed to fix it and is yet to be reviewed and committed (if you're familiar with ORC then your feedback is really appreciated, I'm not really familiar with it :) ). Could you please check if it indeed makes those tests green again? > Tests that need a MiniCluster fail > -- > > Key: PIG-5348 > URL: https://issues.apache.org/jira/browse/PIG-5348 > Project: Pig > Issue Type: Bug >Reporter: Niels Basjes >Priority: Major > > While working on PIG-2599 and PIG-5343 I found that some tests always fail. > A good example is org.apache.pig.builtin.TestOrcStoragePushdown > The first rough assessment is that the tests that need a Hadoop MiniCluster to run > are the ones that fail. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PIG-5191) Pig HBase 2.0.0 support
[ https://issues.apache.org/jira/browse/PIG-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538190#comment-16538190 ] Nandor Kollar commented on PIG-5191: Any feedback on the second patch? > Pig HBase 2.0.0 support > --- > > Key: PIG-5191 > URL: https://issues.apache.org/jira/browse/PIG-5191 > Project: Pig > Issue Type: Improvement >Reporter: Nandor Kollar >Assignee: Nandor Kollar >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5191_1.patch, PIG-5191_2.patch > > > Pig doesn't support HBase 2.0.0. Since the new HBase API introduces several > API changes, we should find a way to support both 1.x and 2.x HBase API. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PIG-5317) Upgrade old dependencies: commons-lang, hsqldb, commons-logging
[ https://issues.apache.org/jira/browse/PIG-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538189#comment-16538189 ] Nandor Kollar commented on PIG-5317: [~rohini] could you please have a look at PIG-5317_without_new_dep.patch? Hopefully it fixes the failing test cases; it would be nice to make everything green again. > Upgrade old dependencies: commons-lang, hsqldb, commons-logging > --- > > Key: PIG-5317 > URL: https://issues.apache.org/jira/browse/PIG-5317 > Project: Pig > Issue Type: Improvement >Reporter: Nandor Kollar >Assignee: Nandor Kollar >Priority: Minor > Fix For: 0.18.0 > > Attachments: PIG-5317_1.patch, PIG-5317_2.patch, > PIG-5317_amend.patch, PIG-5317_without_new_dep.patch > > > Pig depends on old version of commons-lang, hsqldb and commons-logging. It > would be nice to upgrade the version of these dependencies, for commons-lang > Pig should depend on commons-lang3 instead (which is already present in the > ivy.xml) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PIG-5344) Update Apache HTTPD LogParser to latest version
[ https://issues.apache.org/jira/browse/PIG-5344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527648#comment-16527648 ] Nandor Kollar commented on PIG-5344: Looks good to me! > Update Apache HTTPD LogParser to latest version > --- > > Key: PIG-5344 > URL: https://issues.apache.org/jira/browse/PIG-5344 > Project: Pig > Issue Type: Improvement >Affects Versions: 0.18.0 >Reporter: Niels Basjes >Assignee: Niels Basjes >Priority: Major > Attachments: PIG-5344-1.patch > > > Similar to PIG-4717 this is to simply upgrade the > [logparser|https://github.com/nielsbasjes/logparser] library. > I had to postpone this for a while because the latest version requires Java 8. > I will simply update the version of the library. > The new features are supported transparently. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (PIG-4822) Spark UT failures after merge from trunk
[ https://issues.apache.org/jira/browse/PIG-4822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar resolved PIG-4822. Resolution: Fixed > Spark UT failures after merge from trunk > > > Key: PIG-4822 > URL: https://issues.apache.org/jira/browse/PIG-4822 > Project: Pig > Issue Type: Bug > Components: spark >Reporter: Pallavi Rao >Priority: Major > Labels: spork > Fix For: spark-branch > > > New failures: > [junit] Tests run: 26, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: > 106.46 sec > [junit] Test org.apache.pig.test.TestEvalPipeline FAILED > -- > [junit] Tests run: 14, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: > 55.261 sec > [junit] Test org.apache.pig.test.TestMultiQuery FAILED > -- > [junit] Tests run: 31, Failures: 1, Errors: 0, Skipped: 1, Time elapsed: > 76.905 sec > [junit] Test org.apache.pig.test.TestPigRunner FAILED > -- > [junit] Tests run: 7, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: > 43.832 sec > [junit] Test org.apache.pig.test.TestPigServerLocal FAILED > -- > [junit] Tests run: 71, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: > 73.983 sec > [junit] Test org.apache.pig.test.TestPruneColumn FAILED > -- > [junit] Tests run: 31, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: > 7.692 sec > [junit] Test org.apache.pig.test.TestSchema FAILED > -- > [junit] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0 > sec > [junit] Test org.apache.pig.test.TestScriptLanguage FAILED > -- > [junit] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0 > sec > [junit] Test org.apache.pig.test.TestScriptLanguageJavaScript FAILED > -- > [junit] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0 > sec > [junit] Test org.apache.pig.test.TestScriptUDF FAILED > -- > [junit] Tests run: 0, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: > 6.21 sec > [junit] Test org.apache.pig.test.TestSkewedJoin FAILED > -- > [junit] Tests run: 0, Failures: 0, Errors: 2, Skipped: 0, Time 
elapsed: > 6.399 sec > [junit] Test org.apache.pig.test.TestSplitStore FAILED > -- > [junit] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0 > sec > [junit] Test org.apache.pig.test.TestStore FAILED > -- > [junit] Tests run: 0, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: > 5.11 sec > [junit] Test org.apache.pig.test.TestStoreInstances FAILED > -- > [junit] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0 > sec > [junit] Test org.apache.pig.test.TestStoreOld FAILED > -- > [junit] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0 > sec > [junit] Test org.apache.pig.test.TestStreaming FAILED > -- > [junit] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0 > sec > [junit] Test org.apache.pig.test.TestToolsPigServer FAILED > -- > [junit] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0 > sec > [junit] Test org.apache.pig.test.TestUDF FAILED -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (PIG-5122) data
[ https://issues.apache.org/jira/browse/PIG-5122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar resolved PIG-5122. Resolution: Not A Problem > data > > > Key: PIG-5122 > URL: https://issues.apache.org/jira/browse/PIG-5122 > Project: Pig > Issue Type: Bug > Components: data >Affects Versions: 0.16.0 >Reporter: muhammad hamdani >Priority: Major > Fix For: site > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (PIG-5162) Fix failing e2e tests with spark exec type
[ https://issues.apache.org/jira/browse/PIG-5162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar resolved PIG-5162. Resolution: Fixed Fix Version/s: 0.17.0 > Fix failing e2e tests with spark exec type > -- > > Key: PIG-5162 > URL: https://issues.apache.org/jira/browse/PIG-5162 > Project: Pig > Issue Type: Bug > Components: spark >Reporter: Nandor Kollar >Priority: Major > Fix For: spark-branch, 0.17.0 > > > Tests were executed on spark branch in spark mode, old Pig was also Pig on > spark branch (same), but executed in mapreduce mode -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PIG-4632) UT TestSplitCombine.test11 failed with unexpected end of schema when Parquet is 1.6.0+
[ https://issues.apache.org/jira/browse/PIG-4632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16515483#comment-16515483 ] Nandor Kollar commented on PIG-4632: This issue was fixed on trunk as part of PIG-4092, however it was not backported to the 0.15 branch. > UT TestSplitCombine.test11 failed with unexpected end of schema when Parquet > is 1.6.0+ > -- > > Key: PIG-4632 > URL: https://issues.apache.org/jira/browse/PIG-4632 > Project: Pig > Issue Type: Bug >Affects Versions: 0.15.0 >Reporter: Xiang Li >Assignee: Xiang Li >Priority: Minor > > unexpected end of schema > java.lang.IllegalArgumentException: unexpected end of schema > at > parquet.schema.MessageTypeParser$Tokenizer.nextToken(MessageTypeParser.java:62) > at parquet.schema.MessageTypeParser.parse(MessageTypeParser.java:89) > at > parquet.schema.MessageTypeParser.parseMessageType(MessageTypeParser.java:82) > at parquet.hadoop.ParquetInputSplit.end(ParquetInputSplit.java:96) > at parquet.hadoop.ParquetInputSplit.(ParquetInputSplit.java:92) > at > org.apache.pig.test.TestSplitCombine.test11(TestSplitCombine.java:528) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PIG-5191) Pig HBase 2.0.0 support
[ https://issues.apache.org/jira/browse/PIG-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16512108#comment-16512108 ] Nandor Kollar commented on PIG-5191: [~szita], [~daijy], [~rohini] do you think patch #2 looks good and is ready to be committed? > Pig HBase 2.0.0 support > --- > > Key: PIG-5191 > URL: https://issues.apache.org/jira/browse/PIG-5191 > Project: Pig > Issue Type: Improvement >Reporter: Nandor Kollar >Assignee: Nandor Kollar >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5191_1.patch, PIG-5191_2.patch > > > Pig doesn't support HBase 2.0.0. Since the new HBase API introduces several > API changes, we should find a way to support both 1.x and 2.x HBase API. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PIG-5191) Pig HBase 2.0.0 support
[ https://issues.apache.org/jira/browse/PIG-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16511189#comment-16511189 ] Nandor Kollar commented on PIG-5191: Updated the patch with the latest HBase dependency (2.0.0 is released). Apart from additional dependencies, one minor change was required in the HBase related test cases: setting the {{hbase.localcluster.assign.random.ports}} property (added a comment to the source file). Tested downstream; it passed without the need to modify the bin/pig script. > Pig HBase 2.0.0 support > --- > > Key: PIG-5191 > URL: https://issues.apache.org/jira/browse/PIG-5191 > Project: Pig > Issue Type: Improvement >Reporter: Nandor Kollar >Assignee: Nandor Kollar >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5191_1.patch, PIG-5191_2.patch > > > Pig doesn't support HBase 2.0.0. Since the new HBase API introduces several > API changes, we should find a way to support both 1.x and 2.x HBase API. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (PIG-5191) Pig HBase 2.0.0 support
[ https://issues.apache.org/jira/browse/PIG-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar updated PIG-5191: --- Attachment: PIG-5191_2.patch > Pig HBase 2.0.0 support > --- > > Key: PIG-5191 > URL: https://issues.apache.org/jira/browse/PIG-5191 > Project: Pig > Issue Type: Improvement >Reporter: Nandor Kollar >Assignee: Nandor Kollar >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5191_1.patch, PIG-5191_2.patch > > > Pig doesn't support HBase 2.0.0. Since the new HBase API introduces several > API changes, we should find a way to support both 1.x and 2.x HBase API. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (PIG-5191) Pig HBase 2.0.0 support
[ https://issues.apache.org/jira/browse/PIG-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar updated PIG-5191: --- Attachment: (was: PIG-5191_2.patch) > Pig HBase 2.0.0 support > --- > > Key: PIG-5191 > URL: https://issues.apache.org/jira/browse/PIG-5191 > Project: Pig > Issue Type: Improvement >Reporter: Nandor Kollar >Assignee: Nandor Kollar >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5191_1.patch > > > Pig doesn't support HBase 2.0.0. Since the new HBase API introduces several > API changes, we should find a way to support both 1.x and 2.x HBase API. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (PIG-5191) Pig HBase 2.0.0 support
[ https://issues.apache.org/jira/browse/PIG-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar updated PIG-5191: --- Attachment: PIG-5191_2.patch
[jira] [Commented] (PIG-4092) Predicate pushdown for Parquet
[ https://issues.apache.org/jira/browse/PIG-4092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16431929#comment-16431929 ] Nandor Kollar commented on PIG-4092: [~rohini] I upgraded Parquet to the latest version, and the constructor of ParquetInputSplit now [parses the schema | https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetInputSplit.java#L95], so an invalid schema is no longer allowed. I couldn't find the old code, but the constructor previously just set the private schema field; it didn't parse the given string. I think in this test case we can use a simple dummy schema, since the schema is not important for the test. > Predicate pushdown for Parquet > -- > > Key: PIG-4092 > URL: https://issues.apache.org/jira/browse/PIG-4092 > Project: Pig > Issue Type: Sub-task >Reporter: Rohini Palaniswamy >Assignee: Nandor Kollar >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-4092_1.patch, PIG-4092_2.patch > > > See: > https://github.com/apache/incubator-parquet-mr/pull/4 > and: > https://github.com/apache/incubator-parquet-mr/blob/master/parquet-column/src/main/java/parquet/filter2/predicate/FilterApi.java > [~alexlevenson] is the main author of this API -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (PIG-4092) Predicate pushdown for Parquet
[ https://issues.apache.org/jira/browse/PIG-4092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar updated PIG-4092: --- Status: Patch Available (was: Reopened)
[jira] [Updated] (PIG-4092) Predicate pushdown for Parquet
[ https://issues.apache.org/jira/browse/PIG-4092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar updated PIG-4092: --- Attachment: PIG-4092_2.patch
[jira] [Updated] (PIG-4092) Predicate pushdown for Parquet
[ https://issues.apache.org/jira/browse/PIG-4092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar updated PIG-4092: --- Attachment: PIG-4092_1.patch
[jira] [Updated] (PIG-4092) Predicate pushdown for Parquet
[ https://issues.apache.org/jira/browse/PIG-4092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar updated PIG-4092: --- Attachment: (was: PIG-4092_1.patch)
[jira] [Commented] (PIG-4092) Predicate pushdown for Parquet
[ https://issues.apache.org/jira/browse/PIG-4092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425920#comment-16425920 ] Nandor Kollar commented on PIG-4092: I'm not sure if anyone uses this wrapper instead of the loader already present in Parquet, but I implemented the missing predicate-pushdown interface and the delegation to the Parquet loader. I also updated the parquet-pig-bundle version to the latest one.
[jira] [Updated] (PIG-4092) Predicate pushdown for Parquet
[ https://issues.apache.org/jira/browse/PIG-4092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar updated PIG-4092: --- Attachment: PIG-4092_1.patch
[jira] [Commented] (PIG-5253) Pig Hadoop 3 support
[ https://issues.apache.org/jira/browse/PIG-5253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16340799#comment-16340799 ] Nandor Kollar commented on PIG-5253: bq. Can't the hadoop 2 compiled pig.jar be directly used against the Hadoop 3 cluster? I think we can. I'll update my patch so that it doesn't compile against both Hadoop 3 and Hadoop 2. bq. hadoop-site.xml was a Hadoop 1.x and pre-YARN thing Ah ok, I see. I'll get rid of hadoop-site.xml in MiniCluster.java then. I also noticed that HExecutionEngine has a reference to hadoop-site.xml as well; should I delete that reference too? I'll upload an updated patch soon. > Pig Hadoop 3 support > > > Key: PIG-5253 > URL: https://issues.apache.org/jira/browse/PIG-5253 > Project: Pig > Issue Type: Improvement >Reporter: Nandor Kollar >Assignee: Nandor Kollar >Priority: Major > Fix For: 0.18.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (PIG-5317) Upgrade old dependencies: commons-lang, hsqldb, commons-logging
[ https://issues.apache.org/jira/browse/PIG-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar updated PIG-5317: --- Status: Patch Available (was: Reopened) > Upgrade old dependencies: commons-lang, hsqldb, commons-logging > --- > > Key: PIG-5317 > URL: https://issues.apache.org/jira/browse/PIG-5317 > Project: Pig > Issue Type: Improvement >Reporter: Nandor Kollar >Assignee: Nandor Kollar >Priority: Minor > Fix For: 0.18.0 > > Attachments: PIG-5317_1.patch, PIG-5317_2.patch, > PIG-5317_amend.patch, PIG-5317_without_new_dep.patch > > > Pig depends on old version of commons-lang, hsqldb and commons-logging. It > would be nice to upgrade the version of these dependencies, for commons-lang > Pig should depend on commons-lang3 instead (which is already present in the > ivy.xml) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work started] (PIG-5253) Pig Hadoop 3 support
[ https://issues.apache.org/jira/browse/PIG-5253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on PIG-5253 started by Nandor Kollar.
[jira] [Assigned] (PIG-5253) Pig Hadoop 3 support
[ https://issues.apache.org/jira/browse/PIG-5253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar reassigned PIG-5253: -- Assignee: Nandor Kollar (was: Adam Szita)
[jira] [Commented] (PIG-5317) Upgrade old dependencies: commons-lang, hsqldb, commons-logging
[ https://issues.apache.org/jira/browse/PIG-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16313277#comment-16313277 ] Nandor Kollar commented on PIG-5317: Attached PIG-5317_without_new_dep.patch: no new dependencies, but changed thresholds for the failing cases. [~rohini] could you please help with a review? The new numbers are empirically deduced; I'm not sure whether they are "correct", as I'm not really familiar with ORC predicate pushdown.
[jira] [Updated] (PIG-5317) Upgrade old dependencies: commons-lang, hsqldb, commons-logging
[ https://issues.apache.org/jira/browse/PIG-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar updated PIG-5317: --- Attachment: PIG-5317_without_new_dep.patch
[jira] [Commented] (PIG-5317) Upgrade old dependencies: commons-lang, hsqldb, commons-logging
[ https://issues.apache.org/jira/browse/PIG-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16311379#comment-16311379 ] Nandor Kollar commented on PIG-5317: I took a look at the changes in RandomStringUtils, and I think this change is related to LANG-1286. It looks like this overflow fix changed RandomStringUtils' API too, though I don't know why this affects us.
[jira] [Commented] (PIG-5317) Upgrade old dependencies: commons-lang, hsqldb, commons-logging
[ https://issues.apache.org/jira/browse/PIG-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16309873#comment-16309873 ] Nandor Kollar commented on PIG-5317: Ouch, it looks like TestOrcStoragePushdown fails because of this patch. Attached PIG-5317_amend.patch. The documentation says that RandomStringUtils is deprecated and RandomStringGenerator should be used instead. It looks like the behavior of this deprecated class changed in commons-lang3? [~rohini] could you please have a look at PIG-5317_amend.patch?
[jira] [Updated] (PIG-5317) Upgrade old dependencies: commons-lang, hsqldb, commons-logging
[ https://issues.apache.org/jira/browse/PIG-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar updated PIG-5317: --- Attachment: PIG-5317_amend.patch
[jira] [Reopened] (PIG-5317) Upgrade old dependencies: commons-lang, hsqldb, commons-logging
[ https://issues.apache.org/jira/browse/PIG-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar reopened PIG-5317:
[jira] [Commented] (PIG-5317) Upgrade old dependencies: commons-lang, hsqldb, commons-logging
[ https://issues.apache.org/jira/browse/PIG-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308353#comment-16308353 ] Nandor Kollar commented on PIG-5317: [~rohini] what do you think about PIG-5317_2.patch? Do you see any issue that could be caused by upgrading these dependencies?
[jira] [Updated] (PIG-5320) TestCubeOperator#testRollupBasic is flaky on Spark 2.2
[ https://issues.apache.org/jira/browse/PIG-5320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar updated PIG-5320: --- Attachment: PIG-5320_2.patch > TestCubeOperator#testRollupBasic is flaky on Spark 2.2 > -- > > Key: PIG-5320 > URL: https://issues.apache.org/jira/browse/PIG-5320 > Project: Pig > Issue Type: Bug > Components: spark >Reporter: Nandor Kollar >Assignee: Nandor Kollar > Attachments: PIG-5320_1.patch, PIG-5320_2.patch > > > TestCubeOperator#testRollupBasic occasionally fails with > {code} > org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1002: Unable to > store alias c > at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1779) > at org.apache.pig.PigServer.registerQuery(PigServer.java:708) > at > org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:1110) > at > org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:512) > at > org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230) > at org.apache.pig.PigServer.registerScript(PigServer.java:781) > at org.apache.pig.PigServer.registerScript(PigServer.java:858) > at org.apache.pig.PigServer.registerScript(PigServer.java:821) > at org.apache.pig.test.Util.registerMultiLineQuery(Util.java:972) > at > org.apache.pig.test.TestCubeOperator.testRollupBasic(TestCubeOperator.java:124) > Caused by: org.apache.pig.impl.plan.VisitorException: ERROR 0: fail to get > the rdds of this spark operator: > at > org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.visitSparkOp(JobGraphBuilder.java:115) > at > org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkOperator.visit(SparkOperator.java:140) > at > org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkOperator.visit(SparkOperator.java:37) > at > org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:87) > at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:46) > at > 
org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.launchPig(SparkLauncher.java:237) > at > org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:293) > at org.apache.pig.PigServer.launchPlan(PigServer.java:1475) > at > org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1460) > at org.apache.pig.PigServer.execute(PigServer.java:1449) > at org.apache.pig.PigServer.access$500(PigServer.java:119) > at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1774) > Caused by: java.lang.RuntimeException: Unexpected job execution status RUNNING > at > org.apache.pig.tools.pigstats.spark.SparkStatsUtil.isJobSuccess(SparkStatsUtil.java:138) > at > org.apache.pig.tools.pigstats.spark.SparkPigStats.addJobStats(SparkPigStats.java:75) > at > org.apache.pig.tools.pigstats.spark.SparkStatsUtil.waitForJobAddStats(SparkStatsUtil.java:59) > at > org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.sparkOperToRDD(JobGraphBuilder.java:225) > at > org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.visitSparkOp(JobGraphBuilder.java:112) > {code} > I think the problem is that in JobStatisticCollector#waitForJobToEnd > {{sparkListener.wait()}} is not inside a loop, like suggested in wait's > javadoc: > {code} > * As in the one argument version, interrupts and spurious wakeups are > * possible, and this method should always be used in a loop: > {code} > Thus due to a spurious wakeup, the wait might pass without a notify getting > called. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
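The pattern the quoted javadoc recommends (re-checking the condition in a loop around {{wait()}}) can be sketched as follows. This is a minimal illustration, not Pig's actual code: the class and method names here are hypothetical, loosely modeled on JobStatisticCollector.

```java
import java.util.HashSet;
import java.util.Set;

// Sketch of guarding wait() against spurious wakeups: the waiting
// thread re-checks its condition in a loop, as Object.wait()'s
// javadoc recommends.
class JobEndListener {
    private final Set<Integer> finishedJobIds = new HashSet<>();

    // Called by the listener thread when Spark reports a job as ended.
    public synchronized void onJobEnd(int jobId) {
        finishedJobIds.add(jobId);
        notifyAll(); // wake every thread waiting on this monitor
    }

    // Blocks until onJobEnd(jobId) has been observed. A bare wait()
    // here could return spuriously and let the caller proceed while
    // the job is still RUNNING, which is exactly the flaky symptom.
    public synchronized void waitForJobToEnd(int jobId) throws InterruptedException {
        while (!finishedJobIds.contains(jobId)) {
            wait();
        }
    }
}
```

Note that the loop also makes the code safe when the notification arrives before the waiter even enters {{waitForJobToEnd}}: the condition is checked first, so no wakeup is lost.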
[jira] [Commented] (PIG-5320) TestCubeOperator#testRollupBasic is flaky on Spark 2.2
[ https://issues.apache.org/jira/browse/PIG-5320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16289136#comment-16289136 ] Nandor Kollar commented on PIG-5320: I think this is a problem with Spark 1.6.x too; checking for the condition in a loop should solve the problem. I also changed the map and set implementations to sorted ones: since we use integer job ids, I hope this slightly improves performance when there are many jobs. [~kellyzly], [~szita] could you please have a look at my patch? My only concern is: is SparkListener#onJobEnd() called when the job fails? If not, Pig would get stuck in an infinite loop.
[jira] [Updated] (PIG-5320) TestCubeOperator#testRollupBasic is flaky on Spark 2.2
[ https://issues.apache.org/jira/browse/PIG-5320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar updated PIG-5320: --- Attachment: PIG-5320_1.patch
[jira] [Updated] (PIG-5320) TestCubeOperator#testRollupBasic is flaky on Spark 2.2
[ https://issues.apache.org/jira/browse/PIG-5320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar updated PIG-5320: --- Attachment: (was: PIG-5320_1.patch)
[jira] [Updated] (PIG-5320) TestCubeOperator#testRollupBasic is flaky on Spark 2.2
[ https://issues.apache.org/jira/browse/PIG-5320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar updated PIG-5320: --- Status: Patch Available (was: Open)
[jira] [Updated] (PIG-5320) TestCubeOperator#testRollupBasic is flaky on Spark 2.2
[ https://issues.apache.org/jira/browse/PIG-5320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar updated PIG-5320: --- Attachment: PIG-5320_1.patch
[jira] [Assigned] (PIG-5320) TestCubeOperator#testRollupBasic is flaky on Spark 2.2
[ https://issues.apache.org/jira/browse/PIG-5320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar reassigned PIG-5320: -- Assignee: Nandor Kollar
[jira] [Created] (PIG-5320) TestCubeOperator#testRollupBasic is flaky on Spark 2.2
Nandor Kollar created PIG-5320: -- Summary: TestCubeOperator#testRollupBasic is flaky on Spark 2.2 Key: PIG-5320 URL: https://issues.apache.org/jira/browse/PIG-5320 Project: Pig Issue Type: Bug Components: spark Reporter: Nandor Kollar (Description as quoted in the first notification above.)
[jira] [Commented] (PIG-5317) Upgrade old dependencies: commons-lang, hsqldb, commons-logging
[ https://issues.apache.org/jira/browse/PIG-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16285800#comment-16285800 ] Nandor Kollar commented on PIG-5317: [~rohini] the database path is the same; it didn't change: {code} dbServer.setDatabasePath(0, "file:" + TMP_DIR + "batchtest;"+ "hsqldb.default_table_type=cached;hsqldb.cache_rows=100;sql.enforce_strict_size=true"); {code} I added the caching to this path now in PIG-5317_2.patch. > Upgrade old dependencies: commons-lang, hsqldb, commons-logging > --- > > Key: PIG-5317 > URL: https://issues.apache.org/jira/browse/PIG-5317 > Project: Pig > Issue Type: Improvement >Reporter: Nandor Kollar >Assignee: Nandor Kollar >Priority: Minor > Attachments: PIG-5317_1.patch, PIG-5317_2.patch > > > Pig depends on old versions of commons-lang, hsqldb and commons-logging. It > would be nice to upgrade these dependencies; for commons-lang, > Pig should depend on commons-lang3 instead (which is already present in the > ivy.xml) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (PIG-5317) Upgrade old dependencies: commons-lang, hsqldb, commons-logging
[ https://issues.apache.org/jira/browse/PIG-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar updated PIG-5317: --- Attachment: PIG-5317_2.patch
[jira] [Updated] (PIG-5317) Upgrade old dependencies: commons-lang, hsqldb, commons-logging
[ https://issues.apache.org/jira/browse/PIG-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar updated PIG-5317: --- Attachment: PIG-5317_1.patch
[jira] [Updated] (PIG-5317) Upgrade old dependencies: commons-lang, hsqldb, commons-logging
[ https://issues.apache.org/jira/browse/PIG-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar updated PIG-5317: --- Attachment: (was: PIG-5317_1.patch)
[jira] [Updated] (PIG-5317) Upgrade old dependencies: commons-lang, hsqldb, commons-logging
[ https://issues.apache.org/jira/browse/PIG-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar updated PIG-5317: --- Attachment: PIG-5317_1.patch
[jira] [Commented] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2
[ https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283321#comment-16283321 ] Nandor Kollar commented on PIG-5318: Attached PIG-5318_6.patch: found a universal way to tell the current Spark version that works with both Spark 1.6.x and Spark 2.x, and there's no need to start a SparkContext. (thanks [~gezapeti] :) ) > Unit test failures on Pig on Spark with Spark 2.2 > - > > Key: PIG-5318 > URL: https://issues.apache.org/jira/browse/PIG-5318 > Project: Pig > Issue Type: Bug > Components: spark >Reporter: Nandor Kollar >Assignee: Nandor Kollar > Attachments: PIG-5318_1.patch, PIG-5318_2.patch, PIG-5318_3.patch, > PIG-5318_4.patch, PIG-5318_5.patch, PIG-5318_6.patch > > > There are several failing cases when executing the unit tests with Spark 2.2: > {code} > org.apache.pig.test.TestAssert#testNegativeWithoutFetch > org.apache.pig.test.TestAssert#testNegative > org.apache.pig.test.TestEvalPipeline2#testNonStandardDataWithoutFetch > org.apache.pig.test.TestScalarAliases#testScalarErrMultipleRowsInInput > org.apache.pig.test.TestStore#testCleanupOnFailureMultiStore > org.apache.pig.test.TestStoreInstances#testBackendStoreCommunication > org.apache.pig.test.TestStoreLocal#testCleanupOnFailureMultiStore > {code} > All of these are related to fixes/changes in Spark. > The TestAssert, TestScalarAliases and TestEvalPipeline2 failures could be fixed > by asserting on the message of the exception's root cause; it looks like on > Spark 2.2 the exception is wrapped in an additional layer. > The TestStore and TestStoreLocal failures are also test-related problems: it looks > like SPARK-7953 is fixed in Spark 2.2. > The root cause of the TestStoreInstances failure is yet to be determined. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
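The root-cause assertion mentioned in the description can be sketched like this. It is an illustrative standalone example, not the actual test code in the patch; the helper name rootCause is made up:

```java
// Illustrative sketch (not Pig's actual test code) of asserting on the
// root cause of an exception rather than its top-level message, since
// Spark 2.2 wraps the original failure in an additional exception layer.
public class RootCauseDemo {
    // Walk the cause chain to its end, guarding against self-referential causes.
    static Throwable rootCause(Throwable t) {
        while (t.getCause() != null && t.getCause() != t) {
            t = t.getCause();
        }
        return t;
    }

    public static void main(String[] args) {
        // Simulate the extra wrapping layer: the original failure sits two
        // levels below the exception a test would catch.
        RuntimeException original = new RuntimeException("Assertion violated");
        RuntimeException wrapped = new RuntimeException("Job aborted",
                new RuntimeException("Stage failed", original));

        // Asserting on wrapped.getMessage() would break when the wrapping
        // depth changes between Spark versions; the root cause is stable.
        System.out.println(rootCause(wrapped).getMessage());
    }
}
```

This keeps the assertion valid regardless of how many wrapper exceptions a given Spark version adds around the failure.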
[jira] [Updated] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2
[ https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar updated PIG-5318: --- Attachment: PIG-5318_6.patch
[jira] [Commented] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2
[ https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283255#comment-16283255 ] Nandor Kollar commented on PIG-5318: [~kellyzly] thanks for the explanation. In this case I think enabling this test is fine, and there's no need to check the Spark version; we don't support older Spark versions.
[jira] [Commented] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2
[ https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16281815#comment-16281815 ] Nandor Kollar commented on PIG-5318: Attached PIG-5318_5.patch, which includes fixes for the TestAssert, TestScalarAliases, TestEvalPipeline2, TestStore and TestStoreLocal test cases but doesn't fix the TestStoreInstances failure. The Spark version is determined as Rohini suggested. I also noticed that testKeepGoigFailed (fixed the typo in the method name, now testKeepGoingFailed) was excluded from the spark exec type; I enabled this test case, since it passed in my environment with Spark 1.6, 2.1 and 2.2. [~kellyzly] do you remember why this was excluded? Looks like the Jira it refers to is not yet fixed; despite this, the test passes with Spark 1.6.x.
[jira] [Updated] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2
[ https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar updated PIG-5318: --- Attachment: PIG-5318_5.patch
[jira] [Created] (PIG-5319) Investigate why TestStoreInstances fails with Spark 2.2
Nandor Kollar created PIG-5319: -- Summary: Investigate why TestStoreInstances fails with Spark 2.2 Key: PIG-5319 URL: https://issues.apache.org/jira/browse/PIG-5319 Project: Pig Issue Type: Bug Components: spark Reporter: Nandor Kollar The TestStoreInstances unit test fails with Spark 2.2.x. It seems the job and task commit logic changed a lot since Spark 2.1.x: now it looks like Spark uses one PigOutputFormat instance when writing to files, and a different one when getting the OutputCommitters.
[jira] [Commented] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2
[ https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16276747#comment-16276747 ] Nandor Kollar commented on PIG-5318: Attached PIG-5318_4.patch: it looks like the way I wanted to tell the Spark version doesn't work on Spark 1.x, so I'm using SparkContext#version instead.
[jira] [Updated] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2
[ https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar updated PIG-5318: --- Attachment: PIG-5318_4.patch
[jira] [Updated] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2
[ https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar updated PIG-5318: --- Attachment: PIG-5318_3.patch
[jira] [Commented] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2
[ https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16276557#comment-16276557 ] Nandor Kollar commented on PIG-5318: bq. You should just do isSpark2_x (sparkVersion.startsWith("2.")) instead of isSpark2_2_x. If Spark 2.3 gets released, then the code will have to change. You're right, but matching 2.x is not good enough. On Spark 2.1, abortTask and abortJob are not called (see SPARK-7953), but this appears to be fixed in Spark 2.2. I'll update the patch soon; we should match Spark 2.2+. bq. Spark should consistently use the same OutputFormat instance in this case OK, so I guess this should be a new Jira for Spark; however, Spark 2.2 is already released and creates multiple OutputFormat instances, as described before. Indeed, we shouldn't modify the test case, but how about modifying PigOutputFormat as I did in the patch (making the relevant variables static)?
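A minimal sketch of the "match Spark 2.2 or later" check being discussed, instead of a plain {{startsWith("2.")}} test. The class and method names are illustrative, not Pig's actual API:

```java
// Hypothetical sketch: decide whether the running Spark is 2.2 or later,
// given a version string from SparkContext#version (e.g. "1.6.0", "2.1.1", "2.2.0").
// Class and method names are assumptions for illustration only.
public class SparkVersionSketch {

    public static boolean isSpark22OrLater(String version) {
        String[] parts = version.split("\\.");
        int major = Integer.parseInt(parts[0]);
        int minor = parts.length > 1 ? Integer.parseInt(parts[1]) : 0;
        // true for 2.2.x, 2.3.x, 3.x...; false for 1.x and 2.0/2.1
        return major > 2 || (major == 2 && minor >= 2);
    }
}
```

This way a future Spark 2.3 release matches without any code change, while 2.1 (where SPARK-7953 is not yet fixed) is still excluded.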
[jira] [Commented] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2
[ https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16274385#comment-16274385 ] Nandor Kollar commented on PIG-5318: Attached PIG-5318_2.patch, which addresses Rohini's comments. As for the {{TestStoreInstances}} failure, it looks like Spark (unlike Tez and MapReduce) creates multiple instances of {{PigOutputFormat}} while setting up the output committers: [setupCommitter|https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/internal/io/HadoopMapReduceCommitProtocol.scala#L74] is called from both [setupJob|https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/internal/io/HadoopMapReduceCommitProtocol.scala#L138] and [setupTask|https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/internal/io/HadoopMapReduceCommitProtocol.scala#L165], and {{setupCommitter}} creates a new {{PigOutputFormat}} each time, saving it in a private variable. In addition, when Spark writes to files, a new {{PigOutputFormat}} is [getting created|https://github.com/apache/spark/blob/branch-2.2/core/src/main/scala/org/apache/spark/internal/io/SparkHadoopMapReduceWriter.scala#L75] too. Since POStores are saved and deserialized in the configuration, but the StoreFuncInterface inside the stores is [transient|https://github.com/apache/pig/blob/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POStore.java#L53], a new instance of {{STFuncCheckInstances}} is created each time, so {{putNext}} and {{commitTask}} will use different array instances. I'm not sure whether this is a bug in Pig or in Spark: should Spark consistently use the same OutputFormat instance in this case? Making {{reduceStores}}, {{mapStores}} and {{currentConf}} static inside {{TestStoreInstances}} would solve the problem. [~rohini], [~kellyzly], what do you think about this solution?
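A sketch of the failure mode and the proposed static-state workaround: because Spark deserializes several instances of the store function, per-instance state written in {{putNext}} is invisible to the {{commitTask}} call on a different instance; making the shared state static keeps it visible across instances in the same JVM. The class and field names below are illustrative, not the actual test classes:

```java
// Hypothetical sketch of the workaround discussed above: shared static state
// survives Spark creating multiple deserialized instances of the store func.
// Class, method, and field names are illustrative only.
import java.util.ArrayList;
import java.util.List;

class StoreFuncWithSharedState {
    // static: shared by every instance in this JVM, so records collected by
    // putNext on one instance are visible to commitTask on another
    private static final List<String> rowsWritten = new ArrayList<>();

    void putNext(String tuple) {
        synchronized (rowsWritten) {
            rowsWritten.add(tuple);
        }
    }

    int commitTask() {
        synchronized (rowsWritten) {
            // with an instance (non-static) field this would be 0 here,
            // because Spark may call commitTask on a different instance
            return rowsWritten.size();
        }
    }
}
```

With instance fields, the list checked in {{commitTask}} would be the freshly-deserialized instance's empty one, which matches the empty-array symptom described above.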
[jira] [Updated] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2
[ https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar updated PIG-5318: --- Attachment: PIG-5318_2.patch
[jira] [Commented] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2
[ https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16272501#comment-16272501 ] Nandor Kollar commented on PIG-5318: Thanks [~rohini] and [~kellyzly] for your review! Hm, I think I now understand the point of TestStoreInstances, and indeed my change to that test looks pointless. I'm afraid this might be a bug and not a test issue. I'll continue investigating why it is failing and how to fix it; so far it looks like commitTask is not called on the correct OutputCommitterTestInstances instance, and the array is empty.
[jira] [Updated] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2
[ https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar updated PIG-5318: --- Attachment: (was: PIG-5318_2.patch)
[jira] [Updated] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2
[ https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar updated PIG-5318: --- Attachment: PIG-5318_2.patch
[jira] [Updated] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2
[ https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar updated PIG-5318: --- Attachment: (was: PIG-5318_2.patch)
[jira] [Updated] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2
[ https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar updated PIG-5318: --- Attachment: PIG-5318_2.patch
[jira] [Updated] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2
[ https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar updated PIG-5318: --- Description: There are several failing cases when executing the unit tests with Spark 2.2: {code} org.apache.pig.test.TestAssert#testNegativeWithoutFetch org.apache.pig.test.TestAssert#testNegative org.apache.pig.test.TestEvalPipeline2#testNonStandardDataWithoutFetch org.apache.pig.test.TestScalarAliases#testScalarErrMultipleRowsInInput org.apache.pig.test.TestStore#testCleanupOnFailureMultiStore org.apache.pig.test.TestStoreInstances#testBackendStoreCommunication org.apache.pig.test.TestStoreLocal#testCleanupOnFailureMultiStore {code} All of these are related to fixes/changes in Spark. TestAssert, TestScalarAliases and TestEvalPipeline2 failures could be fixed by asserting on the message of the exception's root cause, looks like on Spark 2.2 the exception is wrapped into an additional layer. TestStore and TestStoreLocal failure are also a test related problems: looks like SPARK-7953 is fixed in Spark 2.2 The root cause of TestStoreInstances is yet to be found out. was: There are sever failing cases when executing the unit tests with Spark 2.2: {code} org.apache.pig.test.TestAssert#testNegativeWithoutFetch org.apache.pig.test.TestAssert#testNegative org.apache.pig.test.TestEvalPipeline2#testNonStandardDataWithoutFetch org.apache.pig.test.TestScalarAliases#testScalarErrMultipleRowsInInput org.apache.pig.test.TestStore#testCleanupOnFailureMultiStore org.apache.pig.test.TestStoreInstances#testBackendStoreCommunication org.apache.pig.test.TestStoreLocal#testCleanupOnFailureMultiStore {code} All of these are related to fixes/changes in Spark. TestAssert, TestScalarAliases and TestEvalPipeline2 failures could be fixed by asserting on the message of the exception's root cause, looks like on Spark 2.2 the exception is wrapped into an additional layer. 
TestStore and TestStoreLocal failure are also a test related problems: looks like SPARK-7953 is fixed in Spark 2.2 The root cause of TestStoreInstances is yet to be found out.
[jira] [Commented] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2
[ https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16270990#comment-16270990 ] Nandor Kollar commented on PIG-5318: [~szita], [~kellyzly] could you please review?
[jira] [Updated] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2
[ https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar updated PIG-5318: --- Status: Patch Available (was: Open)
[jira] [Updated] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2
[ https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar updated PIG-5318: --- Attachment: PIG-5318_1.patch
[jira] [Updated] (PIG-5316) Initialize mapred.task.id property for PoS jobs
[ https://issues.apache.org/jira/browse/PIG-5316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar updated PIG-5316: --- Attachment: PIG-5316_2.patch > Initialize mapred.task.id property for PoS jobs > --- > > Key: PIG-5316 > URL: https://issues.apache.org/jira/browse/PIG-5316 > Project: Pig > Issue Type: Improvement > Components: spark >Reporter: Adam Szita >Assignee: Nandor Kollar > Fix For: 0.18.0 > > Attachments: PIG-5316_1.patch, PIG-5316_2.patch > > > Some downstream systems may require the presence of {{mapred.task.id}} > property (e.g. HCatalog). This is currently not set when Pig On Spark jobs > are started. Let's initialise it.
[jira] [Updated] (PIG-5316) Initialize mapred.task.id property for PoS jobs
[ https://issues.apache.org/jira/browse/PIG-5316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar updated PIG-5316: --- Attachment: (was: PIG-5316_2.patch)
[jira] [Created] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2
Nandor Kollar created PIG-5318: -- Summary: Unit test failures on Pig on Spark with Spark 2.2 Key: PIG-5318 URL: https://issues.apache.org/jira/browse/PIG-5318 Project: Pig Issue Type: Bug Components: spark Reporter: Nandor Kollar Assignee: Nandor Kollar
[jira] [Updated] (PIG-5316) Initialize mapred.task.id property for PoS jobs
[ https://issues.apache.org/jira/browse/PIG-5316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar updated PIG-5316: --- Status: Patch Available (was: Reopened)
[jira] [Commented] (PIG-5316) Initialize mapred.task.id property for PoS jobs
[ https://issues.apache.org/jira/browse/PIG-5316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268832#comment-16268832 ] Nandor Kollar commented on PIG-5316: It looks like we can't create a TaskAttemptID with the default constructor; we should use the one that takes parameters to avoid NPEs on Hadoop 2.x. Using HadoopShims#getNewTaskAttemptID should solve this; attached PIG-5316_2.patch.
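For illustration, a self-contained sketch of building a synthetic task attempt id in the {{attempt_<jt>_<job>_<type>_<task>_<attempt>}} form Hadoop expects and exposing it as {{mapred.task.id}}. This mirrors the idea only; the actual patch uses HadoopShims#getNewTaskAttemptID, and the class below is hypothetical:

```java
// Hypothetical sketch: construct a Hadoop-style task attempt id string and
// set it as the mapred.task.id property, as the issue describes.
// Class and method names are illustrative; the real fix goes through
// HadoopShims#getNewTaskAttemptID.
import java.util.Properties;

public class TaskIdSketch {

    // Mirrors Hadoop's attempt_<jtIdentifier>_<jobId>_<type>_<taskId>_<attemptId>
    // layout, with "m" as the task type and Hadoop's usual zero padding.
    public static String newTaskAttemptId(String jtIdentifier, int jobId,
                                          int taskId, int attemptId) {
        return String.format("attempt_%s_%04d_m_%06d_%d",
                jtIdentifier, jobId, taskId, attemptId);
    }

    // Set the property so downstream systems (e.g. HCatalog) can read it.
    public static void initialize(Properties props) {
        props.setProperty("mapred.task.id", newTaskAttemptId("local", 1, 0, 0));
    }
}
```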
[jira] [Updated] (PIG-5316) Initialize mapred.task.id property for PoS jobs
[ https://issues.apache.org/jira/browse/PIG-5316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar updated PIG-5316: --- Attachment: PIG-5316_2.patch