[jira] [Commented] (PIG-5386) Pig local mode with bundled Hadoop broken

2019-06-26 Thread Nandor Kollar (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-5386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16873028#comment-16873028
 ] 

Nandor Kollar commented on PIG-5386:


Committed to trunk, thanks Rohini for the review!

> Pig local mode with bundled Hadoop broken
> -
>
> Key: PIG-5386
> URL: https://issues.apache.org/jira/browse/PIG-5386
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.17.0
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
>Priority: Minor
> Fix For: 0.18.0
>
> Attachments: PIG-5386_1.patch
>
>
> After compiling Pig, local mode doesn't work without installing Hadoop 
> (it is expected to use the bundled Hadoop), because commons-lang is not 
> copied to the h2 folder (only commons-lang3 is, but the bundled Hadoop 
> requires commons-lang too).
> I think it was broken by PIG-5317.
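The root cause above can be illustrated with a minimal sketch: commons-lang (package org.apache.commons.lang) and commons-lang3 (package org.apache.commons.lang3) are separate artifacts in different packages, so shipping only the latter leaves code compiled against the old package failing at runtime. The class names below are real Apache Commons classes; whether they resolve depends entirely on which jars are on the classpath when this runs.

```java
public class BundledDepCheck {
    // Returns true if the given class can be resolved on the current classpath.
    static boolean present(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // commons-lang and commons-lang3 live in different packages, so the
        // presence of one jar says nothing about the other.
        System.out.println("commons-lang:  "
                + present("org.apache.commons.lang.StringUtils"));
        System.out.println("commons-lang3: "
                + present("org.apache.commons.lang3.StringUtils"));
    }
}
```

Run against a Pig build directory's classpath, this would report whether each artifact actually made it into the bundled-Hadoop folder.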



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PIG-5386) Pig local mode with bundled Hadoop broken

2019-06-26 Thread Nandor Kollar (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-5386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-5386:
---
Fix Version/s: 0.18.0

> Pig local mode with bundled Hadoop broken
> -
>
> Key: PIG-5386
> URL: https://issues.apache.org/jira/browse/PIG-5386
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.17.0
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
>Priority: Minor
> Fix For: 0.18.0
>
> Attachments: PIG-5386_1.patch
>
>
> After compiling Pig, local mode doesn't work without installing Hadoop 
> (it is expected to use the bundled Hadoop), because commons-lang is not 
> copied to the h2 folder (only commons-lang3 is, but the bundled Hadoop 
> requires commons-lang too).
> I think it was broken by PIG-5317.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PIG-5386) Pig local mode with bundled Hadoop broken

2019-06-26 Thread Nandor Kollar (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-5386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-5386:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Pig local mode with bundled Hadoop broken
> -
>
> Key: PIG-5386
> URL: https://issues.apache.org/jira/browse/PIG-5386
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.17.0
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
>Priority: Minor
> Fix For: 0.18.0
>
> Attachments: PIG-5386_1.patch
>
>
> After compiling Pig, local mode doesn't work without installing Hadoop 
> (it is expected to use the bundled Hadoop), because commons-lang is not 
> copied to the h2 folder (only commons-lang3 is, but the bundled Hadoop 
> requires commons-lang too).
> I think it was broken by PIG-5317.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PIG-5387) Test failures on JRE 11

2019-05-21 Thread Nandor Kollar (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-5387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-5387:
---
Resolution: Fixed
Fix Version/s: 0.18.0
Status: Resolved  (was: Patch Available)

> Test failures on JRE 11
> ---
>
> Key: PIG-5387
> URL: https://issues.apache.org/jira/browse/PIG-5387
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.17.0
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
>Priority: Major
> Fix For: 0.18.0
>
> Attachments: PIG-5387_1.patch, PIG-5387_2.patch, PIG-5387_3.patch
>
>
> I tried to compile Pig with JDK 8 and execute the tests with Java 11, and 
> faced several test failures. For example, TestCommit#testCheckin2 failed 
> with the following exception:
> {code}
> 2019-05-08 16:06:09,712 WARN  [Thread-108] mapred.LocalJobRunner 
> (LocalJobRunner.java:run(590)) - job_local1000317333_0003
> java.lang.Exception: java.io.IOException: Deserialization error: null
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492)
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:552)
> Caused by: java.io.IOException: Deserialization error: null
>   at 
> org.apache.pig.impl.util.ObjectSerializer.deserialize(ObjectSerializer.java:62)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.setup(PigGenericMapBase.java:183)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
>   at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>   at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>   at java.base/java.lang.Thread.run(Thread.java:834)
> Caused by: java.lang.NullPointerException
>   at org.apache.pig.impl.plan.Operator.hashCode(Operator.java:106)
>   at java.base/java.util.HashMap.hash(HashMap.java:339)
>   at java.base/java.util.HashMap.readObject(HashMap.java:1461)
>   at 
> java.base/jdk.internal.reflect.GeneratedMethodAccessor12.invoke(Unknown 
> Source)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at 
> java.base/java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1160)
> {code}
> Deserialization of one of the map plans failed; it appears we ran into 
> [JDK-8201131|https://bugs.openjdk.java.net/browse/JDK-8201131]. It seems that 
> the workaround in the issue report works: adding a readObject method to 
> org.apache.pig.impl.plan.Operator:
> {code}
> private void readObject(ObjectInputStream in) throws 
> ClassNotFoundException, IOException {
>     in.defaultReadObject();
> }
> {code}
> solves the problem; however, I'm not sure that this is the optimal solution.
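The shape of the quoted workaround can be exercised in isolation. The sketch below uses a hypothetical stand-in class (not the real org.apache.pig.impl.plan.Operator) with the same delegating readObject, and round-trips it through Java serialization:

```java
import java.io.*;

public class ReadObjectSketch {
    // Hypothetical stand-in for a serializable plan operator; only the shape
    // of the readObject workaround matches the snippet from the issue.
    static class Op implements Serializable {
        private static final long serialVersionUID = 1L;
        String name;

        Op(String name) { this.name = name; }

        // The workaround: an explicit readObject that only delegates to
        // defaultReadObject(), which changes the path the object stream takes
        // when restoring the instance.
        private void readObject(ObjectInputStream in)
                throws ClassNotFoundException, IOException {
            in.defaultReadObject();
        }
    }

    // Serializes an operator and reads it back, returning the restored name.
    static String roundTrip(String name) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(new Op(name));
        }
        try (ObjectInputStream ois = new ObjectInputStream(
                new ByteArrayInputStream(bos.toByteArray()))) {
            return ((Op) ois.readObject()).name;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip("op-1"));  // prints "op-1"
    }
}
```

The delegating readObject preserves the default serialized form, which is why it is a low-risk workaround even if it is not the optimal fix.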



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (PIG-5387) Test failures on JRE 11

2019-05-21 Thread Nandor Kollar (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-5387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16844757#comment-16844757
 ] 

Nandor Kollar commented on PIG-5387:


Committed to trunk, thanks Rohini, Adam and Koji for the reviews!

> Test failures on JRE 11
> ---
>
> Key: PIG-5387
> URL: https://issues.apache.org/jira/browse/PIG-5387
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.17.0
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
>Priority: Major
> Attachments: PIG-5387_1.patch, PIG-5387_2.patch, PIG-5387_3.patch
>
>
> I tried to compile Pig with JDK 8 and execute the tests with Java 11, and 
> faced several test failures. For example, TestCommit#testCheckin2 failed 
> with the following exception:
> {code}
> 2019-05-08 16:06:09,712 WARN  [Thread-108] mapred.LocalJobRunner 
> (LocalJobRunner.java:run(590)) - job_local1000317333_0003
> java.lang.Exception: java.io.IOException: Deserialization error: null
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492)
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:552)
> Caused by: java.io.IOException: Deserialization error: null
>   at 
> org.apache.pig.impl.util.ObjectSerializer.deserialize(ObjectSerializer.java:62)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.setup(PigGenericMapBase.java:183)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
>   at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>   at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>   at java.base/java.lang.Thread.run(Thread.java:834)
> Caused by: java.lang.NullPointerException
>   at org.apache.pig.impl.plan.Operator.hashCode(Operator.java:106)
>   at java.base/java.util.HashMap.hash(HashMap.java:339)
>   at java.base/java.util.HashMap.readObject(HashMap.java:1461)
>   at 
> java.base/jdk.internal.reflect.GeneratedMethodAccessor12.invoke(Unknown 
> Source)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at 
> java.base/java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1160)
> {code}
> Deserialization of one of the map plans failed; it appears we ran into 
> [JDK-8201131|https://bugs.openjdk.java.net/browse/JDK-8201131]. It seems that 
> the workaround in the issue report works: adding a readObject method to 
> org.apache.pig.impl.plan.Operator:
> {code}
> private void readObject(ObjectInputStream in) throws 
> ClassNotFoundException, IOException {
>     in.defaultReadObject();
> }
> {code}
> solves the problem; however, I'm not sure that this is the optimal solution.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PIG-5387) Test failures on JRE 11

2019-05-16 Thread Nandor Kollar (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-5387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-5387:
---
Attachment: PIG-5387_3.patch

> Test failures on JRE 11
> ---
>
> Key: PIG-5387
> URL: https://issues.apache.org/jira/browse/PIG-5387
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.17.0
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
>Priority: Major
> Attachments: PIG-5387_1.patch, PIG-5387_2.patch, PIG-5387_3.patch
>
>
> I tried to compile Pig with JDK 8 and execute the tests with Java 11, and 
> faced several test failures. For example, TestCommit#testCheckin2 failed 
> with the following exception:
> {code}
> 2019-05-08 16:06:09,712 WARN  [Thread-108] mapred.LocalJobRunner 
> (LocalJobRunner.java:run(590)) - job_local1000317333_0003
> java.lang.Exception: java.io.IOException: Deserialization error: null
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492)
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:552)
> Caused by: java.io.IOException: Deserialization error: null
>   at 
> org.apache.pig.impl.util.ObjectSerializer.deserialize(ObjectSerializer.java:62)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.setup(PigGenericMapBase.java:183)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
>   at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>   at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>   at java.base/java.lang.Thread.run(Thread.java:834)
> Caused by: java.lang.NullPointerException
>   at org.apache.pig.impl.plan.Operator.hashCode(Operator.java:106)
>   at java.base/java.util.HashMap.hash(HashMap.java:339)
>   at java.base/java.util.HashMap.readObject(HashMap.java:1461)
>   at 
> java.base/jdk.internal.reflect.GeneratedMethodAccessor12.invoke(Unknown 
> Source)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at 
> java.base/java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1160)
> {code}
> Deserialization of one of the map plans failed; it appears we ran into 
> [JDK-8201131|https://bugs.openjdk.java.net/browse/JDK-8201131]. It seems that 
> the workaround in the issue report works: adding a readObject method to 
> org.apache.pig.impl.plan.Operator:
> {code}
> private void readObject(ObjectInputStream in) throws 
> ClassNotFoundException, IOException {
>     in.defaultReadObject();
> }
> {code}
> solves the problem; however, I'm not sure that this is the optimal solution.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (PIG-5388) Upgrade to Avro 1.9.x

2019-05-16 Thread Nandor Kollar (JIRA)
Nandor Kollar created PIG-5388:
--

 Summary: Upgrade to Avro 1.9.x
 Key: PIG-5388
 URL: https://issues.apache.org/jira/browse/PIG-5388
 Project: Pig
  Issue Type: Improvement
Reporter: Nandor Kollar






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (PIG-5387) Test failures on JRE 11

2019-05-13 Thread Nandor Kollar (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-5387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16838374#comment-16838374
 ] 

Nandor Kollar commented on PIG-5387:


Attached PIG-5387_2.patch with a comment and the removal of registerNewResource 
from TestPigServerLocal. It seemed to me that registering the temp folder is 
unnecessary in this test case; it is required only in TestPigServer.

> Test failures on JRE 11
> ---
>
> Key: PIG-5387
> URL: https://issues.apache.org/jira/browse/PIG-5387
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.17.0
>Reporter: Nandor Kollar
>Priority: Major
> Attachments: PIG-5387_1.patch, PIG-5387_2.patch
>
>
> I tried to compile Pig with JDK 8 and execute the tests with Java 11, and 
> faced several test failures. For example, TestCommit#testCheckin2 failed 
> with the following exception:
> {code}
> 2019-05-08 16:06:09,712 WARN  [Thread-108] mapred.LocalJobRunner 
> (LocalJobRunner.java:run(590)) - job_local1000317333_0003
> java.lang.Exception: java.io.IOException: Deserialization error: null
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492)
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:552)
> Caused by: java.io.IOException: Deserialization error: null
>   at 
> org.apache.pig.impl.util.ObjectSerializer.deserialize(ObjectSerializer.java:62)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.setup(PigGenericMapBase.java:183)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
>   at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>   at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>   at java.base/java.lang.Thread.run(Thread.java:834)
> Caused by: java.lang.NullPointerException
>   at org.apache.pig.impl.plan.Operator.hashCode(Operator.java:106)
>   at java.base/java.util.HashMap.hash(HashMap.java:339)
>   at java.base/java.util.HashMap.readObject(HashMap.java:1461)
>   at 
> java.base/jdk.internal.reflect.GeneratedMethodAccessor12.invoke(Unknown 
> Source)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at 
> java.base/java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1160)
> {code}
> Deserialization of one of the map plans failed; it appears we ran into 
> [JDK-8201131|https://bugs.openjdk.java.net/browse/JDK-8201131]. It seems that 
> the workaround in the issue report works: adding a readObject method to 
> org.apache.pig.impl.plan.Operator:
> {code}
> private void readObject(ObjectInputStream in) throws 
> ClassNotFoundException, IOException {
>     in.defaultReadObject();
> }
> {code}
> solves the problem; however, I'm not sure that this is the optimal solution.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PIG-5387) Test failures on JRE 11

2019-05-13 Thread Nandor Kollar (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-5387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-5387:
---
Attachment: PIG-5387_2.patch

> Test failures on JRE 11
> ---
>
> Key: PIG-5387
> URL: https://issues.apache.org/jira/browse/PIG-5387
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.17.0
>Reporter: Nandor Kollar
>Priority: Major
> Attachments: PIG-5387_1.patch, PIG-5387_2.patch
>
>
> I tried to compile Pig with JDK 8 and execute the tests with Java 11, and 
> faced several test failures. For example, TestCommit#testCheckin2 failed 
> with the following exception:
> {code}
> 2019-05-08 16:06:09,712 WARN  [Thread-108] mapred.LocalJobRunner 
> (LocalJobRunner.java:run(590)) - job_local1000317333_0003
> java.lang.Exception: java.io.IOException: Deserialization error: null
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492)
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:552)
> Caused by: java.io.IOException: Deserialization error: null
>   at 
> org.apache.pig.impl.util.ObjectSerializer.deserialize(ObjectSerializer.java:62)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.setup(PigGenericMapBase.java:183)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
>   at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>   at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>   at java.base/java.lang.Thread.run(Thread.java:834)
> Caused by: java.lang.NullPointerException
>   at org.apache.pig.impl.plan.Operator.hashCode(Operator.java:106)
>   at java.base/java.util.HashMap.hash(HashMap.java:339)
>   at java.base/java.util.HashMap.readObject(HashMap.java:1461)
>   at 
> java.base/jdk.internal.reflect.GeneratedMethodAccessor12.invoke(Unknown 
> Source)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at 
> java.base/java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1160)
> {code}
> It deserialization of one of the map plan failed, it appears we ran into 
> [JDK-8201131|https://bugs.openjdk.java.net/browse/JDK-8201131]. I seems that 
> the workaround in the issue report works, adding a readObject method to 
> org.apache.pig.impl.plan.Operator:
> {code}
> private void readObject(ObjectInputStream in) throws 
> ClassNotFoundException, IOException {
> in.defaultReadObject();
> }
> {code}
> solves the problem, however I'm not sure that this is the optimal solution.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (PIG-5387) Test failures on JRE 11

2019-05-13 Thread Nandor Kollar (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-5387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16838353#comment-16838353
 ] 

Nandor Kollar commented on PIG-5387:


[~knoguchi], [~rohini] the class-loading-related changes are an ugly hack which 
I'm not proud of, but I didn't find any easy fix that is compatible with Java 11 
as well as Java 8, and this one seems to work. Fortunately it is only a test 
problem, not a production code issue.

Up to Java 8, the system class loader 
([AppClassLoader|https://github.com/openjdk/jdk/blob/jdk8-b120/jdk/src/share/classes/sun/misc/Launcher.java#L259])
 extends URLClassLoader, so casting it and registering the jar at runtime works 
fine. This has changed in Java 11 (I think since Java 9, but I didn't verify 
this): the system class loader no longer extends URLClassLoader (see 
[here|https://github.com/openjdk/jdk11u/blob/master/src/java.base/share/classes/jdk/internal/loader/ClassLoaders.java#L151]).
 I tried to create a new URLClassLoader and set the thread's context class 
loader to it, but Pig uses 
[getSystemResources|https://github.com/apache/pig/blob/trunk/src/org/apache/pig/PigServer.java#L575]
 to locate the jars, which asks the system class loader for the locations and 
won't find resources registered below it. That's why I came up with this hack: 
on Java 8 it works as before, while on Java 11 it registers the new 
URLClassLoader as the parent of the system class loader, so the jars are found.

I know that this is an ugly hack; I guess the correct answer to this problem 
would be Java 11 module loading, but to remain compatible with Java 8 we would 
have to introduce either several reflective invocations or something like a 
Java 11 shim. I don't think either option pays off for a single test fix like 
this one. The only incompatibility in this case is the casting of the system 
class loader to URLClassLoader, and since we don't do it in any other place 
apart from these two test classes, I don't think it causes any issue with 
registering commands in Pig. Refactoring registerNewResource into the Utils 
class is a good idea, as is adding a comment which briefly describes the 
aforementioned situation.
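The incompatibility described above is easy to observe directly. This minimal sketch only checks whether the system class loader can still be cast to URLClassLoader, which is the precondition for the old cast-and-register trick; it holds on Java 8 but not on Java 9 and later:

```java
import java.net.URLClassLoader;

public class SystemLoaderCheck {
    // True when the system class loader extends URLClassLoader, i.e. when the
    // legacy cast-based approach to registering jars at runtime can work.
    static boolean systemLoaderIsUrlClassLoader() {
        return ClassLoader.getSystemClassLoader() instanceof URLClassLoader;
    }

    public static void main(String[] args) {
        // Java 8 reports "1.8"; Java 9+ reports "9", "11", and so on.
        System.out.println("java.specification.version = "
                + System.getProperty("java.specification.version"));
        System.out.println("system loader is URLClassLoader: "
                + systemLoaderIsUrlClassLoader());
    }
}
```

On a stock JRE this check mirrors the comment's claim: the cast succeeds only on Java 8, which is why the test helper needs a version-dependent code path.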

> Test failures on JRE 11
> ---
>
> Key: PIG-5387
> URL: https://issues.apache.org/jira/browse/PIG-5387
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.17.0
>Reporter: Nandor Kollar
>Priority: Major
> Attachments: PIG-5387_1.patch
>
>
> I tried to compile Pig with JDK 8 and execute the tests with Java 11, and 
> faced several test failures. For example, TestCommit#testCheckin2 failed 
> with the following exception:
> {code}
> 2019-05-08 16:06:09,712 WARN  [Thread-108] mapred.LocalJobRunner 
> (LocalJobRunner.java:run(590)) - job_local1000317333_0003
> java.lang.Exception: java.io.IOException: Deserialization error: null
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492)
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:552)
> Caused by: java.io.IOException: Deserialization error: null
>   at 
> org.apache.pig.impl.util.ObjectSerializer.deserialize(ObjectSerializer.java:62)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.setup(PigGenericMapBase.java:183)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
>   at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>   at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>   at java.base/java.lang.Thread.run(Thread.java:834)
> Caused by: java.lang.NullPointerException
>   at org.apache.pig.impl.plan.Operator.hashCode(Operator.java:106)
>   at java.base/java.util.HashMap.hash(HashMap.java:339)
>   at java.base/java.util.HashMap.readObject(HashMap.java:1461)
>   at 
> java.base/jdk.internal.reflect.GeneratedMethodAccessor12.invoke(Unknown 
> Source)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at 
> java.base/java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1160)
> {code}
> Deserialization of one of the map plans failed; it appears we ran into 
> [JDK-8201131|https://bugs.openjdk.java.net/browse/JDK-8201131]. It seems that 
> the workaround in the issue report works: adding a readObject method to 
> org.apache.pig.impl.plan.Operator:
> {code}
> private void readObject(ObjectInputStream in) throws 
> ClassNotFoundException, IOException {
>     in.defaultReadObject();
> }
> {code}
> solves the problem; however, I'm not sure that this is the optimal solution.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (PIG-5387) Test failures on JRE 11

2019-05-10 Thread Nandor Kollar (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-5387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-5387:
---
Status: Patch Available  (was: Open)

> Test failures on JRE 11
> ---
>
> Key: PIG-5387
> URL: https://issues.apache.org/jira/browse/PIG-5387
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.17.0
>Reporter: Nandor Kollar
>Priority: Major
> Attachments: PIG-5387_1.patch
>
>
> I tried to compile Pig with JDK 8 and execute the tests with Java 11, and 
> faced several test failures. For example, TestCommit#testCheckin2 failed 
> with the following exception:
> {code}
> 2019-05-08 16:06:09,712 WARN  [Thread-108] mapred.LocalJobRunner 
> (LocalJobRunner.java:run(590)) - job_local1000317333_0003
> java.lang.Exception: java.io.IOException: Deserialization error: null
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492)
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:552)
> Caused by: java.io.IOException: Deserialization error: null
>   at 
> org.apache.pig.impl.util.ObjectSerializer.deserialize(ObjectSerializer.java:62)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.setup(PigGenericMapBase.java:183)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
>   at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>   at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>   at java.base/java.lang.Thread.run(Thread.java:834)
> Caused by: java.lang.NullPointerException
>   at org.apache.pig.impl.plan.Operator.hashCode(Operator.java:106)
>   at java.base/java.util.HashMap.hash(HashMap.java:339)
>   at java.base/java.util.HashMap.readObject(HashMap.java:1461)
>   at 
> java.base/jdk.internal.reflect.GeneratedMethodAccessor12.invoke(Unknown 
> Source)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at 
> java.base/java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1160)
> {code}
> Deserialization of one of the map plans failed; it appears we ran into 
> [JDK-8201131|https://bugs.openjdk.java.net/browse/JDK-8201131]. It seems that 
> the workaround in the issue report works: adding a readObject method to 
> org.apache.pig.impl.plan.Operator:
> {code}
> private void readObject(ObjectInputStream in) throws 
> ClassNotFoundException, IOException {
>     in.defaultReadObject();
> }
> {code}
> solves the problem; however, I'm not sure that this is the optimal solution.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PIG-5387) Test failures on JRE 11

2019-05-10 Thread Nandor Kollar (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-5387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-5387:
---
Attachment: PIG-5387_1.patch

> Test failures on JRE 11
> ---
>
> Key: PIG-5387
> URL: https://issues.apache.org/jira/browse/PIG-5387
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.17.0
>Reporter: Nandor Kollar
>Priority: Major
> Attachments: PIG-5387_1.patch
>
>
> I tried to compile Pig with JDK 8 and execute the tests with Java 11, and 
> faced several test failures. For example, TestCommit#testCheckin2 failed 
> with the following exception:
> {code}
> 2019-05-08 16:06:09,712 WARN  [Thread-108] mapred.LocalJobRunner 
> (LocalJobRunner.java:run(590)) - job_local1000317333_0003
> java.lang.Exception: java.io.IOException: Deserialization error: null
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492)
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:552)
> Caused by: java.io.IOException: Deserialization error: null
>   at 
> org.apache.pig.impl.util.ObjectSerializer.deserialize(ObjectSerializer.java:62)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.setup(PigGenericMapBase.java:183)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
>   at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>   at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>   at java.base/java.lang.Thread.run(Thread.java:834)
> Caused by: java.lang.NullPointerException
>   at org.apache.pig.impl.plan.Operator.hashCode(Operator.java:106)
>   at java.base/java.util.HashMap.hash(HashMap.java:339)
>   at java.base/java.util.HashMap.readObject(HashMap.java:1461)
>   at 
> java.base/jdk.internal.reflect.GeneratedMethodAccessor12.invoke(Unknown 
> Source)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at 
> java.base/java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1160)
> {code}
> Deserialization of one of the map plans failed; it appears we ran into 
> [JDK-8201131|https://bugs.openjdk.java.net/browse/JDK-8201131]. It seems that 
> the workaround in the issue report works: adding a readObject method to 
> org.apache.pig.impl.plan.Operator:
> {code}
> private void readObject(ObjectInputStream in) throws 
> ClassNotFoundException, IOException {
>     in.defaultReadObject();
> }
> {code}
> solves the problem; however, I'm not sure that this is the optimal solution.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (PIG-5387) Test failures on JRE 11

2019-05-08 Thread Nandor Kollar (JIRA)
Nandor Kollar created PIG-5387:
--

 Summary: Test failures on JRE 11
 Key: PIG-5387
 URL: https://issues.apache.org/jira/browse/PIG-5387
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.17.0
Reporter: Nandor Kollar


I tried to compile Pig with JDK 8 and execute the tests with Java 11, and faced 
several test failures. For example, TestCommit#testCheckin2 failed with the 
following exception:
{code}
2019-05-08 16:06:09,712 WARN  [Thread-108] mapred.LocalJobRunner 
(LocalJobRunner.java:run(590)) - job_local1000317333_0003
java.lang.Exception: java.io.IOException: Deserialization error: null
at 
org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492)
at 
org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:552)
Caused by: java.io.IOException: Deserialization error: null
at 
org.apache.pig.impl.util.ObjectSerializer.deserialize(ObjectSerializer.java:62)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.setup(PigGenericMapBase.java:183)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
at 
org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
at 
java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.lang.NullPointerException
at org.apache.pig.impl.plan.Operator.hashCode(Operator.java:106)
at java.base/java.util.HashMap.hash(HashMap.java:339)
at java.base/java.util.HashMap.readObject(HashMap.java:1461)
at 
java.base/jdk.internal.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at 
java.base/java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1160)
{code}

The deserialization of one of the map plans failed; it appears we ran into 
[JDK-8201131|https://bugs.openjdk.java.net/browse/JDK-8201131]. It seems that 
the workaround in the issue report works: adding a readObject method to 
org.apache.pig.impl.plan.Operator:
{code}
private void readObject(ObjectInputStream in) throws 
        ClassNotFoundException, IOException {
    in.defaultReadObject();
}
{code}
solves the problem; however, I'm not sure that this is the optimal solution.
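The workaround pattern can be exercised with a plain serialization round-trip. Below is a minimal sketch using a hypothetical Operator-like key class named {{Op}} (not Pig's actual code): the key's hashCode depends on a deserialized field, and the class declares the readObject method suggested in the JDK-8201131 workaround before it is re-hashed inside HashMap.readObject.

```java
import java.io.*;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for org.apache.pig.impl.plan.Operator: a HashMap key
// whose hashCode depends on a field restored during deserialization.
class Op implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String key;

    Op(String key) { this.key = key; }

    @Override public int hashCode() { return key.hashCode(); }
    @Override public boolean equals(Object o) {
        return o instanceof Op && ((Op) o).key.equals(key);
    }

    // The workaround from JDK-8201131: declare readObject and restore the
    // key's own fields via the default mechanism.
    private void readObject(ObjectInputStream in)
            throws ClassNotFoundException, IOException {
        in.defaultReadObject();
    }
}

public class RoundTrip {
    public static void main(String[] args) throws Exception {
        Map<Op, Integer> plan = new HashMap<>();
        plan.put(new Op("LOAD"), 1);

        // Serialize the map, then deserialize it; HashMap.readObject will
        // call hashCode on each key while rebuilding its buckets.
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(plan);
        }
        try (ObjectInputStream ois = new ObjectInputStream(
                new ByteArrayInputStream(bos.toByteArray()))) {
            @SuppressWarnings("unchecked")
            Map<Op, Integer> copy = (Map<Op, Integer>) ois.readObject();
            System.out.println(copy.get(new Op("LOAD"))); // prints 1
        }
    }
}
```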





[jira] [Updated] (PIG-5386) Pig local mode with bundled Hadoop broken

2019-04-08 Thread Nandor Kollar (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-5386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-5386:
---
Status: Patch Available  (was: Open)

> Pig local mode with bundled Hadoop broken
> -
>
> Key: PIG-5386
> URL: https://issues.apache.org/jira/browse/PIG-5386
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.17.0
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
>Priority: Minor
> Attachments: PIG-5386_1.patch
>
>
> After compiling Pig, local mode doesn't work without installing Hadoop 
> (it is expected to use the bundled Hadoop), because commons-lang is not copied 
> to the h2 folder (only commons-lang3 is, but the bundled mode requires 
> commons-lang too).
> I think it was broken by PIG-5317





[jira] [Updated] (PIG-5386) Pig local mode with bundled Hadoop broken

2019-04-08 Thread Nandor Kollar (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-5386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-5386:
---
Attachment: PIG-5386_1.patch

> Pig local mode with bundled Hadoop broken
> -
>
> Key: PIG-5386
> URL: https://issues.apache.org/jira/browse/PIG-5386
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.17.0
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
>Priority: Minor
> Attachments: PIG-5386_1.patch
>
>
> After compiling Pig, local mode doesn't work without installing Hadoop 
> (it is expected to use the bundled Hadoop), because commons-lang is not copied 
> to the h2 folder (only commons-lang3 is, but the bundled mode requires 
> commons-lang too).
> I think it was broken by PIG-5317





[jira] [Updated] (PIG-5386) Pig local mode with bundled Hadoop broken

2019-04-08 Thread Nandor Kollar (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-5386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-5386:
---
Description: 
After compiling Pig, local mode doesn't work without installing Hadoop (it is 
expected to use the bundled Hadoop), because commons-lang is not copied to the h2 
folder (only commons-lang3 is, but the bundled mode requires commons-lang too).

I think it was broken by PIG-5317

  was:After compiling Pig, local mode doesn't work without installing hadoop 
(expected to use the bundled hadoop), because commons-lang is not copied to h2 
folder (just commons-lang3, but bundled requires commons-lang too)


> Pig local mode with bundled Hadoop broken
> -
>
> Key: PIG-5386
> URL: https://issues.apache.org/jira/browse/PIG-5386
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.17.0
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
>Priority: Minor
> Attachments: PIG-5386_1.patch
>
>
> After compiling Pig, local mode doesn't work without installing Hadoop 
> (it is expected to use the bundled Hadoop), because commons-lang is not copied 
> to the h2 folder (only commons-lang3 is, but the bundled mode requires 
> commons-lang too).
> I think it was broken by PIG-5317





[jira] [Created] (PIG-5386) Pig local mode with bundled Hadoop broken

2019-04-08 Thread Nandor Kollar (JIRA)
Nandor Kollar created PIG-5386:
--

 Summary: Pig local mode with bundled Hadoop broken
 Key: PIG-5386
 URL: https://issues.apache.org/jira/browse/PIG-5386
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.17.0
Reporter: Nandor Kollar
Assignee: Nandor Kollar


After compiling Pig, local mode doesn't work without installing Hadoop (it is 
expected to use the bundled Hadoop), because commons-lang is not copied to the h2 
folder (only commons-lang3 is, but the bundled mode requires commons-lang too).





[jira] [Created] (PIG-5376) Upgrade Guava version

2019-01-14 Thread Nandor Kollar (JIRA)
Nandor Kollar created PIG-5376:
--

 Summary: Upgrade Guava version
 Key: PIG-5376
 URL: https://issues.apache.org/jira/browse/PIG-5376
 Project: Pig
  Issue Type: Bug
Reporter: Nandor Kollar
Assignee: Nandor Kollar


Pig uses Guava version 11.0, which on one hand is very old, and on the other hand 
is affected by known CVEs (for example CVE-2018-10237).





[jira] [Commented] (PIG-5374) Use CircularFifoBuffer in InterRecordReader

2019-01-08 Thread Nandor Kollar (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-5374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16736928#comment-16736928
 ] 

Nandor Kollar commented on PIG-5374:


+1

> Use CircularFifoBuffer in InterRecordReader
> ---
>
> Key: PIG-5374
> URL: https://issues.apache.org/jira/browse/PIG-5374
> Project: Pig
>  Issue Type: Bug
>Reporter: Adam Szita
>Assignee: Adam Szita
>Priority: Major
> Attachments: PIG-5374.0.patch
>
>
> We're currently using CircularFifoQueue in InterRecordReader, and it comes 
> from the commons-collections4 dependency. Hadoop 2.8 installations do not have 
> this dependency by default, so for now we should switch to the older 
> CircularFifoBuffer instead (which comes from commons-collections and is 
> present).
> We should open a separate ticket to investigate which libraries we should 
> update.





[jira] [Commented] (PIG-5373) InterRecordReader might skip records if certain sync markers are used

2019-01-03 Thread Nandor Kollar (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-5373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16733121#comment-16733121
 ] 

Nandor Kollar commented on PIG-5373:


+1 on PIG-5373.1.patch

> InterRecordReader might skip records if certain sync markers are used
> -
>
> Key: PIG-5373
> URL: https://issues.apache.org/jira/browse/PIG-5373
> Project: Pig
>  Issue Type: Bug
>Reporter: Adam Szita
>Assignee: Adam Szita
>Priority: Major
> Attachments: PIG-5373.0.patch, PIG-5373.1.patch
>
>
> Due to a bug in InterRecordReader#skipUntilMarkerOrSplitEndOrEOF(), it can 
> happen that sync markers are not identified while reading the interim binary 
> file used to hold data between jobs.
> In such files sync markers are placed upon writing, which later help during 
> reading the data. They are randomly generated, and it seems that in some 
> rare combinations of markers and the data preceding them, they cannot be found. 
> This can result in reading through all the bytes (looking for the marker), 
> reaching the split end or EOF, and extracting no records at all.
> This symptom is also observable from JobHistory stats: a job that is 
> affected by this issue will have tasks whose HDFS_BYTES_READ or 
> FILE_BYTES_READ is about equal to the number of bytes of the split, but which 
> at the same time have MAP_INPUT_RECORDS=0.
> One such (test) example is this:
> {code:java}
> marker: [-128, -128, 4] , data: [127, -1, 2, -128, -128, -128, 4, 1, 2, 
> 3]{code}
> Due to the bug, markers whose prefix overlaps with the preceding data chunk are 
> not seen by the reader.
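The test example quoted above can be checked with a simple sliding-window scan. This is an illustrative sketch, not Pig's actual InterRecordReader code: by comparing the full marker at every byte offset instead of skipping ahead after a partial match, the marker whose prefix overlaps the preceding -128 byte is still found.

```java
import java.util.Arrays;

// Illustrative sketch (not Pig's InterRecordReader): scan a byte array for a
// sync marker with a sliding window, so markers whose prefix overlaps the
// preceding data are still found.
public class MarkerScan {
    // Returns the index right after the first occurrence of marker, or -1.
    static int skipUntilMarker(byte[] data, byte[] marker) {
        for (int i = 0; i + marker.length <= data.length; i++) {
            // Compare the full window at every offset; never jump past a
            // partial match, which is how a naive scanner misses markers.
            if (Arrays.equals(Arrays.copyOfRange(data, i, i + marker.length),
                              marker)) {
                return i + marker.length;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] marker = { -128, -128, 4 };
        // The data from the ticket's test example: the marker's prefix
        // (-128, -128) overlaps with the extra -128 preceding it.
        byte[] data = { 127, -1, 2, -128, -128, -128, 4, 1, 2, 3 };
        // The marker starts at index 4, so the scan stops right after index 6.
        System.out.println(skipUntilMarker(data, marker)); // prints 7
    }
}
```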





[jira] [Commented] (PIG-5373) InterRecordReader might skip records if certain sync markers are used

2019-01-02 Thread Nandor Kollar (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-5373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732007#comment-16732007
 ] 

Nandor Kollar commented on PIG-5373:


I have one observation about the patch: to be future-proof, instead of 
CircularFifoBuffer from commons-collections I think we should use 
CircularFifoQueue from commons-collections4. On one hand, CircularFifoBuffer was 
removed from the latest commons-collections code; on the other hand, 
CircularFifoQueue is generic, so we can eliminate iterating through Object 
items and casting to Integer. Be aware of one thing: the semantics of isFull have 
changed, since CircularFifoQueue is never full. The isFull call should be replaced 
with {{queue.size() == queue.maxSize()}}.

> InterRecordReader might skip records if certain sync markers are used
> -
>
> Key: PIG-5373
> URL: https://issues.apache.org/jira/browse/PIG-5373
> Project: Pig
>  Issue Type: Bug
>Reporter: Adam Szita
>Assignee: Adam Szita
>Priority: Major
> Attachments: PIG-5373.0.patch
>
>
> Due to a bug in InterRecordReader#skipUntilMarkerOrSplitEndOrEOF(), it can 
> happen that sync markers are not identified while reading the interim binary 
> file used to hold data between jobs.
> In such files sync markers are placed upon writing, which later help during 
> reading the data. They are randomly generated, and it seems that in some 
> rare combinations of markers and the data preceding them, they cannot be found. 
> This can result in reading through all the bytes (looking for the marker), 
> reaching the split end or EOF, and extracting no records at all.
> This symptom is also observable from JobHistory stats: a job that is 
> affected by this issue will have tasks whose HDFS_BYTES_READ or 
> FILE_BYTES_READ is about equal to the number of bytes of the split, but which 
> at the same time have MAP_INPUT_RECORDS=0.
> One such (test) example is this:
> {code:java}
> marker: [-128, -128, 4] , data: [127, -1, 2, -128, -128, -128, 4, 1, 2, 
> 3]{code}
> Due to the bug, markers whose prefix overlaps with the preceding data chunk are 
> not seen by the reader.





[jira] [Updated] (PIG-5317) Upgrade old dependencies: commons-lang, hsqldb, commons-logging

2018-10-15 Thread Nandor Kollar (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-5317:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Upgrade old dependencies: commons-lang, hsqldb, commons-logging
> ---
>
> Key: PIG-5317
> URL: https://issues.apache.org/jira/browse/PIG-5317
> Project: Pig
>  Issue Type: Improvement
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
>Priority: Minor
> Fix For: 0.18.0
>
> Attachments: PIG-5317_1.patch, PIG-5317_2.patch, 
> PIG-5317_amend.patch, PIG-5317_without_new_dep.patch, 
> PIG-5317_without_new_dep_2.patch
>
>
> Pig depends on old versions of commons-lang, hsqldb and commons-logging. It 
> would be nice to upgrade these dependencies; for commons-lang, 
> Pig should depend on commons-lang3 instead (which is already present in the 
> ivy.xml).





[jira] [Commented] (PIG-5317) Upgrade old dependencies: commons-lang, hsqldb, commons-logging

2018-10-15 Thread Nandor Kollar (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16650021#comment-16650021
 ] 

Nandor Kollar commented on PIG-5317:


Thanks Rohini, Koji and Satish! I'll mark this Jira as resolved; in case there 
are still test failures due to this one, feel free to reopen it.

> Upgrade old dependencies: commons-lang, hsqldb, commons-logging
> ---
>
> Key: PIG-5317
> URL: https://issues.apache.org/jira/browse/PIG-5317
> Project: Pig
>  Issue Type: Improvement
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
>Priority: Minor
> Fix For: 0.18.0
>
> Attachments: PIG-5317_1.patch, PIG-5317_2.patch, 
> PIG-5317_amend.patch, PIG-5317_without_new_dep.patch, 
> PIG-5317_without_new_dep_2.patch
>
>
> Pig depends on old versions of commons-lang, hsqldb and commons-logging. It 
> would be nice to upgrade these dependencies; for commons-lang, 
> Pig should depend on commons-lang3 instead (which is already present in the 
> ivy.xml).





[jira] [Commented] (PIG-5317) Upgrade old dependencies: commons-lang, hsqldb, commons-logging

2018-09-20 Thread Nandor Kollar (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621847#comment-16621847
 ] 

Nandor Kollar commented on PIG-5317:


Thanks [~satishsaley], I think the one-line change to build.xml in 
PIG-5317_without_new_dep_2.patch should fix this problem. Since I can't easily 
create a cluster with Tez installed, could you please test on your end and tell 
me whether it indeed fixes this failure?

> Upgrade old dependencies: commons-lang, hsqldb, commons-logging
> ---
>
> Key: PIG-5317
> URL: https://issues.apache.org/jira/browse/PIG-5317
> Project: Pig
>  Issue Type: Improvement
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
>Priority: Minor
> Fix For: 0.18.0
>
> Attachments: PIG-5317_1.patch, PIG-5317_2.patch, 
> PIG-5317_amend.patch, PIG-5317_without_new_dep.patch, 
> PIG-5317_without_new_dep_2.patch
>
>
> Pig depends on old versions of commons-lang, hsqldb and commons-logging. It 
> would be nice to upgrade these dependencies; for commons-lang, 
> Pig should depend on commons-lang3 instead (which is already present in the 
> ivy.xml).





[jira] [Updated] (PIG-5317) Upgrade old dependencies: commons-lang, hsqldb, commons-logging

2018-09-20 Thread Nandor Kollar (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-5317:
---
Attachment: PIG-5317_without_new_dep_2.patch

> Upgrade old dependencies: commons-lang, hsqldb, commons-logging
> ---
>
> Key: PIG-5317
> URL: https://issues.apache.org/jira/browse/PIG-5317
> Project: Pig
>  Issue Type: Improvement
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
>Priority: Minor
> Fix For: 0.18.0
>
> Attachments: PIG-5317_1.patch, PIG-5317_2.patch, 
> PIG-5317_amend.patch, PIG-5317_without_new_dep.patch, 
> PIG-5317_without_new_dep_2.patch
>
>
> Pig depends on old versions of commons-lang, hsqldb and commons-logging. It 
> would be nice to upgrade these dependencies; for commons-lang, 
> Pig should depend on commons-lang3 instead (which is already present in the 
> ivy.xml).





[jira] [Commented] (PIG-5358) Remove hive-contrib jar from lib directory

2018-09-14 Thread Nandor Kollar (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-5358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614713#comment-16614713
 ] 

Nandor Kollar commented on PIG-5358:


+1

> Remove hive-contrib jar from lib directory
> --
>
> Key: PIG-5358
> URL: https://issues.apache.org/jira/browse/PIG-5358
> Project: Pig
>  Issue Type: Improvement
>  Components: build
>Reporter: Adam Szita
>Assignee: Adam Szita
>Priority: Minor
> Attachments: PIG-5358.0.patch
>
>
> As per HIVE-20020, the hive-contrib jar was moved out from under Hive's lib. We 
> 'export' some of our Hive dependencies into our lib folder too, and that 
> includes hive-contrib.jar, so in order to stay in sync with Hive we should 
> remove it too.
> We don't depend on this jar at runtime, so there's no use in it being in Pig's 
> lib dir anyway.





[jira] [Commented] (PIG-5191) Pig HBase 2.0.0 support

2018-08-29 Thread Nandor Kollar (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596467#comment-16596467
 ] 

Nandor Kollar commented on PIG-5191:


Thanks Adam, Rohini and Daniel for your reviews!

> Pig HBase 2.0.0 support
> ---
>
> Key: PIG-5191
> URL: https://issues.apache.org/jira/browse/PIG-5191
> Project: Pig
>  Issue Type: Improvement
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
>Priority: Major
> Fix For: 0.18.0
>
> Attachments: PIG-5191_1.patch, PIG-5191_2.patch
>
>
> Pig doesn't support HBase 2.0.0. Since the new HBase API introduces several 
> API changes, we should find a way to support both the 1.x and 2.x HBase APIs.





[jira] [Resolved] (PIG-5351) How to execute pig script from java

2018-07-27 Thread Nandor Kollar (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar resolved PIG-5351.

Resolution: Not A Problem

> How to execute pig script from java
> ---
>
> Key: PIG-5351
> URL: https://issues.apache.org/jira/browse/PIG-5351
> Project: Pig
>  Issue Type: Wish
>Reporter: Atul Raut
>Priority: Minor
>
> How to execute a pig script from a java class.
> I need to submit a Pig script from my java application to the Pig server (this 
> may be at any remote location), and that Pig server will execute the script 
> and return the result to my java application.
>  
> Following is my source,
> public static void main(String[] args) throws Exception {
>  System.setProperty("hadoop.home.dir", 
> "/Pig/hadoop-common-2.2.0-bin-master/");
>  
>  Properties props = new Properties();
>  props.setProperty("fs.default.name", "hdfs://192.168.102.179:8020");
>  props.setProperty("mapred.job.tracker", "192.168.102.179:8021");
>  props.setProperty("pig.use.overriden.hadoop.configs", "true");
> PigServer pig = new PigServer(ExecType.MAPREDUCE, props); 
>  pig.debugOn();
> pig.registerScript("A = LOAD '/apps/employee/sample.txt' USING 
> PigStorage();");
> }
>  
> Thank you in advance for your support.





[jira] [Commented] (PIG-5351) How to execute pig script from java

2018-07-27 Thread Nandor Kollar (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559673#comment-16559673
 ] 

Nandor Kollar commented on PIG-5351:


[~atul.raut] could you please use the users (u...@pig.apache.org) mailing list to 
ask questions about Pig? Jira should be used only to raise bug reports and real 
feature requests for Pig.

As for your question, you can interact with Pig via its CLI; it is not a service 
(like Hive). As far as I know there's no way to run a Pig "server" remotely: it 
doesn't expose any remote interface where you can submit scripts against 
it. The code you mention starts Pig locally and interacts with remote HDFS and 
MapReduce.

> How to execute pig script from java
> ---
>
> Key: PIG-5351
> URL: https://issues.apache.org/jira/browse/PIG-5351
> Project: Pig
>  Issue Type: Wish
>Reporter: Atul Raut
>Priority: Minor
>
> How to execute a pig script from a java class.
> I need to submit a Pig script from my java application to the Pig server (this 
> may be at any remote location), and that Pig server will execute the script 
> and return the result to my java application.
>  
> Following is my source,
> public static void main(String[] args) throws Exception {
>  System.setProperty("hadoop.home.dir", 
> "/Pig/hadoop-common-2.2.0-bin-master/");
>  
>  Properties props = new Properties();
>  props.setProperty("fs.default.name", "hdfs://192.168.102.179:8020");
>  props.setProperty("mapred.job.tracker", "192.168.102.179:8021");
>  props.setProperty("pig.use.overriden.hadoop.configs", "true");
> PigServer pig = new PigServer(ExecType.MAPREDUCE, props); 
>  pig.debugOn();
> pig.registerScript("A = LOAD '/apps/employee/sample.txt' USING 
> PigStorage();");
> }
>  
> Thank you in advance for your support.





[jira] [Commented] (PIG-5348) Tests that need a MiniCluster fail

2018-07-10 Thread Nandor Kollar (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-5348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538735#comment-16538735
 ] 

Nandor Kollar commented on PIG-5348:


[~nielsbasjes] are the ORC-related tests the only ones which failed? Those 
failures are caused by PIG-5317, and there's a patch which is supposed to fix it 
and is yet to be reviewed and committed (if you're familiar with ORC then your 
feedback is really appreciated, I'm not really familiar with it :) ). Could you 
please check whether it indeed makes those tests green again?

> Tests that need a MiniCluster fail
> --
>
> Key: PIG-5348
> URL: https://issues.apache.org/jira/browse/PIG-5348
> Project: Pig
>  Issue Type: Bug
>Reporter: Niels Basjes
>Priority: Major
>
> While working on PIG-2599 and PIG-5343 I found that some tests always fail.
> A good example is org.apache.pig.builtin.TestOrcStoragePushdown.
> The first rough assessment is that the tests that need a Hadoop MiniCluster to 
> run are the ones that fail.
>  





[jira] [Commented] (PIG-5191) Pig HBase 2.0.0 support

2018-07-10 Thread Nandor Kollar (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538190#comment-16538190
 ] 

Nandor Kollar commented on PIG-5191:


Any feedback on the second patch?

> Pig HBase 2.0.0 support
> ---
>
> Key: PIG-5191
> URL: https://issues.apache.org/jira/browse/PIG-5191
> Project: Pig
>  Issue Type: Improvement
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
>Priority: Major
> Fix For: 0.18.0
>
> Attachments: PIG-5191_1.patch, PIG-5191_2.patch
>
>
> Pig doesn't support HBase 2.0.0. Since the new HBase API introduces several 
> API changes, we should find a way to support both the 1.x and 2.x HBase APIs.





[jira] [Commented] (PIG-5317) Upgrade old dependencies: commons-lang, hsqldb, commons-logging

2018-07-10 Thread Nandor Kollar (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538189#comment-16538189
 ] 

Nandor Kollar commented on PIG-5317:


[~rohini] could you please have a look at PIG-5317_without_new_dep.patch? I hope 
it fixes the failing test cases; it would be nice to make everything green again.

> Upgrade old dependencies: commons-lang, hsqldb, commons-logging
> ---
>
> Key: PIG-5317
> URL: https://issues.apache.org/jira/browse/PIG-5317
> Project: Pig
>  Issue Type: Improvement
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
>Priority: Minor
> Fix For: 0.18.0
>
> Attachments: PIG-5317_1.patch, PIG-5317_2.patch, 
> PIG-5317_amend.patch, PIG-5317_without_new_dep.patch
>
>
> Pig depends on old versions of commons-lang, hsqldb and commons-logging. It 
> would be nice to upgrade these dependencies; for commons-lang, 
> Pig should depend on commons-lang3 instead (which is already present in the 
> ivy.xml).





[jira] [Commented] (PIG-5344) Update Apache HTTPD LogParser to latest version

2018-06-29 Thread Nandor Kollar (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-5344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527648#comment-16527648
 ] 

Nandor Kollar commented on PIG-5344:


Looks good to me!

> Update Apache HTTPD LogParser to latest version
> ---
>
> Key: PIG-5344
> URL: https://issues.apache.org/jira/browse/PIG-5344
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.18.0
>Reporter: Niels Basjes
>Assignee: Niels Basjes
>Priority: Major
> Attachments: PIG-5344-1.patch
>
>
> Similar to PIG-4717 this is to simply upgrade the 
> [logparser|https://github.com/nielsbasjes/logparser] library.
> I had to postpone this for a while because the latest version requires Java 8.
> I will simply update the version of the library.
> The new features are supported transparently.





[jira] [Resolved] (PIG-4822) Spark UT failures after merge from trunk

2018-06-18 Thread Nandor Kollar (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-4822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar resolved PIG-4822.

Resolution: Fixed

> Spark UT failures after merge from trunk
> 
>
> Key: PIG-4822
> URL: https://issues.apache.org/jira/browse/PIG-4822
> Project: Pig
>  Issue Type: Bug
>  Components: spark
>Reporter: Pallavi Rao
>Priority: Major
>  Labels: spork
> Fix For: spark-branch
>
>
> New failures:
> [junit] Tests run: 26, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 
> 106.46 sec
> [junit] Test org.apache.pig.test.TestEvalPipeline FAILED
> --
> [junit] Tests run: 14, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
> 55.261 sec
> [junit] Test org.apache.pig.test.TestMultiQuery FAILED
> --
> [junit] Tests run: 31, Failures: 1, Errors: 0, Skipped: 1, Time elapsed: 
> 76.905 sec
> [junit] Test org.apache.pig.test.TestPigRunner FAILED
> --
> [junit] Tests run: 7, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
> 43.832 sec
> [junit] Test org.apache.pig.test.TestPigServerLocal FAILED
> --
> [junit] Tests run: 71, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 
> 73.983 sec
> [junit] Test org.apache.pig.test.TestPruneColumn FAILED
> --
> [junit] Tests run: 31, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 7.692 sec
> [junit] Test org.apache.pig.test.TestSchema FAILED
> --
> [junit] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0 
> sec
> [junit] Test org.apache.pig.test.TestScriptLanguage FAILED
> --
> [junit] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0 
> sec
> [junit] Test org.apache.pig.test.TestScriptLanguageJavaScript FAILED
> --
> [junit] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0 
> sec
> [junit] Test org.apache.pig.test.TestScriptUDF FAILED
> --
> [junit] Tests run: 0, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 
> 6.21 sec
> [junit] Test org.apache.pig.test.TestSkewedJoin FAILED
> --
> [junit] Tests run: 0, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 
> 6.399 sec
> [junit] Test org.apache.pig.test.TestSplitStore FAILED
> --
> [junit] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0 
> sec
> [junit] Test org.apache.pig.test.TestStore FAILED
> --
> [junit] Tests run: 0, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 
> 5.11 sec
> [junit] Test org.apache.pig.test.TestStoreInstances FAILED
> --
> [junit] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0 
> sec
> [junit] Test org.apache.pig.test.TestStoreOld FAILED
> --
> [junit] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0 
> sec
> [junit] Test org.apache.pig.test.TestStreaming FAILED
> --
> [junit] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0 
> sec
> [junit] Test org.apache.pig.test.TestToolsPigServer FAILED
> --
> [junit] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0 
> sec
> [junit] Test org.apache.pig.test.TestUDF FAILED





[jira] [Resolved] (PIG-5122) data

2018-06-18 Thread Nandor Kollar (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-5122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar resolved PIG-5122.

Resolution: Not A Problem

> data
> 
>
> Key: PIG-5122
> URL: https://issues.apache.org/jira/browse/PIG-5122
> Project: Pig
>  Issue Type: Bug
>  Components: data
>Affects Versions: 0.16.0
>Reporter: muhammad hamdani
>Priority: Major
> Fix For: site
>
>






[jira] [Resolved] (PIG-5162) Fix failing e2e tests with spark exec type

2018-06-18 Thread Nandor Kollar (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-5162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar resolved PIG-5162.

   Resolution: Fixed
Fix Version/s: 0.17.0

> Fix failing e2e tests with spark exec type
> --
>
> Key: PIG-5162
> URL: https://issues.apache.org/jira/browse/PIG-5162
> Project: Pig
>  Issue Type: Bug
>  Components: spark
>Reporter: Nandor Kollar
>Priority: Major
> Fix For: spark-branch, 0.17.0
>
>
> Tests were executed on the spark branch in spark mode; the old Pig was also Pig 
> on the spark branch (the same), but executed in mapreduce mode.





[jira] [Commented] (PIG-4632) UT TestSplitCombine.test11 failed with unexpected end of schema when Parquet is 1.6.0+

2018-06-18 Thread Nandor Kollar (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-4632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16515483#comment-16515483
 ] 

Nandor Kollar commented on PIG-4632:


This issue was fixed on trunk as part of PIG-4092; however, it was not backported 
to the 0.15 branch.

> UT TestSplitCombine.test11 failed with unexpected end of schema when Parquet 
> is 1.6.0+
> --
>
> Key: PIG-4632
> URL: https://issues.apache.org/jira/browse/PIG-4632
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.15.0
>Reporter: Xiang Li
>Assignee: Xiang Li
>Priority: Minor
>
> unexpected end of schema
> java.lang.IllegalArgumentException: unexpected end of schema
> at 
> parquet.schema.MessageTypeParser$Tokenizer.nextToken(MessageTypeParser.java:62)
> at parquet.schema.MessageTypeParser.parse(MessageTypeParser.java:89)
> at 
> parquet.schema.MessageTypeParser.parseMessageType(MessageTypeParser.java:82)
> at parquet.hadoop.ParquetInputSplit.end(ParquetInputSplit.java:96)
> at parquet.hadoop.ParquetInputSplit.(ParquetInputSplit.java:92)
> at 
> org.apache.pig.test.TestSplitCombine.test11(TestSplitCombine.java:528)





[jira] [Commented] (PIG-5191) Pig HBase 2.0.0 support

2018-06-14 Thread Nandor Kollar (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16512108#comment-16512108
 ] 

Nandor Kollar commented on PIG-5191:


[~szita], [~daijy], [~rohini] do you think patch #2 looks good and is ready to 
be committed?

> Pig HBase 2.0.0 support
> ---
>
> Key: PIG-5191
> URL: https://issues.apache.org/jira/browse/PIG-5191
> Project: Pig
>  Issue Type: Improvement
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
>Priority: Major
> Fix For: 0.18.0
>
> Attachments: PIG-5191_1.patch, PIG-5191_2.patch
>
>
> Pig doesn't support HBase 2.0.0. Since the new HBase API introduces several 
> API changes, we should find a way to support both 1.x and 2.x HBase API.





[jira] [Commented] (PIG-5191) Pig HBase 2.0.0 support

2018-06-13 Thread Nandor Kollar (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16511189#comment-16511189
 ] 

Nandor Kollar commented on PIG-5191:


Updated the patch with the latest HBase dependency (2.0.0 is released). Apart from 
the additional dependencies, one minor change was required in the HBase-related 
test cases: setting the {{hbase.localcluster.assign.random.ports}} property (added 
a comment to the source file). Tested downstream; it passed without needing to 
modify the bin/pig script.

> Pig HBase 2.0.0 support
> ---
>
> Key: PIG-5191
> URL: https://issues.apache.org/jira/browse/PIG-5191
> Project: Pig
>  Issue Type: Improvement
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
>Priority: Major
> Fix For: 0.18.0
>
> Attachments: PIG-5191_1.patch, PIG-5191_2.patch
>
>
> Pig doesn't support HBase 2.0.0. Since the new HBase API introduces several 
> API changes, we should find a way to support both 1.x and 2.x HBase API.





[jira] [Updated] (PIG-5191) Pig HBase 2.0.0 support

2018-06-13 Thread Nandor Kollar (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-5191:
---
Attachment: PIG-5191_2.patch

> Pig HBase 2.0.0 support
> ---
>
> Key: PIG-5191
> URL: https://issues.apache.org/jira/browse/PIG-5191
> Project: Pig
>  Issue Type: Improvement
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
>Priority: Major
> Fix For: 0.18.0
>
> Attachments: PIG-5191_1.patch, PIG-5191_2.patch
>
>
> Pig doesn't support HBase 2.0.0. Since the new HBase API introduces several 
> API changes, we should find a way to support both 1.x and 2.x HBase API.





[jira] [Updated] (PIG-5191) Pig HBase 2.0.0 support

2018-06-13 Thread Nandor Kollar (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-5191:
---
Attachment: (was: PIG-5191_2.patch)

> Pig HBase 2.0.0 support
> ---
>
> Key: PIG-5191
> URL: https://issues.apache.org/jira/browse/PIG-5191
> Project: Pig
>  Issue Type: Improvement
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
>Priority: Major
> Fix For: 0.18.0
>
> Attachments: PIG-5191_1.patch
>
>
> Pig doesn't support HBase 2.0.0. Since the new HBase API introduces several 
> API changes, we should find a way to support both 1.x and 2.x HBase API.





[jira] [Updated] (PIG-5191) Pig HBase 2.0.0 support

2018-06-13 Thread Nandor Kollar (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-5191:
---
Attachment: PIG-5191_2.patch

> Pig HBase 2.0.0 support
> ---
>
> Key: PIG-5191
> URL: https://issues.apache.org/jira/browse/PIG-5191
> Project: Pig
>  Issue Type: Improvement
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
>Priority: Major
> Fix For: 0.18.0
>
> Attachments: PIG-5191_1.patch, PIG-5191_2.patch
>
>
> Pig doesn't support HBase 2.0.0. Since the new HBase API introduces several 
> API changes, we should find a way to support both 1.x and 2.x HBase API.





[jira] [Commented] (PIG-4092) Predicate pushdown for Parquet

2018-04-10 Thread Nandor Kollar (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16431929#comment-16431929
 ] 

Nandor Kollar commented on PIG-4092:


[~rohini] I upgraded Parquet to the latest version, and the constructor of 
ParquetInputSplit now [parses the schema | 
https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetInputSplit.java#L95],
 so an invalid schema is no longer allowed. I couldn't find the old code, but 
previously the constructor just set the private schema field; it didn't parse the 
given string. I guess in this test case we can use a simple dummy schema, since 
the schema is not important for the test.
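For context, Parquet's textual message-type format (the format MessageTypeParser 
accepts) is simple; a minimal dummy schema for a test could look something like 
this (the field names here are made up purely for illustration):

{code}
message dummy_schema {
  required int32 id;
  optional binary name (UTF8);
}
{code}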

> Predicate pushdown for Parquet
> --
>
> Key: PIG-4092
> URL: https://issues.apache.org/jira/browse/PIG-4092
> Project: Pig
>  Issue Type: Sub-task
>Reporter: Rohini Palaniswamy
>Assignee: Nandor Kollar
>Priority: Major
> Fix For: 0.18.0
>
> Attachments: PIG-4092_1.patch, PIG-4092_2.patch
>
>
> See:
> https://github.com/apache/incubator-parquet-mr/pull/4
> and:
> https://github.com/apache/incubator-parquet-mr/blob/master/parquet-column/src/main/java/parquet/filter2/predicate/FilterApi.java
> [~alexlevenson] is the main author of this API





[jira] [Updated] (PIG-4092) Predicate pushdown for Parquet

2018-04-05 Thread Nandor Kollar (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-4092:
---
Status: Patch Available  (was: Reopened)

> Predicate pushdown for Parquet
> --
>
> Key: PIG-4092
> URL: https://issues.apache.org/jira/browse/PIG-4092
> Project: Pig
>  Issue Type: Sub-task
>Reporter: Rohini Palaniswamy
>Assignee: Nandor Kollar
>Priority: Major
> Fix For: 0.18.0
>
> Attachments: PIG-4092_1.patch, PIG-4092_2.patch
>
>
> See:
> https://github.com/apache/incubator-parquet-mr/pull/4
> and:
> https://github.com/apache/incubator-parquet-mr/blob/master/parquet-column/src/main/java/parquet/filter2/predicate/FilterApi.java
> [~alexlevenson] is the main author of this API





[jira] [Updated] (PIG-4092) Predicate pushdown for Parquet

2018-04-05 Thread Nandor Kollar (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-4092:
---
Attachment: PIG-4092_2.patch

> Predicate pushdown for Parquet
> --
>
> Key: PIG-4092
> URL: https://issues.apache.org/jira/browse/PIG-4092
> Project: Pig
>  Issue Type: Sub-task
>Reporter: Rohini Palaniswamy
>Assignee: Nandor Kollar
>Priority: Major
> Fix For: 0.18.0
>
> Attachments: PIG-4092_1.patch, PIG-4092_2.patch
>
>
> See:
> https://github.com/apache/incubator-parquet-mr/pull/4
> and:
> https://github.com/apache/incubator-parquet-mr/blob/master/parquet-column/src/main/java/parquet/filter2/predicate/FilterApi.java
> [~alexlevenson] is the main author of this API





[jira] [Updated] (PIG-4092) Predicate pushdown for Parquet

2018-04-04 Thread Nandor Kollar (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-4092:
---
Attachment: PIG-4092_1.patch

> Predicate pushdown for Parquet
> --
>
> Key: PIG-4092
> URL: https://issues.apache.org/jira/browse/PIG-4092
> Project: Pig
>  Issue Type: Sub-task
>Reporter: Rohini Palaniswamy
>Assignee: Nandor Kollar
>Priority: Major
> Fix For: 0.18.0
>
> Attachments: PIG-4092_1.patch
>
>
> See:
> https://github.com/apache/incubator-parquet-mr/pull/4
> and:
> https://github.com/apache/incubator-parquet-mr/blob/master/parquet-column/src/main/java/parquet/filter2/predicate/FilterApi.java
> [~alexlevenson] is the main author of this API





[jira] [Updated] (PIG-4092) Predicate pushdown for Parquet

2018-04-04 Thread Nandor Kollar (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-4092:
---
Attachment: (was: PIG-4092_1.patch)

> Predicate pushdown for Parquet
> --
>
> Key: PIG-4092
> URL: https://issues.apache.org/jira/browse/PIG-4092
> Project: Pig
>  Issue Type: Sub-task
>Reporter: Rohini Palaniswamy
>Assignee: Nandor Kollar
>Priority: Major
> Fix For: 0.18.0
>
> Attachments: PIG-4092_1.patch
>
>
> See:
> https://github.com/apache/incubator-parquet-mr/pull/4
> and:
> https://github.com/apache/incubator-parquet-mr/blob/master/parquet-column/src/main/java/parquet/filter2/predicate/FilterApi.java
> [~alexlevenson] is the main author of this API





[jira] [Commented] (PIG-4092) Predicate pushdown for Parquet

2018-04-04 Thread Nandor Kollar (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425920#comment-16425920
 ] 

Nandor Kollar commented on PIG-4092:


I'm not sure if anyone uses this wrapper instead of the loader already present 
in Parquet, but I implemented the missing predicate pushdown interface and the 
delegation to the Parquet loader. I also updated the parquet-pig-bundle version 
to the latest one.

> Predicate pushdown for Parquet
> --
>
> Key: PIG-4092
> URL: https://issues.apache.org/jira/browse/PIG-4092
> Project: Pig
>  Issue Type: Sub-task
>Reporter: Rohini Palaniswamy
>Assignee: Nandor Kollar
>Priority: Major
> Fix For: 0.18.0
>
> Attachments: PIG-4092_1.patch
>
>
> See:
> https://github.com/apache/incubator-parquet-mr/pull/4
> and:
> https://github.com/apache/incubator-parquet-mr/blob/master/parquet-column/src/main/java/parquet/filter2/predicate/FilterApi.java
> [~alexlevenson] is the main author of this API





[jira] [Updated] (PIG-4092) Predicate pushdown for Parquet

2018-04-04 Thread Nandor Kollar (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-4092:
---
Attachment: PIG-4092_1.patch

> Predicate pushdown for Parquet
> --
>
> Key: PIG-4092
> URL: https://issues.apache.org/jira/browse/PIG-4092
> Project: Pig
>  Issue Type: Sub-task
>Reporter: Rohini Palaniswamy
>Assignee: Nandor Kollar
>Priority: Major
> Fix For: 0.18.0
>
> Attachments: PIG-4092_1.patch
>
>
> See:
> https://github.com/apache/incubator-parquet-mr/pull/4
> and:
> https://github.com/apache/incubator-parquet-mr/blob/master/parquet-column/src/main/java/parquet/filter2/predicate/FilterApi.java
> [~alexlevenson] is the main author of this API





[jira] [Commented] (PIG-5253) Pig Hadoop 3 support

2018-01-26 Thread Nandor Kollar (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16340799#comment-16340799
 ] 

Nandor Kollar commented on PIG-5253:


bq. Can't the hadoop 2 compiled pig.jar be directly used against the Hadoop 3 
cluster?
I think it can. I'll update my patch so that it doesn't compile against both 
Hadoop 3 and Hadoop 2.
bq. hadoop-site.xml was a Hadoop 1.x and pre-YARN thing
Ah, OK, I see. I'll get rid of hadoop-site.xml from MiniCluster.java then. I 
also noticed that HExecutionEngine has a reference to hadoop-site.xml as well; 
should I delete that reference too?

I'll upload an updated patch soon.

> Pig Hadoop 3 support
> 
>
> Key: PIG-5253
> URL: https://issues.apache.org/jira/browse/PIG-5253
> Project: Pig
>  Issue Type: Improvement
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
>Priority: Major
> Fix For: 0.18.0
>
>






[jira] [Updated] (PIG-5317) Upgrade old dependencies: commons-lang, hsqldb, commons-logging

2018-01-19 Thread Nandor Kollar (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-5317:
---
Status: Patch Available  (was: Reopened)

> Upgrade old dependencies: commons-lang, hsqldb, commons-logging
> ---
>
> Key: PIG-5317
> URL: https://issues.apache.org/jira/browse/PIG-5317
> Project: Pig
>  Issue Type: Improvement
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
>Priority: Minor
> Fix For: 0.18.0
>
> Attachments: PIG-5317_1.patch, PIG-5317_2.patch, 
> PIG-5317_amend.patch, PIG-5317_without_new_dep.patch
>
>
> Pig depends on old version of commons-lang, hsqldb and commons-logging. It 
> would be nice to upgrade the version of these dependencies, for commons-lang 
> Pig should depend on commons-lang3 instead (which is already present in the 
> ivy.xml)





[jira] [Work started] (PIG-5253) Pig Hadoop 3 support

2018-01-16 Thread Nandor Kollar (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on PIG-5253 started by Nandor Kollar.
--
> Pig Hadoop 3 support
> 
>
> Key: PIG-5253
> URL: https://issues.apache.org/jira/browse/PIG-5253
> Project: Pig
>  Issue Type: Improvement
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
>Priority: Major
> Fix For: 0.18.0
>
>






[jira] [Assigned] (PIG-5253) Pig Hadoop 3 support

2018-01-15 Thread Nandor Kollar (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar reassigned PIG-5253:
--

Assignee: Nandor Kollar  (was: Adam Szita)

> Pig Hadoop 3 support
> 
>
> Key: PIG-5253
> URL: https://issues.apache.org/jira/browse/PIG-5253
> Project: Pig
>  Issue Type: Improvement
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
>Priority: Major
> Fix For: 0.18.0
>
>






[jira] [Commented] (PIG-5317) Upgrade old dependencies: commons-lang, hsqldb, commons-logging

2018-01-05 Thread Nandor Kollar (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16313277#comment-16313277
 ] 

Nandor Kollar commented on PIG-5317:


Attached PIG-5317_without_new_dep.patch: it adds no new dependencies, but changes 
the thresholds for the failing cases. [~rohini], could you please help with a 
review? The new numbers were deduced empirically; I'm not sure if they are 
"correct", as I'm not really familiar with ORC predicate pushdown.

> Upgrade old dependencies: commons-lang, hsqldb, commons-logging
> ---
>
> Key: PIG-5317
> URL: https://issues.apache.org/jira/browse/PIG-5317
> Project: Pig
>  Issue Type: Improvement
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
>Priority: Minor
> Fix For: 0.18.0
>
> Attachments: PIG-5317_1.patch, PIG-5317_2.patch, 
> PIG-5317_amend.patch, PIG-5317_without_new_dep.patch
>
>
> Pig depends on old version of commons-lang, hsqldb and commons-logging. It 
> would be nice to upgrade the version of these dependencies, for commons-lang 
> Pig should depend on commons-lang3 instead (which is already present in the 
> ivy.xml)





[jira] [Updated] (PIG-5317) Upgrade old dependencies: commons-lang, hsqldb, commons-logging

2018-01-05 Thread Nandor Kollar (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-5317:
---
Attachment: PIG-5317_without_new_dep.patch

> Upgrade old dependencies: commons-lang, hsqldb, commons-logging
> ---
>
> Key: PIG-5317
> URL: https://issues.apache.org/jira/browse/PIG-5317
> Project: Pig
>  Issue Type: Improvement
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
>Priority: Minor
> Fix For: 0.18.0
>
> Attachments: PIG-5317_1.patch, PIG-5317_2.patch, 
> PIG-5317_amend.patch, PIG-5317_without_new_dep.patch
>
>
> Pig depends on old version of commons-lang, hsqldb and commons-logging. It 
> would be nice to upgrade the version of these dependencies, for commons-lang 
> Pig should depend on commons-lang3 instead (which is already present in the 
> ivy.xml)





[jira] [Commented] (PIG-5317) Upgrade old dependencies: commons-lang, hsqldb, commons-logging

2018-01-04 Thread Nandor Kollar (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16311379#comment-16311379
 ] 

Nandor Kollar commented on PIG-5317:


I took a look at the changes in RandomStringUtils, and I think this change is 
related to LANG-1286. It looks like this overflow fix changed RandomStringUtils' 
API too, though I don't know why it affects us.

> Upgrade old dependencies: commons-lang, hsqldb, commons-logging
> ---
>
> Key: PIG-5317
> URL: https://issues.apache.org/jira/browse/PIG-5317
> Project: Pig
>  Issue Type: Improvement
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
>Priority: Minor
> Fix For: 0.18.0
>
> Attachments: PIG-5317_1.patch, PIG-5317_2.patch, PIG-5317_amend.patch
>
>
> Pig depends on old version of commons-lang, hsqldb and commons-logging. It 
> would be nice to upgrade the version of these dependencies, for commons-lang 
> Pig should depend on commons-lang3 instead (which is already present in the 
> ivy.xml)





[jira] [Commented] (PIG-5317) Upgrade old dependencies: commons-lang, hsqldb, commons-logging

2018-01-03 Thread Nandor Kollar (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16309873#comment-16309873
 ] 

Nandor Kollar commented on PIG-5317:


Ouch, it looks like TestOrcStoragePushdown fails because of this patch. Attached 
PIG-5317_amend.patch: the documentation says that RandomStringUtils is 
deprecated and that RandomStringGenerator should be used instead. It looks like 
the behavior of this deprecated class changed in commons-lang3. [~rohini], could 
you please have a look at PIG-5317_amend.patch?

> Upgrade old dependencies: commons-lang, hsqldb, commons-logging
> ---
>
> Key: PIG-5317
> URL: https://issues.apache.org/jira/browse/PIG-5317
> Project: Pig
>  Issue Type: Improvement
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
>Priority: Minor
> Fix For: 0.18.0
>
> Attachments: PIG-5317_1.patch, PIG-5317_2.patch, PIG-5317_amend.patch
>
>
> Pig depends on old version of commons-lang, hsqldb and commons-logging. It 
> would be nice to upgrade the version of these dependencies, for commons-lang 
> Pig should depend on commons-lang3 instead (which is already present in the 
> ivy.xml)





[jira] [Updated] (PIG-5317) Upgrade old dependencies: commons-lang, hsqldb, commons-logging

2018-01-03 Thread Nandor Kollar (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-5317:
---
Attachment: PIG-5317_amend.patch

> Upgrade old dependencies: commons-lang, hsqldb, commons-logging
> ---
>
> Key: PIG-5317
> URL: https://issues.apache.org/jira/browse/PIG-5317
> Project: Pig
>  Issue Type: Improvement
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
>Priority: Minor
> Fix For: 0.18.0
>
> Attachments: PIG-5317_1.patch, PIG-5317_2.patch, PIG-5317_amend.patch
>
>
> Pig depends on old version of commons-lang, hsqldb and commons-logging. It 
> would be nice to upgrade the version of these dependencies, for commons-lang 
> Pig should depend on commons-lang3 instead (which is already present in the 
> ivy.xml)





[jira] [Reopened] (PIG-5317) Upgrade old dependencies: commons-lang, hsqldb, commons-logging

2018-01-03 Thread Nandor Kollar (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar reopened PIG-5317:


> Upgrade old dependencies: commons-lang, hsqldb, commons-logging
> ---
>
> Key: PIG-5317
> URL: https://issues.apache.org/jira/browse/PIG-5317
> Project: Pig
>  Issue Type: Improvement
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
>Priority: Minor
> Fix For: 0.18.0
>
> Attachments: PIG-5317_1.patch, PIG-5317_2.patch
>
>
> Pig depends on old version of commons-lang, hsqldb and commons-logging. It 
> would be nice to upgrade the version of these dependencies, for commons-lang 
> Pig should depend on commons-lang3 instead (which is already present in the 
> ivy.xml)





[jira] [Commented] (PIG-5317) Upgrade old dependencies: commons-lang, hsqldb, commons-logging

2018-01-02 Thread Nandor Kollar (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308353#comment-16308353
 ] 

Nandor Kollar commented on PIG-5317:


[~rohini] what do you think about PIG-5317_2.patch? Do you see any issue that 
could be caused by upgrading these dependencies?

> Upgrade old dependencies: commons-lang, hsqldb, commons-logging
> ---
>
> Key: PIG-5317
> URL: https://issues.apache.org/jira/browse/PIG-5317
> Project: Pig
>  Issue Type: Improvement
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
>Priority: Minor
> Attachments: PIG-5317_1.patch, PIG-5317_2.patch
>
>
> Pig depends on old version of commons-lang, hsqldb and commons-logging. It 
> would be nice to upgrade the version of these dependencies, for commons-lang 
> Pig should depend on commons-lang3 instead (which is already present in the 
> ivy.xml)





[jira] [Updated] (PIG-5320) TestCubeOperator#testRollupBasic is flaky on Spark 2.2

2017-12-14 Thread Nandor Kollar (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-5320:
---
Attachment: PIG-5320_2.patch

> TestCubeOperator#testRollupBasic is flaky on Spark 2.2
> --
>
> Key: PIG-5320
> URL: https://issues.apache.org/jira/browse/PIG-5320
> Project: Pig
>  Issue Type: Bug
>  Components: spark
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
> Attachments: PIG-5320_1.patch, PIG-5320_2.patch
>
>
> TestCubeOperator#testRollupBasic occasionally fails with
> {code}
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1002: Unable to 
> store alias c
>   at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1779)
>   at org.apache.pig.PigServer.registerQuery(PigServer.java:708)
>   at 
> org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:1110)
>   at 
> org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:512)
>   at 
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230)
>   at org.apache.pig.PigServer.registerScript(PigServer.java:781)
>   at org.apache.pig.PigServer.registerScript(PigServer.java:858)
>   at org.apache.pig.PigServer.registerScript(PigServer.java:821)
>   at org.apache.pig.test.Util.registerMultiLineQuery(Util.java:972)
>   at 
> org.apache.pig.test.TestCubeOperator.testRollupBasic(TestCubeOperator.java:124)
> Caused by: org.apache.pig.impl.plan.VisitorException: ERROR 0: fail to get 
> the rdds of this spark operator: 
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.visitSparkOp(JobGraphBuilder.java:115)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkOperator.visit(SparkOperator.java:140)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkOperator.visit(SparkOperator.java:37)
>   at 
> org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:87)
>   at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:46)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.launchPig(SparkLauncher.java:237)
>   at 
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:293)
>   at org.apache.pig.PigServer.launchPlan(PigServer.java:1475)
>   at 
> org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1460)
>   at org.apache.pig.PigServer.execute(PigServer.java:1449)
>   at org.apache.pig.PigServer.access$500(PigServer.java:119)
>   at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1774)
> Caused by: java.lang.RuntimeException: Unexpected job execution status RUNNING
>   at 
> org.apache.pig.tools.pigstats.spark.SparkStatsUtil.isJobSuccess(SparkStatsUtil.java:138)
>   at 
> org.apache.pig.tools.pigstats.spark.SparkPigStats.addJobStats(SparkPigStats.java:75)
>   at 
> org.apache.pig.tools.pigstats.spark.SparkStatsUtil.waitForJobAddStats(SparkStatsUtil.java:59)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.sparkOperToRDD(JobGraphBuilder.java:225)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.visitSparkOp(JobGraphBuilder.java:112)
> {code}
> I think the problem is that in JobStatisticCollector#waitForJobToEnd 
> {{sparkListener.wait()}} is not inside a loop, like suggested in wait's 
> javadoc:
> {code}
>  * As in the one argument version, interrupts and spurious wakeups are
>  * possible, and this method should always be used in a loop:
> {code}
> Thus due to a spurious wakeup, the wait might pass without a notify getting 
> called.





[jira] [Commented] (PIG-5320) TestCubeOperator#testRollupBasic is flaky on Spark 2.2

2017-12-13 Thread Nandor Kollar (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16289136#comment-16289136
 ] 

Nandor Kollar commented on PIG-5320:


I think this is a problem with Spark 1.6.x too; checking the condition in a 
loop should solve it. I also changed the map and set implementations to sorted 
ones: since we use integer job ids, I hope this slightly improves performance 
when there are many jobs. [~kellyzly], [~szita], could you please have a look at 
my patch? My only concern is: is SparkListener#onJobEnd() called when the job 
fails? If not, Pig would get stuck in an infinite loop.
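The wait-in-a-loop pattern can be sketched as follows. This is a hedged, 
self-contained illustration of the fix's shape, not the actual Pig source: the 
class and method names (JobEndWaiter, WaitLoopDemo) are made up for the example. 
The key point is that Object#wait() is re-checked against a condition in a loop, 
so a spurious wakeup cannot let the waiter proceed before onJobEnd() has run.

```java
import java.util.HashSet;
import java.util.Set;

// Sketch of the pattern only; in Pig, onJobEnd would be driven by
// SparkListener#onJobEnd and the waiter by JobStatisticCollector.
class JobEndWaiter {
    private final Set<Integer> finishedJobs = new HashSet<>();

    synchronized void waitForJobToEnd(int jobId) throws InterruptedException {
        // Loop, as the Object#wait javadoc recommends: re-check the condition
        // after every wakeup, whether it came from notifyAll() or was spurious.
        while (!finishedJobs.contains(jobId)) {
            wait();
        }
    }

    synchronized void onJobEnd(int jobId) {
        finishedJobs.add(jobId);
        notifyAll();  // wake all waiters; each re-checks its own condition
    }
}

public class WaitLoopDemo {
    public static void main(String[] args) throws Exception {
        JobEndWaiter waiter = new JobEndWaiter();
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(100);  // simulate the job running
            } catch (InterruptedException ignored) {
            }
            waiter.onJobEnd(42);
        });
        worker.start();
        waiter.waitForJobToEnd(42);  // returns only after onJobEnd(42) ran
        worker.join();
        System.out.println("job 42 finished");
    }
}
```

Whether the loop can still hang depends on the concern raised above: if the 
listener callback is never invoked for a failed job, no notifyAll() ever fires.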

> TestCubeOperator#testRollupBasic is flaky on Spark 2.2
> --
>
> Key: PIG-5320
> URL: https://issues.apache.org/jira/browse/PIG-5320
> Project: Pig
>  Issue Type: Bug
>  Components: spark
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
> Attachments: PIG-5320_1.patch
>
>
> TestCubeOperator#testRollupBasic occasionally fails with
> {code}
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1002: Unable to 
> store alias c
>   at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1779)
>   at org.apache.pig.PigServer.registerQuery(PigServer.java:708)
>   at 
> org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:1110)
>   at 
> org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:512)
>   at 
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230)
>   at org.apache.pig.PigServer.registerScript(PigServer.java:781)
>   at org.apache.pig.PigServer.registerScript(PigServer.java:858)
>   at org.apache.pig.PigServer.registerScript(PigServer.java:821)
>   at org.apache.pig.test.Util.registerMultiLineQuery(Util.java:972)
>   at 
> org.apache.pig.test.TestCubeOperator.testRollupBasic(TestCubeOperator.java:124)
> Caused by: org.apache.pig.impl.plan.VisitorException: ERROR 0: fail to get 
> the rdds of this spark operator: 
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.visitSparkOp(JobGraphBuilder.java:115)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkOperator.visit(SparkOperator.java:140)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkOperator.visit(SparkOperator.java:37)
>   at 
> org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:87)
>   at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:46)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.launchPig(SparkLauncher.java:237)
>   at 
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:293)
>   at org.apache.pig.PigServer.launchPlan(PigServer.java:1475)
>   at 
> org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1460)
>   at org.apache.pig.PigServer.execute(PigServer.java:1449)
>   at org.apache.pig.PigServer.access$500(PigServer.java:119)
>   at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1774)
> Caused by: java.lang.RuntimeException: Unexpected job execution status RUNNING
>   at 
> org.apache.pig.tools.pigstats.spark.SparkStatsUtil.isJobSuccess(SparkStatsUtil.java:138)
>   at 
> org.apache.pig.tools.pigstats.spark.SparkPigStats.addJobStats(SparkPigStats.java:75)
>   at 
> org.apache.pig.tools.pigstats.spark.SparkStatsUtil.waitForJobAddStats(SparkStatsUtil.java:59)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.sparkOperToRDD(JobGraphBuilder.java:225)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.visitSparkOp(JobGraphBuilder.java:112)
> {code}
> I think the problem is that in JobStatisticCollector#waitForJobToEnd 
> {{sparkListener.wait()}} is not inside a loop, like suggested in wait's 
> javadoc:
> {code}
>  * As in the one argument version, interrupts and spurious wakeups are
>  * possible, and this method should always be used in a loop:
> {code}
> Thus due to a spurious wakeup, the wait might pass without a notify getting 
> called.





[jira] [Updated] (PIG-5320) TestCubeOperator#testRollupBasic is flaky on Spark 2.2

2017-12-13 Thread Nandor Kollar (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-5320:
---
Attachment: PIG-5320_1.patch

> TestCubeOperator#testRollupBasic is flaky on Spark 2.2
> --
>
> Key: PIG-5320
> URL: https://issues.apache.org/jira/browse/PIG-5320
> Project: Pig
>  Issue Type: Bug
>  Components: spark
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
> Attachments: PIG-5320_1.patch
>
>
> TestCubeOperator#testRollupBasic occasionally fails with
> {code}
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1002: Unable to 
> store alias c
>   at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1779)
>   at org.apache.pig.PigServer.registerQuery(PigServer.java:708)
>   at 
> org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:1110)
>   at 
> org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:512)
>   at 
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230)
>   at org.apache.pig.PigServer.registerScript(PigServer.java:781)
>   at org.apache.pig.PigServer.registerScript(PigServer.java:858)
>   at org.apache.pig.PigServer.registerScript(PigServer.java:821)
>   at org.apache.pig.test.Util.registerMultiLineQuery(Util.java:972)
>   at 
> org.apache.pig.test.TestCubeOperator.testRollupBasic(TestCubeOperator.java:124)
> Caused by: org.apache.pig.impl.plan.VisitorException: ERROR 0: fail to get 
> the rdds of this spark operator: 
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.visitSparkOp(JobGraphBuilder.java:115)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkOperator.visit(SparkOperator.java:140)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkOperator.visit(SparkOperator.java:37)
>   at 
> org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:87)
>   at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:46)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.launchPig(SparkLauncher.java:237)
>   at 
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:293)
>   at org.apache.pig.PigServer.launchPlan(PigServer.java:1475)
>   at 
> org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1460)
>   at org.apache.pig.PigServer.execute(PigServer.java:1449)
>   at org.apache.pig.PigServer.access$500(PigServer.java:119)
>   at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1774)
> Caused by: java.lang.RuntimeException: Unexpected job execution status RUNNING
>   at 
> org.apache.pig.tools.pigstats.spark.SparkStatsUtil.isJobSuccess(SparkStatsUtil.java:138)
>   at 
> org.apache.pig.tools.pigstats.spark.SparkPigStats.addJobStats(SparkPigStats.java:75)
>   at 
> org.apache.pig.tools.pigstats.spark.SparkStatsUtil.waitForJobAddStats(SparkStatsUtil.java:59)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.sparkOperToRDD(JobGraphBuilder.java:225)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.visitSparkOp(JobGraphBuilder.java:112)
> {code}
> I think the problem is that in JobStatisticCollector#waitForJobToEnd, 
> {{sparkListener.wait()}} is not inside a loop, as suggested in wait's 
> javadoc:
> {code}
>  * As in the one argument version, interrupts and spurious wakeups are
>  * possible, and this method should always be used in a loop:
> {code}
> Thus, due to a spurious wakeup, the wait might return without notify ever 
> being called.
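For reference, the guarded-wait idiom that Object.wait()'s javadoc calls for can be sketched as follows. This is a minimal illustration, not the actual JobStatisticCollector code; the lock and flag names are assumptions made for the example.

```java
// A minimal sketch of the guarded-wait idiom recommended by Object.wait()'s
// javadoc. The class, lock, and flag names are illustrative; they are not
// the actual fields of Pig's JobStatisticCollector.
public class GuardedWait {
    private final Object lock = new Object();
    private boolean jobDone = false;

    /** Blocks until the job is marked done, re-checking after every wakeup. */
    public void waitForJobToEnd() throws InterruptedException {
        synchronized (lock) {
            // The loop is the fix: a spurious wakeup (or a wakeup before the
            // job actually finished) just re-tests the condition instead of
            // falsely ending the wait.
            while (!jobDone) {
                lock.wait();
            }
        }
    }

    public void markJobDone() {
        synchronized (lock) {
            jobDone = true;
            lock.notifyAll();
        }
    }

    public boolean isJobDone() {
        synchronized (lock) {
            return jobDone;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        GuardedWait g = new GuardedWait();
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(50); // simulate the Spark job running
            } catch (InterruptedException ignored) {
            }
            g.markJobDone();
        });
        worker.start();
        g.waitForJobToEnd(); // returns only once jobDone is true
        worker.join();
        System.out.println("job finished: " + g.isJobDone());
    }
}
```

A bare `sparkListener.wait()` without the `while` loop can return spuriously and report the job as ended while it is still RUNNING, which matches the failure seen in the stack trace above.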



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (PIG-5320) TestCubeOperator#testRollupBasic is flaky on Spark 2.2

2017-12-13 Thread Nandor Kollar (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-5320:
---
Attachment: (was: PIG-5320_1.patch)



[jira] [Updated] (PIG-5320) TestCubeOperator#testRollupBasic is flaky on Spark 2.2

2017-12-13 Thread Nandor Kollar (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-5320:
---
Status: Patch Available  (was: Open)



[jira] [Updated] (PIG-5320) TestCubeOperator#testRollupBasic is flaky on Spark 2.2

2017-12-13 Thread Nandor Kollar (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-5320:
---
Attachment: PIG-5320_1.patch



[jira] [Assigned] (PIG-5320) TestCubeOperator#testRollupBasic is flaky on Spark 2.2

2017-12-13 Thread Nandor Kollar (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar reassigned PIG-5320:
--

Assignee: Nandor Kollar



[jira] [Created] (PIG-5320) TestCubeOperator#testRollupBasic is flaky on Spark 2.2

2017-12-13 Thread Nandor Kollar (JIRA)
Nandor Kollar created PIG-5320:
--

 Summary: TestCubeOperator#testRollupBasic is flaky on Spark 2.2
 Key: PIG-5320
 URL: https://issues.apache.org/jira/browse/PIG-5320
 Project: Pig
  Issue Type: Bug
  Components: spark
Reporter: Nandor Kollar




[jira] [Commented] (PIG-5317) Upgrade old dependencies: commons-lang, hsqldb, commons-logging

2017-12-11 Thread Nandor Kollar (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16285800#comment-16285800
 ] 

Nandor Kollar commented on PIG-5317:


[~rohini] the database path is the same, it didn't change:
{code}
dbServer.setDatabasePath(0,
    "file:" + TMP_DIR + "batchtest;" +
    "hsqldb.default_table_type=cached;hsqldb.cache_rows=100;sql.enforce_strict_size=true");
{code}

I added the caching to this path now in PIG-5317_2.patch.

> Upgrade old dependencies: commons-lang, hsqldb, commons-logging
> ---
>
> Key: PIG-5317
> URL: https://issues.apache.org/jira/browse/PIG-5317
> Project: Pig
>  Issue Type: Improvement
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
>Priority: Minor
> Attachments: PIG-5317_1.patch, PIG-5317_2.patch
>
>
> Pig depends on old versions of commons-lang, hsqldb and commons-logging. It 
> would be nice to upgrade these dependencies; for commons-lang, Pig should 
> depend on commons-lang3 instead (which is already present in the ivy.xml).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (PIG-5317) Upgrade old dependencies: commons-lang, hsqldb, commons-logging

2017-12-11 Thread Nandor Kollar (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-5317:
---
Attachment: PIG-5317_2.patch



[jira] [Updated] (PIG-5317) Upgrade old dependencies: commons-lang, hsqldb, commons-logging

2017-12-08 Thread Nandor Kollar (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-5317:
---
Attachment: PIG-5317_1.patch



[jira] [Updated] (PIG-5317) Upgrade old dependencies: commons-lang, hsqldb, commons-logging

2017-12-08 Thread Nandor Kollar (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-5317:
---
Attachment: (was: PIG-5317_1.patch)



[jira] [Updated] (PIG-5317) Upgrade old dependencies: commons-lang, hsqldb, commons-logging

2017-12-08 Thread Nandor Kollar (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-5317:
---
Attachment: PIG-5317_1.patch



[jira] [Commented] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2

2017-12-08 Thread Nandor Kollar (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283321#comment-16283321
 ] 

Nandor Kollar commented on PIG-5318:


Attached PIG-5318_6.patch: found a universal way to tell the current Spark 
version that works with both Spark 1.6.x and Spark 2.x, and there's no need to 
start a SparkContext. (thanks [~gezapeti] :) )
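Once a version string is available, gating behavior on it might look like the sketch below. This is an illustrative helper only, not the code in PIG-5318_6.patch; how the patch actually obtains the version string is not shown in this thread.

```java
// Illustrative sketch: deciding behavior based on a Spark version string,
// e.g. "1.6.3" vs "2.2.0". Not the actual code from PIG-5318_6.patch.
public class SparkVersionCheck {
    /** Returns the major version of a dotted version string, e.g. "2.2.0" -> 2. */
    static int majorVersion(String version) {
        int dot = version.indexOf('.');
        return Integer.parseInt(dot < 0 ? version : version.substring(0, dot));
    }

    static boolean isSpark2OrLater(String version) {
        return majorVersion(version) >= 2;
    }

    public static void main(String[] args) {
        System.out.println(isSpark2OrLater("1.6.3")); // false
        System.out.println(isSpark2OrLater("2.2.0")); // true
    }
}
```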

> Unit test failures on Pig on Spark with Spark 2.2
> -
>
> Key: PIG-5318
> URL: https://issues.apache.org/jira/browse/PIG-5318
> Project: Pig
>  Issue Type: Bug
>  Components: spark
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
> Attachments: PIG-5318_1.patch, PIG-5318_2.patch, PIG-5318_3.patch, 
> PIG-5318_4.patch, PIG-5318_5.patch, PIG-5318_6.patch
>
>
> There are several failing cases when executing the unit tests with Spark 2.2:
> {code}
>  org.apache.pig.test.TestAssert#testNegativeWithoutFetch
>  org.apache.pig.test.TestAssert#testNegative
>  org.apache.pig.test.TestEvalPipeline2#testNonStandardDataWithoutFetch
>  org.apache.pig.test.TestScalarAliases#testScalarErrMultipleRowsInInput
>  org.apache.pig.test.TestStore#testCleanupOnFailureMultiStore
>  org.apache.pig.test.TestStoreInstances#testBackendStoreCommunication
>  org.apache.pig.test.TestStoreLocal#testCleanupOnFailureMultiStore
> {code}
> All of these are related to fixes/changes in Spark.
> The TestAssert, TestScalarAliases and TestEvalPipeline2 failures could be 
> fixed by asserting on the message of the exception's root cause; it looks 
> like on Spark 2.2 the exception is wrapped in an additional layer.
> The TestStore and TestStoreLocal failures are also test-related problems: it 
> looks like SPARK-7953 is fixed in Spark 2.2.
> The root cause of the TestStoreInstances failure is yet to be found.
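Asserting on the root cause rather than the top-level message could be done with a small cause-chain walk like the one below. This is a hedged sketch of the general technique, not the actual test code from the patch.

```java
// Illustrative: walk an exception's cause chain down to the root cause, so a
// test can assert on its message even when Spark 2.2 adds wrapper layers.
public class RootCause {
    static Throwable rootCause(Throwable t) {
        Throwable cur = t;
        // Stop at the deepest cause; the self-reference check avoids cycles.
        while (cur.getCause() != null && cur.getCause() != cur) {
            cur = cur.getCause();
        }
        return cur;
    }

    public static void main(String[] args) {
        Exception inner = new IllegalStateException("root failure");
        Exception wrapped = new RuntimeException(new RuntimeException(inner));
        System.out.println(rootCause(wrapped).getMessage()); // prints "root failure"
    }
}
```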



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2

2017-12-08 Thread Nandor Kollar (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-5318:
---
Attachment: PIG-5318_6.patch



[jira] [Commented] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2

2017-12-08 Thread Nandor Kollar (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283255#comment-16283255
 ] 

Nandor Kollar commented on PIG-5318:


[~kellyzly] thanks for the explanation. In this case I think enabling this test 
is fine, and there's no need to check the Spark version, since we don't support 
older Spark versions.



[jira] [Commented] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2

2017-12-07 Thread Nandor Kollar (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16281815#comment-16281815
 ] 

Nandor Kollar commented on PIG-5318:


Attached PIG-5318_5.patch, which includes fixes for the TestAssert, 
TestScalarAliases, TestEvalPipeline2, TestStore and TestStoreLocal test cases, 
but doesn't fix the TestStoreInstances failure. The Spark version is determined 
as Rohini suggested. I also noticed that testKeepGoigFailed (fixed the typo in 
the method name, now testKeepGoingFailed) was excluded from the spark exec 
type; I enabled this test case, since it passed in my environment with Spark 
1.6, 2.1 and 2.2. [~kellyzly] do you remember why this was excluded? The Jira 
it refers to is not yet fixed, yet despite that the test passes with Spark 
1.6.x.

> Unit test failures on Pig on Spark with Spark 2.2
> -
>
> Key: PIG-5318
> URL: https://issues.apache.org/jira/browse/PIG-5318
> Project: Pig
>  Issue Type: Bug
>  Components: spark
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
> Attachments: PIG-5318_1.patch, PIG-5318_2.patch, PIG-5318_3.patch, 
> PIG-5318_4.patch, PIG-5318_5.patch
>
>
> There are several failing cases when executing the unit tests with Spark 2.2:
> {code}
>  org.apache.pig.test.TestAssert#testNegativeWithoutFetch
>  org.apache.pig.test.TestAssert#testNegative
>  org.apache.pig.test.TestEvalPipeline2#testNonStandardDataWithoutFetch
>  org.apache.pig.test.TestScalarAliases#testScalarErrMultipleRowsInInput
>  org.apache.pig.test.TestStore#testCleanupOnFailureMultiStore
>  org.apache.pig.test.TestStoreInstances#testBackendStoreCommunication
>  org.apache.pig.test.TestStoreLocal#testCleanupOnFailureMultiStore
> {code}
> All of these are related to fixes/changes in Spark.
> TestAssert, TestScalarAliases and TestEvalPipeline2 failures could be fixed 
> by asserting on the message of the exception's root cause, looks like on 
> Spark 2.2 the exception is wrapped into an additional layer.
> TestStore and TestStoreLocal failure are also a test related problems: looks 
> like SPARK-7953 is fixed in Spark 2.2
> The root cause of TestStoreInstances is yet to be found out.





[jira] [Updated] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2

2017-12-07 Thread Nandor Kollar (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-5318:
---
Attachment: PIG-5318_5.patch

> Unit test failures on Pig on Spark with Spark 2.2
> -
>
> Key: PIG-5318
> URL: https://issues.apache.org/jira/browse/PIG-5318
> Project: Pig
>  Issue Type: Bug
>  Components: spark
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
> Attachments: PIG-5318_1.patch, PIG-5318_2.patch, PIG-5318_3.patch, 
> PIG-5318_4.patch, PIG-5318_5.patch
>
>
> There are several failing cases when executing the unit tests with Spark 2.2:
> {code}
>  org.apache.pig.test.TestAssert#testNegativeWithoutFetch
>  org.apache.pig.test.TestAssert#testNegative
>  org.apache.pig.test.TestEvalPipeline2#testNonStandardDataWithoutFetch
>  org.apache.pig.test.TestScalarAliases#testScalarErrMultipleRowsInInput
>  org.apache.pig.test.TestStore#testCleanupOnFailureMultiStore
>  org.apache.pig.test.TestStoreInstances#testBackendStoreCommunication
>  org.apache.pig.test.TestStoreLocal#testCleanupOnFailureMultiStore
> {code}
> All of these are related to fixes/changes in Spark.
> TestAssert, TestScalarAliases and TestEvalPipeline2 failures could be fixed 
> by asserting on the message of the exception's root cause, looks like on 
> Spark 2.2 the exception is wrapped into an additional layer.
> TestStore and TestStoreLocal failure are also a test related problems: looks 
> like SPARK-7953 is fixed in Spark 2.2
> The root cause of TestStoreInstances is yet to be found out.





[jira] [Created] (PIG-5319) Investigate why TestStoreInstances fails with Spark 2.2

2017-12-07 Thread Nandor Kollar (JIRA)
Nandor Kollar created PIG-5319:
--

 Summary: Investigate why TestStoreInstances fails with Spark 2.2
 Key: PIG-5319
 URL: https://issues.apache.org/jira/browse/PIG-5319
 Project: Pig
  Issue Type: Bug
  Components: spark
Reporter: Nandor Kollar


The TestStoreInstances unit test fails with Spark 2.2.x. The job and task 
commit logic seems to have changed a lot since Spark 2.1.x: it now looks like 
Spark uses one PigOutputFormat instance when writing to files and a different 
one when getting the OutputCommitters.





[jira] [Commented] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2

2017-12-04 Thread Nandor Kollar (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16276747#comment-16276747
 ] 

Nandor Kollar commented on PIG-5318:


Attached PIG-5318_4.patch: it looks like the way I wanted to determine the 
Spark version doesn't work on Spark 1.x, so I'm using SparkContext#version 
instead.
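A version check driven by the string from SparkContext#version could look like the sketch below. The helper name and the parsing logic are my assumptions; the actual patch may do this differently:

```java
public class SparkVersionCheck {
    // Returns true for Spark 2.2 and later, given a version string such as
    // the one returned by SparkContext#version (e.g. "2.2.0", "1.6.3").
    static boolean isSpark22OrLater(String version) {
        String[] parts = version.split("\\.");
        int major = Integer.parseInt(parts[0]);
        int minor = parts.length > 1 ? Integer.parseInt(parts[1]) : 0;
        return major > 2 || (major == 2 && minor >= 2);
    }

    public static void main(String[] args) {
        System.out.println(isSpark22OrLater("2.2.0")); // true
        System.out.println(isSpark22OrLater("2.1.1")); // false
        System.out.println(isSpark22OrLater("1.6.3")); // false
    }
}
```

Keying on "2.2 and later" rather than a simple `startsWith("2.")` matches the SPARK-7953 discussion in this thread: the abort behaviour only changed in 2.2.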

> Unit test failures on Pig on Spark with Spark 2.2
> -
>
> Key: PIG-5318
> URL: https://issues.apache.org/jira/browse/PIG-5318
> Project: Pig
>  Issue Type: Bug
>  Components: spark
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
> Attachments: PIG-5318_1.patch, PIG-5318_2.patch, PIG-5318_3.patch, 
> PIG-5318_4.patch
>
>
> There are several failing cases when executing the unit tests with Spark 2.2:
> {code}
>  org.apache.pig.test.TestAssert#testNegativeWithoutFetch
>  org.apache.pig.test.TestAssert#testNegative
>  org.apache.pig.test.TestEvalPipeline2#testNonStandardDataWithoutFetch
>  org.apache.pig.test.TestScalarAliases#testScalarErrMultipleRowsInInput
>  org.apache.pig.test.TestStore#testCleanupOnFailureMultiStore
>  org.apache.pig.test.TestStoreInstances#testBackendStoreCommunication
>  org.apache.pig.test.TestStoreLocal#testCleanupOnFailureMultiStore
> {code}
> All of these are related to fixes/changes in Spark.
> TestAssert, TestScalarAliases and TestEvalPipeline2 failures could be fixed 
> by asserting on the message of the exception's root cause, looks like on 
> Spark 2.2 the exception is wrapped into an additional layer.
> TestStore and TestStoreLocal failure are also a test related problems: looks 
> like SPARK-7953 is fixed in Spark 2.2
> The root cause of TestStoreInstances is yet to be found out.





[jira] [Updated] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2

2017-12-04 Thread Nandor Kollar (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-5318:
---
Attachment: PIG-5318_4.patch

> Unit test failures on Pig on Spark with Spark 2.2
> -
>
> Key: PIG-5318
> URL: https://issues.apache.org/jira/browse/PIG-5318
> Project: Pig
>  Issue Type: Bug
>  Components: spark
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
> Attachments: PIG-5318_1.patch, PIG-5318_2.patch, PIG-5318_3.patch, 
> PIG-5318_4.patch
>
>
> There are several failing cases when executing the unit tests with Spark 2.2:
> {code}
>  org.apache.pig.test.TestAssert#testNegativeWithoutFetch
>  org.apache.pig.test.TestAssert#testNegative
>  org.apache.pig.test.TestEvalPipeline2#testNonStandardDataWithoutFetch
>  org.apache.pig.test.TestScalarAliases#testScalarErrMultipleRowsInInput
>  org.apache.pig.test.TestStore#testCleanupOnFailureMultiStore
>  org.apache.pig.test.TestStoreInstances#testBackendStoreCommunication
>  org.apache.pig.test.TestStoreLocal#testCleanupOnFailureMultiStore
> {code}
> All of these are related to fixes/changes in Spark.
> TestAssert, TestScalarAliases and TestEvalPipeline2 failures could be fixed 
> by asserting on the message of the exception's root cause, looks like on 
> Spark 2.2 the exception is wrapped into an additional layer.
> TestStore and TestStoreLocal failure are also a test related problems: looks 
> like SPARK-7953 is fixed in Spark 2.2
> The root cause of TestStoreInstances is yet to be found out.





[jira] [Updated] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2

2017-12-04 Thread Nandor Kollar (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-5318:
---
Attachment: PIG-5318_3.patch

> Unit test failures on Pig on Spark with Spark 2.2
> -
>
> Key: PIG-5318
> URL: https://issues.apache.org/jira/browse/PIG-5318
> Project: Pig
>  Issue Type: Bug
>  Components: spark
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
> Attachments: PIG-5318_1.patch, PIG-5318_2.patch, PIG-5318_3.patch
>
>
> There are several failing cases when executing the unit tests with Spark 2.2:
> {code}
>  org.apache.pig.test.TestAssert#testNegativeWithoutFetch
>  org.apache.pig.test.TestAssert#testNegative
>  org.apache.pig.test.TestEvalPipeline2#testNonStandardDataWithoutFetch
>  org.apache.pig.test.TestScalarAliases#testScalarErrMultipleRowsInInput
>  org.apache.pig.test.TestStore#testCleanupOnFailureMultiStore
>  org.apache.pig.test.TestStoreInstances#testBackendStoreCommunication
>  org.apache.pig.test.TestStoreLocal#testCleanupOnFailureMultiStore
> {code}
> All of these are related to fixes/changes in Spark.
> TestAssert, TestScalarAliases and TestEvalPipeline2 failures could be fixed 
> by asserting on the message of the exception's root cause, looks like on 
> Spark 2.2 the exception is wrapped into an additional layer.
> TestStore and TestStoreLocal failure are also a test related problems: looks 
> like SPARK-7953 is fixed in Spark 2.2
> The root cause of TestStoreInstances is yet to be found out.





[jira] [Commented] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2

2017-12-04 Thread Nandor Kollar (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16276557#comment-16276557
 ] 

Nandor Kollar commented on PIG-5318:


bq. You should just do isSpark2_x (sparkVersion.startsWith("2.")) instead of 
isSpark2_2_x . If Spark 2.3 gets released, then code will have to change.

You're right, but matching on 2.x is not good enough. On Spark 2.1, abortTask 
and abortJob are not called (see SPARK-7953), but this appears to be fixed in 
Spark 2.2. I'll update the patch soon; we should match Spark 2.2+.

bq. Spark should consistently use the same OutputFormat instance in this case

Ok, so I guess this should be a new Jira for Spark. However, Spark 2.2 is 
already released and creates multiple OutputFormat instances as described 
before. Indeed, we shouldn't modify the test case, but how about modifying 
PigOutputFormat, as I did in the patch (making the relevant variables static)?
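The effect of making those variables static can be shown with a small stand-in class. This is a sketch of the general mechanism only; `OutputFormatLike` and its fields are hypothetical, not the real PigOutputFormat members:

```java
import java.util.ArrayList;
import java.util.List;

public class StaticShareDemo {
    static class OutputFormatLike {
        // Instance field: every new instance Spark creates starts empty,
        // so state registered through one instance is lost to the others.
        List<String> instanceStores = new ArrayList<>();
        // Static field: a single list shared across all instances.
        static List<String> sharedStores = new ArrayList<>();
    }

    public static void main(String[] args) {
        OutputFormatLike a = new OutputFormatLike();
        OutputFormatLike b = new OutputFormatLike();
        a.instanceStores.add("store1");
        OutputFormatLike.sharedStores.add("store1");
        // b's instance field misses the registration; the static one sees it.
        System.out.println(b.instanceStores.size());              // 0
        System.out.println(OutputFormatLike.sharedStores.size()); // 1
    }
}
```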

> Unit test failures on Pig on Spark with Spark 2.2
> -
>
> Key: PIG-5318
> URL: https://issues.apache.org/jira/browse/PIG-5318
> Project: Pig
>  Issue Type: Bug
>  Components: spark
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
> Attachments: PIG-5318_1.patch, PIG-5318_2.patch
>
>
> There are several failing cases when executing the unit tests with Spark 2.2:
> {code}
>  org.apache.pig.test.TestAssert#testNegativeWithoutFetch
>  org.apache.pig.test.TestAssert#testNegative
>  org.apache.pig.test.TestEvalPipeline2#testNonStandardDataWithoutFetch
>  org.apache.pig.test.TestScalarAliases#testScalarErrMultipleRowsInInput
>  org.apache.pig.test.TestStore#testCleanupOnFailureMultiStore
>  org.apache.pig.test.TestStoreInstances#testBackendStoreCommunication
>  org.apache.pig.test.TestStoreLocal#testCleanupOnFailureMultiStore
> {code}
> All of these are related to fixes/changes in Spark.
> TestAssert, TestScalarAliases and TestEvalPipeline2 failures could be fixed 
> by asserting on the message of the exception's root cause, looks like on 
> Spark 2.2 the exception is wrapped into an additional layer.
> TestStore and TestStoreLocal failure are also a test related problems: looks 
> like SPARK-7953 is fixed in Spark 2.2
> The root cause of TestStoreInstances is yet to be found out.





[jira] [Commented] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2

2017-12-01 Thread Nandor Kollar (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16274385#comment-16274385
 ] 

Nandor Kollar commented on PIG-5318:


Attached PIG-5318_2.patch, which addresses Rohini's comments.

As for the {{TestStoreInstances}} failure, it looks like Spark (unlike Tez and 
MapReduce) creates multiple instances of {{PigOutputFormat}} while setting up 
the output committers: 
[setupCommitter|https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/internal/io/HadoopMapReduceCommitProtocol.scala#L74]
 is called from both 
[setupJob|https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/internal/io/HadoopMapReduceCommitProtocol.scala#L138]
 and from 
[setupTask|https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/internal/io/HadoopMapReduceCommitProtocol.scala#L165],
 and {{setupCommitter}} creates a new {{PigOutputFormat}} each time, saving it 
in a private variable. In addition, when Spark writes to files, a new 
{{PigOutputFormat}} is [getting 
created|https://github.com/apache/spark/blob/branch-2.2/core/src/main/scala/org/apache/spark/internal/io/SparkHadoopMapReduceWriter.scala#L75]
 too. Since POStores are saved to and deserialized from the configuration, but 
the StoreFuncInterface inside each store is 
[transient|https://github.com/apache/pig/blob/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POStore.java#L53],
 a new instance of {{STFuncCheckInstances}} is created each time, so 
{{putNext}} and {{commitTask}} end up using different array instances. I'm not 
sure whether this is a bug in Pig or in Spark: should Spark consistently use 
the same OutputFormat instance in this case?

Making {{reduceStores}}, {{mapStores}} and {{currentConf}} static inside 
{{TestStoreInstances}} would solve the problem. [~rohini], [~kellyzly], what do 
you think about this solution?

> Unit test failures on Pig on Spark with Spark 2.2
> -
>
> Key: PIG-5318
> URL: https://issues.apache.org/jira/browse/PIG-5318
> Project: Pig
>  Issue Type: Bug
>  Components: spark
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
> Attachments: PIG-5318_1.patch, PIG-5318_2.patch
>
>
> There are several failing cases when executing the unit tests with Spark 2.2:
> {code}
>  org.apache.pig.test.TestAssert#testNegativeWithoutFetch
>  org.apache.pig.test.TestAssert#testNegative
>  org.apache.pig.test.TestEvalPipeline2#testNonStandardDataWithoutFetch
>  org.apache.pig.test.TestScalarAliases#testScalarErrMultipleRowsInInput
>  org.apache.pig.test.TestStore#testCleanupOnFailureMultiStore
>  org.apache.pig.test.TestStoreInstances#testBackendStoreCommunication
>  org.apache.pig.test.TestStoreLocal#testCleanupOnFailureMultiStore
> {code}
> All of these are related to fixes/changes in Spark.
> TestAssert, TestScalarAliases and TestEvalPipeline2 failures could be fixed 
> by asserting on the message of the exception's root cause, looks like on 
> Spark 2.2 the exception is wrapped into an additional layer.
> TestStore and TestStoreLocal failure are also a test related problems: looks 
> like SPARK-7953 is fixed in Spark 2.2
> The root cause of TestStoreInstances is yet to be found out.





[jira] [Updated] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2

2017-12-01 Thread Nandor Kollar (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-5318:
---
Attachment: PIG-5318_2.patch

> Unit test failures on Pig on Spark with Spark 2.2
> -
>
> Key: PIG-5318
> URL: https://issues.apache.org/jira/browse/PIG-5318
> Project: Pig
>  Issue Type: Bug
>  Components: spark
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
> Attachments: PIG-5318_1.patch, PIG-5318_2.patch
>
>
> There are several failing cases when executing the unit tests with Spark 2.2:
> {code}
>  org.apache.pig.test.TestAssert#testNegativeWithoutFetch
>  org.apache.pig.test.TestAssert#testNegative
>  org.apache.pig.test.TestEvalPipeline2#testNonStandardDataWithoutFetch
>  org.apache.pig.test.TestScalarAliases#testScalarErrMultipleRowsInInput
>  org.apache.pig.test.TestStore#testCleanupOnFailureMultiStore
>  org.apache.pig.test.TestStoreInstances#testBackendStoreCommunication
>  org.apache.pig.test.TestStoreLocal#testCleanupOnFailureMultiStore
> {code}
> All of these are related to fixes/changes in Spark.
> TestAssert, TestScalarAliases and TestEvalPipeline2 failures could be fixed 
> by asserting on the message of the exception's root cause, looks like on 
> Spark 2.2 the exception is wrapped into an additional layer.
> TestStore and TestStoreLocal failure are also a test related problems: looks 
> like SPARK-7953 is fixed in Spark 2.2
> The root cause of TestStoreInstances is yet to be found out.





[jira] [Commented] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2

2017-11-30 Thread Nandor Kollar (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16272501#comment-16272501
 ] 

Nandor Kollar commented on PIG-5318:


Thanks [~rohini] and [~kellyzly] for your review!
Hm, I think I now understand the point of TestStoreInstances, and indeed my 
change to that test looks pointless. I'm afraid this might be a bug and not a 
test issue. I'll continue investigating why it is failing and how to fix it; 
so far it looks like commitTask is not called on the correct 
OutputCommitterTestInstances instance, and the array is empty.

> Unit test failures on Pig on Spark with Spark 2.2
> -
>
> Key: PIG-5318
> URL: https://issues.apache.org/jira/browse/PIG-5318
> Project: Pig
>  Issue Type: Bug
>  Components: spark
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
> Attachments: PIG-5318_1.patch
>
>
> There are several failing cases when executing the unit tests with Spark 2.2:
> {code}
>  org.apache.pig.test.TestAssert#testNegativeWithoutFetch
>  org.apache.pig.test.TestAssert#testNegative
>  org.apache.pig.test.TestEvalPipeline2#testNonStandardDataWithoutFetch
>  org.apache.pig.test.TestScalarAliases#testScalarErrMultipleRowsInInput
>  org.apache.pig.test.TestStore#testCleanupOnFailureMultiStore
>  org.apache.pig.test.TestStoreInstances#testBackendStoreCommunication
>  org.apache.pig.test.TestStoreLocal#testCleanupOnFailureMultiStore
> {code}
> All of these are related to fixes/changes in Spark.
> TestAssert, TestScalarAliases and TestEvalPipeline2 failures could be fixed 
> by asserting on the message of the exception's root cause, looks like on 
> Spark 2.2 the exception is wrapped into an additional layer.
> TestStore and TestStoreLocal failure are also a test related problems: looks 
> like SPARK-7953 is fixed in Spark 2.2
> The root cause of TestStoreInstances is yet to be found out.





[jira] [Updated] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2

2017-11-30 Thread Nandor Kollar (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-5318:
---
Attachment: (was: PIG-5318_2.patch)

> Unit test failures on Pig on Spark with Spark 2.2
> -
>
> Key: PIG-5318
> URL: https://issues.apache.org/jira/browse/PIG-5318
> Project: Pig
>  Issue Type: Bug
>  Components: spark
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
> Attachments: PIG-5318_1.patch
>
>
> There are several failing cases when executing the unit tests with Spark 2.2:
> {code}
>  org.apache.pig.test.TestAssert#testNegativeWithoutFetch
>  org.apache.pig.test.TestAssert#testNegative
>  org.apache.pig.test.TestEvalPipeline2#testNonStandardDataWithoutFetch
>  org.apache.pig.test.TestScalarAliases#testScalarErrMultipleRowsInInput
>  org.apache.pig.test.TestStore#testCleanupOnFailureMultiStore
>  org.apache.pig.test.TestStoreInstances#testBackendStoreCommunication
>  org.apache.pig.test.TestStoreLocal#testCleanupOnFailureMultiStore
> {code}
> All of these are related to fixes/changes in Spark.
> TestAssert, TestScalarAliases and TestEvalPipeline2 failures could be fixed 
> by asserting on the message of the exception's root cause, looks like on 
> Spark 2.2 the exception is wrapped into an additional layer.
> TestStore and TestStoreLocal failure are also a test related problems: looks 
> like SPARK-7953 is fixed in Spark 2.2
> The root cause of TestStoreInstances is yet to be found out.





[jira] [Updated] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2

2017-11-30 Thread Nandor Kollar (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-5318:
---
Attachment: PIG-5318_2.patch

> Unit test failures on Pig on Spark with Spark 2.2
> -
>
> Key: PIG-5318
> URL: https://issues.apache.org/jira/browse/PIG-5318
> Project: Pig
>  Issue Type: Bug
>  Components: spark
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
> Attachments: PIG-5318_1.patch, PIG-5318_2.patch
>
>
> There are several failing cases when executing the unit tests with Spark 2.2:
> {code}
>  org.apache.pig.test.TestAssert#testNegativeWithoutFetch
>  org.apache.pig.test.TestAssert#testNegative
>  org.apache.pig.test.TestEvalPipeline2#testNonStandardDataWithoutFetch
>  org.apache.pig.test.TestScalarAliases#testScalarErrMultipleRowsInInput
>  org.apache.pig.test.TestStore#testCleanupOnFailureMultiStore
>  org.apache.pig.test.TestStoreInstances#testBackendStoreCommunication
>  org.apache.pig.test.TestStoreLocal#testCleanupOnFailureMultiStore
> {code}
> All of these are related to fixes/changes in Spark.
> TestAssert, TestScalarAliases and TestEvalPipeline2 failures could be fixed 
> by asserting on the message of the exception's root cause, looks like on 
> Spark 2.2 the exception is wrapped into an additional layer.
> TestStore and TestStoreLocal failure are also a test related problems: looks 
> like SPARK-7953 is fixed in Spark 2.2
> The root cause of TestStoreInstances is yet to be found out.





[jira] [Updated] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2

2017-11-30 Thread Nandor Kollar (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-5318:
---
Attachment: (was: PIG-5318_2.patch)

> Unit test failures on Pig on Spark with Spark 2.2
> -
>
> Key: PIG-5318
> URL: https://issues.apache.org/jira/browse/PIG-5318
> Project: Pig
>  Issue Type: Bug
>  Components: spark
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
> Attachments: PIG-5318_1.patch, PIG-5318_2.patch
>
>
> There are several failing cases when executing the unit tests with Spark 2.2:
> {code}
>  org.apache.pig.test.TestAssert#testNegativeWithoutFetch
>  org.apache.pig.test.TestAssert#testNegative
>  org.apache.pig.test.TestEvalPipeline2#testNonStandardDataWithoutFetch
>  org.apache.pig.test.TestScalarAliases#testScalarErrMultipleRowsInInput
>  org.apache.pig.test.TestStore#testCleanupOnFailureMultiStore
>  org.apache.pig.test.TestStoreInstances#testBackendStoreCommunication
>  org.apache.pig.test.TestStoreLocal#testCleanupOnFailureMultiStore
> {code}
> All of these are related to fixes/changes in Spark.
> TestAssert, TestScalarAliases and TestEvalPipeline2 failures could be fixed 
> by asserting on the message of the exception's root cause, looks like on 
> Spark 2.2 the exception is wrapped into an additional layer.
> TestStore and TestStoreLocal failure are also a test related problems: looks 
> like SPARK-7953 is fixed in Spark 2.2
> The root cause of TestStoreInstances is yet to be found out.





[jira] [Updated] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2

2017-11-30 Thread Nandor Kollar (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-5318:
---
Attachment: PIG-5318_2.patch

> Unit test failures on Pig on Spark with Spark 2.2
> -
>
> Key: PIG-5318
> URL: https://issues.apache.org/jira/browse/PIG-5318
> Project: Pig
>  Issue Type: Bug
>  Components: spark
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
> Attachments: PIG-5318_1.patch, PIG-5318_2.patch
>
>
> There are several failing cases when executing the unit tests with Spark 2.2:
> {code}
>  org.apache.pig.test.TestAssert#testNegativeWithoutFetch
>  org.apache.pig.test.TestAssert#testNegative
>  org.apache.pig.test.TestEvalPipeline2#testNonStandardDataWithoutFetch
>  org.apache.pig.test.TestScalarAliases#testScalarErrMultipleRowsInInput
>  org.apache.pig.test.TestStore#testCleanupOnFailureMultiStore
>  org.apache.pig.test.TestStoreInstances#testBackendStoreCommunication
>  org.apache.pig.test.TestStoreLocal#testCleanupOnFailureMultiStore
> {code}
> All of these are related to fixes/changes in Spark.
> TestAssert, TestScalarAliases and TestEvalPipeline2 failures could be fixed 
> by asserting on the message of the exception's root cause, looks like on 
> Spark 2.2 the exception is wrapped into an additional layer.
> TestStore and TestStoreLocal failure are also a test related problems: looks 
> like SPARK-7953 is fixed in Spark 2.2
> The root cause of TestStoreInstances is yet to be found out.





[jira] [Updated] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2

2017-11-29 Thread Nandor Kollar (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-5318:
---
Description: 
There are several failing cases when executing the unit tests with Spark 2.2:
{code}
 org.apache.pig.test.TestAssert#testNegativeWithoutFetch
 org.apache.pig.test.TestAssert#testNegative
 org.apache.pig.test.TestEvalPipeline2#testNonStandardDataWithoutFetch
 org.apache.pig.test.TestScalarAliases#testScalarErrMultipleRowsInInput
 org.apache.pig.test.TestStore#testCleanupOnFailureMultiStore
 org.apache.pig.test.TestStoreInstances#testBackendStoreCommunication
 org.apache.pig.test.TestStoreLocal#testCleanupOnFailureMultiStore
{code}

All of these are related to fixes/changes in Spark.

The TestAssert, TestScalarAliases and TestEvalPipeline2 failures could be fixed 
by asserting on the message of the exception's root cause; it looks like on 
Spark 2.2 the exception is wrapped into an additional layer.
The TestStore and TestStoreLocal failures are also test-related problems: it 
looks like SPARK-7953 is fixed in Spark 2.2.
The root cause of the TestStoreInstances failure is yet to be found.

  was:
There are sever failing cases when executing the unit tests with Spark 2.2:
{code}
 org.apache.pig.test.TestAssert#testNegativeWithoutFetch
 org.apache.pig.test.TestAssert#testNegative
 org.apache.pig.test.TestEvalPipeline2#testNonStandardDataWithoutFetch
 org.apache.pig.test.TestScalarAliases#testScalarErrMultipleRowsInInput
 org.apache.pig.test.TestStore#testCleanupOnFailureMultiStore
 org.apache.pig.test.TestStoreInstances#testBackendStoreCommunication
 org.apache.pig.test.TestStoreLocal#testCleanupOnFailureMultiStore
{code}

All of these are related to fixes/changes in Spark.

TestAssert, TestScalarAliases and TestEvalPipeline2 failures could be fixed by 
asserting on the message of the exception's root cause, looks like on Spark 2.2 
the exception is wrapped into an additional layer.
TestStore and TestStoreLocal failure are also a test related problems: looks 
like SPARK-7953 is fixed in Spark 2.2
The root cause of TestStoreInstances is yet to be found out.


> Unit test failures on Pig on Spark with Spark 2.2
> -
>
> Key: PIG-5318
> URL: https://issues.apache.org/jira/browse/PIG-5318
> Project: Pig
>  Issue Type: Bug
>  Components: spark
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
> Attachments: PIG-5318_1.patch
>
>
> There are several failing cases when executing the unit tests with Spark 2.2:
> {code}
>  org.apache.pig.test.TestAssert#testNegativeWithoutFetch
>  org.apache.pig.test.TestAssert#testNegative
>  org.apache.pig.test.TestEvalPipeline2#testNonStandardDataWithoutFetch
>  org.apache.pig.test.TestScalarAliases#testScalarErrMultipleRowsInInput
>  org.apache.pig.test.TestStore#testCleanupOnFailureMultiStore
>  org.apache.pig.test.TestStoreInstances#testBackendStoreCommunication
>  org.apache.pig.test.TestStoreLocal#testCleanupOnFailureMultiStore
> {code}
> All of these are related to fixes/changes in Spark.
> TestAssert, TestScalarAliases and TestEvalPipeline2 failures could be fixed 
> by asserting on the message of the exception's root cause, looks like on 
> Spark 2.2 the exception is wrapped into an additional layer.
> TestStore and TestStoreLocal failure are also a test related problems: looks 
> like SPARK-7953 is fixed in Spark 2.2
> The root cause of TestStoreInstances is yet to be found out.





[jira] [Commented] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2

2017-11-29 Thread Nandor Kollar (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16270990#comment-16270990
 ] 

Nandor Kollar commented on PIG-5318:


[~szita], [~kellyzly], could you please review?

> Unit test failures on Pig on Spark with Spark 2.2
> -
>
> Key: PIG-5318
> URL: https://issues.apache.org/jira/browse/PIG-5318
> Project: Pig
>  Issue Type: Bug
>  Components: spark
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
> Attachments: PIG-5318_1.patch
>
>
> There are sever failing cases when executing the unit tests with Spark 2.2:
> {code}
>  org.apache.pig.test.TestAssert#testNegativeWithoutFetch
>  org.apache.pig.test.TestAssert#testNegative
>  org.apache.pig.test.TestEvalPipeline2#testNonStandardDataWithoutFetch
>  org.apache.pig.test.TestScalarAliases#testScalarErrMultipleRowsInInput
>  org.apache.pig.test.TestStore#testCleanupOnFailureMultiStore
>  org.apache.pig.test.TestStoreInstances#testBackendStoreCommunication
>  org.apache.pig.test.TestStoreLocal#testCleanupOnFailureMultiStore
> {code}
> All of these are related to fixes/changes in Spark.
> TestAssert, TestScalarAliases and TestEvalPipeline2 failures could be fixed 
> by asserting on the message of the exception's root cause, looks like on 
> Spark 2.2 the exception is wrapped into an additional layer.
> TestStore and TestStoreLocal failure are also a test related problems: looks 
> like SPARK-7953 is fixed in Spark 2.2
> The root cause of TestStoreInstances is yet to be found out.





[jira] [Updated] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2

2017-11-29 Thread Nandor Kollar (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-5318:
---
Status: Patch Available  (was: Open)

> Unit test failures on Pig on Spark with Spark 2.2
> -
>
> Key: PIG-5318
> URL: https://issues.apache.org/jira/browse/PIG-5318
> Project: Pig
>  Issue Type: Bug
>  Components: spark
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
> Attachments: PIG-5318_1.patch
>
>
> There are sever failing cases when executing the unit tests with Spark 2.2:
> {code}
>  org.apache.pig.test.TestAssert#testNegativeWithoutFetch
>  org.apache.pig.test.TestAssert#testNegative
>  org.apache.pig.test.TestEvalPipeline2#testNonStandardDataWithoutFetch
>  org.apache.pig.test.TestScalarAliases#testScalarErrMultipleRowsInInput
>  org.apache.pig.test.TestStore#testCleanupOnFailureMultiStore
>  org.apache.pig.test.TestStoreInstances#testBackendStoreCommunication
>  org.apache.pig.test.TestStoreLocal#testCleanupOnFailureMultiStore
> {code}
> All of these are related to fixes/changes in Spark.
> TestAssert, TestScalarAliases and TestEvalPipeline2 failures could be fixed 
> by asserting on the message of the exception's root cause; it looks like on 
> Spark 2.2 the exception is wrapped in an additional layer.
> The TestStore and TestStoreLocal failures are also test-related problems: it 
> looks like SPARK-7953 is fixed in Spark 2.2.
> The root cause of the TestStoreInstances failure is yet to be determined.





[jira] [Updated] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2

2017-11-29 Thread Nandor Kollar (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-5318:
---
Attachment: PIG-5318_1.patch

> Unit test failures on Pig on Spark with Spark 2.2
> -
>
> Key: PIG-5318
> URL: https://issues.apache.org/jira/browse/PIG-5318
> Project: Pig
>  Issue Type: Bug
>  Components: spark
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
> Attachments: PIG-5318_1.patch
>
>
> There are several failing cases when executing the unit tests with Spark 2.2:
> {code}
>  org.apache.pig.test.TestAssert#testNegativeWithoutFetch
>  org.apache.pig.test.TestAssert#testNegative
>  org.apache.pig.test.TestEvalPipeline2#testNonStandardDataWithoutFetch
>  org.apache.pig.test.TestScalarAliases#testScalarErrMultipleRowsInInput
>  org.apache.pig.test.TestStore#testCleanupOnFailureMultiStore
>  org.apache.pig.test.TestStoreInstances#testBackendStoreCommunication
>  org.apache.pig.test.TestStoreLocal#testCleanupOnFailureMultiStore
> {code}
> All of these are related to fixes/changes in Spark.
> TestAssert, TestScalarAliases and TestEvalPipeline2 failures could be fixed 
> by asserting on the message of the exception's root cause; it looks like on 
> Spark 2.2 the exception is wrapped in an additional layer.
> The TestStore and TestStoreLocal failures are also test-related problems: it 
> looks like SPARK-7953 is fixed in Spark 2.2.
> The root cause of the TestStoreInstances failure is yet to be determined.





[jira] [Updated] (PIG-5316) Initialize mapred.task.id property for PoS jobs

2017-11-28 Thread Nandor Kollar (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-5316:
---
Attachment: PIG-5316_2.patch

> Initialize mapred.task.id property for PoS jobs
> ---
>
> Key: PIG-5316
> URL: https://issues.apache.org/jira/browse/PIG-5316
> Project: Pig
>  Issue Type: Improvement
>  Components: spark
>Reporter: Adam Szita
>Assignee: Nandor Kollar
> Fix For: 0.18.0
>
> Attachments: PIG-5316_1.patch, PIG-5316_2.patch
>
>
> Some downstream systems may require the presence of {{mapred.task.id}} 
> property (e.g. HCatalog). This is currently not set when Pig On Spark jobs 
> are started. Let's initialise it.





[jira] [Updated] (PIG-5316) Initialize mapred.task.id property for PoS jobs

2017-11-28 Thread Nandor Kollar (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-5316:
---
Attachment: (was: PIG-5316_2.patch)

> Initialize mapred.task.id property for PoS jobs
> ---
>
> Key: PIG-5316
> URL: https://issues.apache.org/jira/browse/PIG-5316
> Project: Pig
>  Issue Type: Improvement
>  Components: spark
>Reporter: Adam Szita
>Assignee: Nandor Kollar
> Fix For: 0.18.0
>
> Attachments: PIG-5316_1.patch, PIG-5316_2.patch
>
>
> Some downstream systems may require the presence of {{mapred.task.id}} 
> property (e.g. HCatalog). This is currently not set when Pig On Spark jobs 
> are started. Let's initialise it.





[jira] [Created] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2

2017-11-28 Thread Nandor Kollar (JIRA)
Nandor Kollar created PIG-5318:
--

 Summary: Unit test failures on Pig on Spark with Spark 2.2
 Key: PIG-5318
 URL: https://issues.apache.org/jira/browse/PIG-5318
 Project: Pig
  Issue Type: Bug
  Components: spark
Reporter: Nandor Kollar
Assignee: Nandor Kollar


There are several failing cases when executing the unit tests with Spark 2.2:
{code}
 org.apache.pig.test.TestAssert#testNegativeWithoutFetch
 org.apache.pig.test.TestAssert#testNegative
 org.apache.pig.test.TestEvalPipeline2#testNonStandardDataWithoutFetch
 org.apache.pig.test.TestScalarAliases#testScalarErrMultipleRowsInInput
 org.apache.pig.test.TestStore#testCleanupOnFailureMultiStore
 org.apache.pig.test.TestStoreInstances#testBackendStoreCommunication
 org.apache.pig.test.TestStoreLocal#testCleanupOnFailureMultiStore
{code}

All of these are related to fixes/changes in Spark.

TestAssert, TestScalarAliases and TestEvalPipeline2 failures could be fixed by 
asserting on the message of the exception's root cause; it looks like on Spark 2.2 
the exception is wrapped in an additional layer.
The TestStore and TestStoreLocal failures are also test-related problems: it looks 
like SPARK-7953 is fixed in Spark 2.2.
The root cause of the TestStoreInstances failure is yet to be determined.
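To illustrate the proposed TestAssert/TestScalarAliases/TestEvalPipeline2 fix: if the tests currently assert on the outer exception's message, the extra wrapping layer added on Spark 2.2 breaks them, whereas walking to the root cause is robust to the wrapping depth. The sketch below is a self-contained stand-in (the helper mirrors what e.g. commons-lang's ExceptionUtils.getRootCause does), not the actual patch:

```java
public class RootCauseDemo {

    // Walk the cause chain to its end. Hypothetical helper standing in for
    // commons-lang's ExceptionUtils.getRootCause; guards against self-referential
    // cause links to avoid an infinite loop.
    static Throwable getRootCause(Throwable t) {
        Throwable cause = t;
        while (cause.getCause() != null && cause.getCause() != cause) {
            cause = cause.getCause();
        }
        return cause;
    }

    public static void main(String[] args) {
        // Simulate Spark 2.2 wrapping the original failure in an extra layer.
        Exception original = new RuntimeException("Job failed: assertion violated");
        Exception wrapped = new RuntimeException("org.apache.spark.SparkException", original);

        // Asserting on wrapped.getMessage() would now see the Spark wrapper;
        // asserting on the root cause's message still sees the original text.
        System.out.println(getRootCause(wrapped).getMessage());
    }
}
```

The same assertion then passes regardless of how many wrapper layers the Spark version adds.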





[jira] [Updated] (PIG-5316) Initialize mapred.task.id property for PoS jobs

2017-11-28 Thread Nandor Kollar (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-5316:
---
Status: Patch Available  (was: Reopened)

> Initialize mapred.task.id property for PoS jobs
> ---
>
> Key: PIG-5316
> URL: https://issues.apache.org/jira/browse/PIG-5316
> Project: Pig
>  Issue Type: Improvement
>  Components: spark
>Reporter: Adam Szita
>Assignee: Nandor Kollar
> Fix For: 0.18.0
>
> Attachments: PIG-5316_1.patch, PIG-5316_2.patch
>
>
> Some downstream systems may require the presence of {{mapred.task.id}} 
> property (e.g. HCatalog). This is currently not set when Pig On Spark jobs 
> are started. Let's initialise it.





[jira] [Commented] (PIG-5316) Initialize mapred.task.id property for PoS jobs

2017-11-28 Thread Nandor Kollar (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268832#comment-16268832
 ] 

Nandor Kollar commented on PIG-5316:


Looks like we can't create a TaskAttemptID with the default constructor; we should 
use the one that takes parameters to avoid NPEs on Hadoop 2.x. Using 
HadoopShims#getNewTaskAttemptID should solve this; attached PIG-5316_2.patch.
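For a rough picture of what initializing {{mapred.task.id}} before launching a Pig on Spark job looks like: the real code would set the property on the job's JobConf using the id returned by HadoopShims#getNewTaskAttemptID. The sketch below uses java.util.Properties as a stand-in for the configuration and hand-builds a Hadoop-style attempt id string so it stands alone (the jtIdentifier and job number are made-up values):

```java
import java.util.Properties;

public class TaskIdInitDemo {

    // Build a Hadoop-style task attempt id string, e.g.
    // attempt_<jtIdentifier>_<jobId>_m_<taskId>_<attempt>. In Pig this would
    // come from HadoopShims.getNewTaskAttemptID() rather than the TaskAttemptID
    // default constructor, which leaves fields unset and NPEs on Hadoop 2.x.
    static String newTaskAttemptId(String jtIdentifier, int jobId) {
        return String.format("attempt_%s_%04d_m_%06d_%d", jtIdentifier, jobId, 0, 0);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();  // stand-in for the job's JobConf
        // Downstream consumers such as HCatalog read this property, so it must
        // be populated before the job starts.
        conf.setProperty("mapred.task.id", newTaskAttemptId("201711280000", 1));
        System.out.println(conf.getProperty("mapred.task.id"));
    }
}
```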

> Initialize mapred.task.id property for PoS jobs
> ---
>
> Key: PIG-5316
> URL: https://issues.apache.org/jira/browse/PIG-5316
> Project: Pig
>  Issue Type: Improvement
>  Components: spark
>Reporter: Adam Szita
>Assignee: Nandor Kollar
> Fix For: 0.18.0
>
> Attachments: PIG-5316_1.patch, PIG-5316_2.patch
>
>
> Some downstream systems may require the presence of {{mapred.task.id}} 
> property (e.g. HCatalog). This is currently not set when Pig On Spark jobs 
> are started. Let's initialise it.





[jira] [Updated] (PIG-5316) Initialize mapred.task.id property for PoS jobs

2017-11-28 Thread Nandor Kollar (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-5316:
---
Attachment: PIG-5316_2.patch

> Initialize mapred.task.id property for PoS jobs
> ---
>
> Key: PIG-5316
> URL: https://issues.apache.org/jira/browse/PIG-5316
> Project: Pig
>  Issue Type: Improvement
>  Components: spark
>Reporter: Adam Szita
>Assignee: Nandor Kollar
> Fix For: 0.18.0
>
> Attachments: PIG-5316_1.patch, PIG-5316_2.patch
>
>
> Some downstream systems may require the presence of {{mapred.task.id}} 
> property (e.g. HCatalog). This is currently not set when Pig On Spark jobs 
> are started. Let's initialise it.




