[jira] [Commented] (PIG-5387) Test failures on JRE 11
[ https://issues.apache.org/jira/browse/PIG-5387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16843746#comment-16843746 ] Adam Szita commented on PIG-5387: - Thanks Nandor, +1 for [^PIG-5387_3.patch]. [~rohini], [~knoguchi], any objections?
> Test failures on JRE 11
> ---
>
> Key: PIG-5387
> URL: https://issues.apache.org/jira/browse/PIG-5387
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.17.0
> Reporter: Nandor Kollar
> Assignee: Nandor Kollar
> Priority: Major
> Attachments: PIG-5387_1.patch, PIG-5387_2.patch, PIG-5387_3.patch
>
> I tried to compile Pig with JDK 8 and execute the tests with Java 11, and faced several test failures. For example TestCommit#testCheckin2 failed with the following exception:
> {code}
> 2019-05-08 16:06:09,712 WARN [Thread-108] mapred.LocalJobRunner (LocalJobRunner.java:run(590)) - job_local1000317333_0003
> java.lang.Exception: java.io.IOException: Deserialization error: null
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492)
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:552)
> Caused by: java.io.IOException: Deserialization error: null
>     at org.apache.pig.impl.util.ObjectSerializer.deserialize(ObjectSerializer.java:62)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.setup(PigGenericMapBase.java:183)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
>     at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
>     at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>     at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>     at java.base/java.lang.Thread.run(Thread.java:834)
> Caused by: java.lang.NullPointerException
>     at org.apache.pig.impl.plan.Operator.hashCode(Operator.java:106)
>     at java.base/java.util.HashMap.hash(HashMap.java:339)
>     at java.base/java.util.HashMap.readObject(HashMap.java:1461)
>     at java.base/jdk.internal.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
>     at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>     at java.base/java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1160)
> {code}
> The deserialization of one of the map plans failed; it appears we ran into [JDK-8201131|https://bugs.openjdk.java.net/browse/JDK-8201131]. It seems that the workaround in the issue report works: adding a readObject method to org.apache.pig.impl.plan.Operator:
> {code}
> private void readObject(ObjectInputStream in) throws ClassNotFoundException, IOException {
>     in.defaultReadObject();
> }
> {code}
> solves the problem, however I'm not sure that this is the optimal solution.
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
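The workaround above can be sketched in isolation. The class below is a purely illustrative stand-in, not Pig's actual Operator hierarchy; the ticket reports that merely declaring a private readObject that delegates to defaultReadObject() avoids the JVM bug:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Illustrative stand-in for org.apache.pig.impl.plan.Operator: a serializable
// class whose hashCode() depends on instance state and which is used as a
// HashMap key, mirroring the failing scenario in the stack trace.
class PlanNode implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String name;

    PlanNode(String name) { this.name = name; }

    @Override public int hashCode() { return name.hashCode(); }
    @Override public boolean equals(Object o) {
        return o instanceof PlanNode && ((PlanNode) o).name.equals(name);
    }

    // The workaround from the ticket: an explicit readObject that just
    // performs default deserialization.
    private void readObject(ObjectInputStream in)
            throws ClassNotFoundException, IOException {
        in.defaultReadObject();
    }
}

public class ReadObjectWorkaroundDemo {
    // Serialize a map keyed by PlanNode and read it back; checked exceptions
    // are wrapped so the helper is easy to call.
    static Map<PlanNode, String> roundTrip(Map<PlanNode, String> plan) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
                oos.writeObject(plan);
            }
            try (ObjectInputStream ois = new ObjectInputStream(
                    new ByteArrayInputStream(bos.toByteArray()))) {
                @SuppressWarnings("unchecked")
                Map<PlanNode, String> copy = (Map<PlanNode, String>) ois.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        Map<PlanNode, String> plan = new HashMap<>();
        plan.put(new PlanNode("load"), "LOAD");
        plan.put(new PlanNode("store"), "STORE");
        System.out.println(roundTrip(plan).get(new PlanNode("load"))); // prints LOAD
    }
}
```

With the readObject method in place the round trip succeeds on the JVMs affected by JDK-8201131 as well; without it, HashMap.readObject can end up calling hashCode() on a not-yet-restored key, producing the NullPointerException shown above.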
[jira] [Commented] (PIG-5387) Test failures on JRE 11
[ https://issues.apache.org/jira/browse/PIG-5387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16841368#comment-16841368 ] Adam Szita commented on PIG-5387: - So as far as I understand:
* in the testing part you are injecting a URLClassLoader if we're running Java 11. Quite hacky, but not worse than the current implementation, which uses reflection to invoke addURL on URLClassLoader;
* in the Operator class I think using readObject like this will be fine, but please add javadoc or comments to the readObject method that include a reference to the JVM bug. We don't want people to remove it by mistake :)
With this in mind, [^PIG-5387_2.patch] looks good to me.
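For context on the test-injection hack discussed in the comment above: on Java 8 the system class loader happens to be a URLClassLoader, so tests could extend the classpath by reflectively calling its protected addURL; on Java 9+ the system loader is a different type, so the reflective trick fails and a fresh URLClassLoader must be created instead. A minimal, hypothetical sketch (not Pig's actual test code):

```java
import java.net.URL;
import java.net.URLClassLoader;

public class ClasspathInjectionDemo {
    // Build a child URLClassLoader over the given URLs; anything not found
    // there is delegated to the parent loader as usual.
    static URLClassLoader childLoader(URL[] extraUrls) {
        return new URLClassLoader(extraUrls, ClasspathInjectionDemo.class.getClassLoader());
    }

    // Convenience check: can the given class be loaded through a child loader?
    static boolean canLoad(String className) {
        try (URLClassLoader cl = childLoader(new URL[0])) {
            cl.loadClass(className);
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // On Java 9+ this typically prints false, which is why the reflective
        // addURL hack stops working there.
        System.out.println("system loader is URLClassLoader: "
                + (ClassLoader.getSystemClassLoader() instanceof URLClassLoader));
        System.out.println(canLoad("java.util.ArrayList"));
    }
}
```

In real test code the URL array would point at the extra jars or class directories the test needs on its classpath.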
[jira] [Assigned] (PIG-5387) Test failures on JRE 11
[ https://issues.apache.org/jira/browse/PIG-5387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita reassigned PIG-5387: --- Assignee: Nandor Kollar
[jira] [Updated] (PIG-5374) Use CircularFifoBuffer in InterRecordReader
[ https://issues.apache.org/jira/browse/PIG-5374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita updated PIG-5374: Resolution: Fixed Fix Version/s: 0.18.0 Status: Resolved (was: Patch Available)
> Use CircularFifoBuffer in InterRecordReader
> ---
>
> Key: PIG-5374
> URL: https://issues.apache.org/jira/browse/PIG-5374
> Project: Pig
> Issue Type: Bug
> Reporter: Adam Szita
> Assignee: Adam Szita
> Priority: Major
> Fix For: 0.18.0
>
> Attachments: PIG-5374.0.patch
>
> We're currently using CircularFifoQueue in InterRecordReader, which comes from the commons-collections4 dependency. Hadoop 2.8 installations do not have this dependency by default, so for now we should switch to the older CircularFifoBuffer instead (which comes from commons-collections and is present).
> We should open a separate ticket to investigate which libraries we should update.
[jira] [Commented] (PIG-5374) Use CircularFifoBuffer in InterRecordReader
[ https://issues.apache.org/jira/browse/PIG-5374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16736987#comment-16736987 ] Adam Szita commented on PIG-5374: - Thanks Nandor, committed to trunk.
[jira] [Updated] (PIG-5374) Use CircularFifoBuffer in InterRecordReader
[ https://issues.apache.org/jira/browse/PIG-5374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita updated PIG-5374: Attachment: PIG-5374.0.patch
[jira] [Created] (PIG-5374) Use CircularFifoBuffer in InterRecordReader
Adam Szita created PIG-5374: --- Summary: Use CircularFifoBuffer in InterRecordReader Key: PIG-5374 URL: https://issues.apache.org/jira/browse/PIG-5374 Project: Pig Issue Type: Bug Reporter: Adam Szita Assignee: Adam Szita
We're currently using CircularFifoQueue in InterRecordReader, which comes from the commons-collections4 dependency. Hadoop 2.8 installations do not have this dependency by default, so for now we should switch to the older CircularFifoBuffer instead (which comes from commons-collections and is present).
We should open a separate ticket to investigate which libraries we should update.
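Both CircularFifoBuffer (commons-collections) and CircularFifoQueue (commons-collections4) are fixed-capacity FIFOs that silently evict the oldest element when full, which is the behavior InterRecordReader relies on to keep the last few bytes seen. A dependency-free sketch of that behavior (illustrative, not the commons implementation):

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Minimal fixed-capacity FIFO: adding to a full buffer drops the oldest entry,
// mirroring CircularFifoBuffer / CircularFifoQueue semantics.
public class BoundedFifo<E> {
    private final Deque<E> deque = new ArrayDeque<>();
    private final int capacity;

    public BoundedFifo(int capacity) { this.capacity = capacity; }

    public void add(E e) {
        if (deque.size() == capacity) {
            deque.pollFirst(); // evict oldest element to make room
        }
        deque.addLast(e);
    }

    public E peekOldest() { return deque.peekFirst(); }

    public int size() { return deque.size(); }

    public static void main(String[] args) {
        BoundedFifo<Integer> fifo = new BoundedFifo<>(3);
        fifo.add(1); fifo.add(2); fifo.add(3);
        fifo.add(4); // evicts 1
        System.out.println(fifo.peekOldest()); // prints 2
    }
}
```

Because both commons classes expose this same eviction contract, swapping one for the other in InterRecordReader is a drop-in dependency change rather than a logic change.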
[jira] [Commented] (PIG-5362) Parameter substitution of shell cmd results doesn't handle backslash
[ https://issues.apache.org/jira/browse/PIG-5362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16735845#comment-16735845 ] Adam Szita commented on PIG-5362: - Hi [~wla...@yahoo-inc.com], is there any update on fixing the failing tests?
> Parameter substitution of shell cmd results doesn't handle backslash
> ---
>
> Key: PIG-5362
> URL: https://issues.apache.org/jira/browse/PIG-5362
> Project: Pig
> Issue Type: Bug
> Components: parser
> Reporter: Will Lauer
> Assignee: Will Lauer
> Priority: Minor
> Fix For: 0.18.0
>
> Attachments: pig.patch, pig2.patch, pig3.patch, pig4.patch, pig5.patch, test-failure.txt
>
> It looks like there is a bug in how parameter substitution is handled in PreprocessorContext.java that causes parameter values containing backslashes to not be processed correctly, resulting in the backslashes being lost. For example, if you had the following:
> {code:java}
> %DECLARE A `echo \$foo\\bar`
> B = LOAD $A
> {code}
> You would expect the echo command to produce the output {{$foo\bar}}, but the actual value that gets substituted is {{\$foobar}}. This is happening because the {{substitute}} method in PreprocessorContext.java uses a regular expression replacement instead of a basic string substitution, and $ and \ are special characters in a replacement string. The code attempts to escape $, but does not escape backslash.
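The root cause described above is easy to reproduce with java.util.regex: in a regex replacement string both $ and \ are special, so a substituted value containing backslashes must be quoted with Matcher.quoteReplacement (or substituted with plain string methods). A small sketch; the method names and the "$A" convention here are illustrative, not Pig's actual PreprocessorContext code:

```java
import java.util.regex.Matcher;

public class SubstitutionDemo {
    // Buggy variant: the value is fed straight into replaceAll as a regex
    // replacement string, so a "\x" sequence in it collapses to plain "x".
    static String naive(String script, String param, String value) {
        return script.replaceAll("\\$" + param, value);
    }

    // Safe variant: quote the replacement so $ and \ are taken literally.
    static String safe(String script, String param, String value) {
        return script.replaceAll("\\$" + param, Matcher.quoteReplacement(value));
    }

    public static void main(String[] args) {
        String script = "B = LOAD $A";
        String value = "a\\b"; // the substituted value contains a backslash: a\b
        System.out.println(naive(script, "A", value)); // prints B = LOAD ab  (backslash lost)
        System.out.println(safe(script, "A", value));  // prints B = LOAD a\b
    }
}
```

The same effect explains the ticket: escaping only $ in the replacement leaves every backslash to be consumed by the regex engine.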
[jira] [Updated] (PIG-5373) InterRecordReader might skip records if certain sync markers are used
[ https://issues.apache.org/jira/browse/PIG-5373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita updated PIG-5373: Resolution: Fixed Fix Version/s: 0.18.0 Status: Resolved (was: Patch Available)
> InterRecordReader might skip records if certain sync markers are used
> ---
>
> Key: PIG-5373
> URL: https://issues.apache.org/jira/browse/PIG-5373
> Project: Pig
> Issue Type: Bug
> Reporter: Adam Szita
> Assignee: Adam Szita
> Priority: Major
> Fix For: 0.18.0
>
> Attachments: PIG-5373.0.patch, PIG-5373.1.patch
>
> Due to a bug in InterRecordReader#skipUntilMarkerOrSplitEndOrEOF(), it can happen that sync markers are not identified while reading the interim binary file used to hold data between jobs.
> In such files, sync markers are placed upon writing and later guide reading the data back. The markers are randomly generated, and it seems that in some rare combinations of a marker and the data preceding it, the marker cannot be found. This can result in reading through all the bytes (looking for the marker), reaching the split end or EOF, and extracting no records at all.
> This symptom is also observable from JobHistory stats: a task of an affected job will have HDFS_BYTES_READ or FILE_BYTES_READ about equal to the number of bytes in the split, but at the same time MAP_INPUT_RECORDS=0.
> One such (test) example is this:
> {code:java}
> marker: [-128, -128, 4] , data: [127, -1, 2, -128, -128, -128, 4, 1, 2, 3]
> {code}
> Due to the bug, markers whose prefix overlaps with the preceding data chunk are not seen by the reader.
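The scan described above amounts to matching the marker against the byte stream while keeping a window of the last marker-length bytes; a scanner that resets its match position on mismatch, instead of re-checking the window, misses markers whose prefix overlaps the preceding data, as in the [-128, -128, 4] example. A correct window-based sketch (illustrative, not the actual InterRecordReader code):

```java
import java.util.Arrays;

public class MarkerScanDemo {
    // Scan 'data' for 'marker' using a sliding window of the last
    // marker.length bytes, so overlapping prefixes are never skipped.
    // Returns the offset just past the marker, or -1 if it is absent.
    static int skipUntilMarker(byte[] data, byte[] marker) {
        byte[] window = new byte[marker.length];
        for (int i = 0; i < data.length; i++) {
            // shift the window left by one and append the new byte
            System.arraycopy(window, 1, window, 0, window.length - 1);
            window[window.length - 1] = data[i];
            if (i + 1 >= marker.length && Arrays.equals(window, marker)) {
                return i + 1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // the example from the ticket: the marker's prefix overlaps the
        // -128 that precedes it in the data
        byte[] marker = {-128, -128, 4};
        byte[] data = {127, -1, 2, -128, -128, -128, 4, 1, 2, 3};
        System.out.println(skipUntilMarker(data, marker)); // prints 7
    }
}
```

Keeping the last N bytes in a FIFO window is also why the fix leans on a circular buffer (see PIG-5374 below): every incoming byte evicts the oldest one, and the full window is compared against the marker each time.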
[jira] [Commented] (PIG-5373) InterRecordReader might skip records if certain sync markers are used
[ https://issues.apache.org/jira/browse/PIG-5373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16733171#comment-16733171 ] Adam Szita commented on PIG-5373: - Committed to trunk, thanks a lot for reviewing Nandor!
[jira] [Updated] (PIG-5373) InterRecordReader might skip records if certain sync markers are used
[ https://issues.apache.org/jira/browse/PIG-5373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita updated PIG-5373: Attachment: PIG-5373.1.patch
[jira] [Updated] (PIG-5373) InterRecordReader might skip records if certain sync markers are used
[ https://issues.apache.org/jira/browse/PIG-5373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita updated PIG-5373: Attachment: (was: PIG-5373.1.patch)
[jira] [Commented] (PIG-5373) InterRecordReader might skip records if certain sync markers are used
[ https://issues.apache.org/jira/browse/PIG-5373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732852#comment-16732852 ] Adam Szita commented on PIG-5373: - Thanks for taking a look [~nkollar], I've uploaded a new patch that uses CircularFifoQueue from commons-collections4.
[jira] [Updated] (PIG-5373) InterRecordReader might skip records if certain sync markers are used
[ https://issues.apache.org/jira/browse/PIG-5373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita updated PIG-5373: Attachment: PIG-5373.1.patch
[jira] [Commented] (PIG-5373) InterRecordReader might skip records if certain sync markers are used
[ https://issues.apache.org/jira/browse/PIG-5373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728019#comment-16728019 ] Adam Szita commented on PIG-5373: - [~rohini], right, it is not even released yet, so I'll just leave it blank then.
[jira] [Updated] (PIG-5373) InterRecordReader might skip records if certain sync markers are used
[ https://issues.apache.org/jira/browse/PIG-5373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita updated PIG-5373: Affects Version/s: (was: 0.17.0)
[jira] [Comment Edited] (PIG-5371) Hdfs bytes written assertions fail in TestPigRunner
[ https://issues.apache.org/jira/browse/PIG-5371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725861#comment-16725861 ] Adam Szita edited comment on PIG-5371 at 12/20/18 1:56 PM: --- Hi [~abstractdog], yeah, sorry, that's a typo; indeed -Dtestcase should be used. The test doesn't hang on my side, it finishes successfully in 9 minutes. Do you see which test method completes on your side and which one doesn't? In the past, when I faced the hanging issue, it was because my Mac's HDD had over 90% utilisation, which some HDFS code in MiniCluster did not like.
was (Author: szita): Hi [~abstractdog], yeah, sorry, that's a typo; indeed -Dtestcase should be used. It doesn't hang on my side, it finishes successfully in 9 minutes. Do you see which test method completes on your side and which one doesn't? In the past, when I faced the hanging issue, it was because my Mac's HDD had over 90% utilisation, which some HDFS code in MiniCluster did not like.
> Hdfs bytes written assertions fail in TestPigRunner
> ---
>
> Key: PIG-5371
> URL: https://issues.apache.org/jira/browse/PIG-5371
> Project: Pig
> Issue Type: Bug
> Reporter: Laszlo Bodor
> Assignee: Laszlo Bodor
> Priority: Major
> Attachments: PIG-5371.01.patch, simpleTest.out
>
> Attached [^simpleTest.out]. It seems that the HDFS counter 'HDFS_BYTES_WRITTEN' returns the byte count not only for the result of the Pig store operator, but also includes the size of the uploaded jar files. The problem is that this could change very easily, so in my opinion the best option would be to remove these assertions from TestPigRunner, as they are just causing intermittent and/or persistent failures.
> The test class is for basic testing of PigRunner, and this is achieved well enough without the asserts.
> {code}
> 2018-11-23 10:14:52,661 [IPC Server handler 5 on 54929] INFO org.apache.hadoop.hdfs.StateChange - BLOCK* allocate blk_1073741827_1003, replicas=127.0.0.1:54934, 127.0.0.1:54930, 127.0.0.1:54943 for /tmp/temp-157262781/tmp-1057655772/automaton-1.11-8.jar
> ...
> 2018-11-23 10:14:52,735 [PacketResponder: BP-26001448-10.200.50.195-1542964474138:blk_1073741827_1003, type=HAS_DOWNSTREAM_IN_PIPELINE, downstreams=2:[127.0.0.1:54930, 127.0.0.1:54943]] INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace - src: /127.0.0.1:54978, dest: /127.0.0.1:54934, bytes: 176285, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_-1959727442_1, offset: 0, srvID: 108c4000-1ae0-402e-82cf-bf403629c0f7, blockid: BP-26001448-10.200.50.195-1542964474138:blk_1073741827_1003, duration(ns): 57162859
> {code}
[jira] [Commented] (PIG-5371) Hdfs bytes written assertions fail in TestPigRunner
[ https://issues.apache.org/jira/browse/PIG-5371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725861#comment-16725861 ] Adam Szita commented on PIG-5371: - Hi [~abstractdog], Yeah, sorry, that's a typo; indeed -Dtestcase should be used. The test doesn't hang on my side, it finishes successfully in 9 minutes. Do you see which test method completes on your side and which one doesn't? In the past, when I faced the hanging issue, it was because my Mac's HDD had over 90% utilisation, which some HDFS code in MiniCluster did not like > Hdfs bytes written assertions fail in TestPigRunner > --- > > Key: PIG-5371 > URL: https://issues.apache.org/jira/browse/PIG-5371 > Project: Pig > Issue Type: Bug >Reporter: Laszlo Bodor >Assignee: Laszlo Bodor >Priority: Major > Attachments: PIG-5371.01.patch, simpleTest.out > > > Attached [^simpleTest.out]. It seems like the HDFS counter 'HDFS_BYTES_WRITTEN' > returns the byte count not only for the result of the Pig store operator, but > includes the size of the jar files as well. The problem is that this could > change very easily, so in my opinion the best option would be to remove these > assertions from TestPigRunner, as they just cause intermittent and/or > persistent failures. > The test class is for basic testing of PigRunner, and this is achieved well > enough without the asserts. > {code} > 2018-11-23 10:14:52,661 [IPC Server handler 5 on 54929] INFO > org.apache.hadoop.hdfs.StateChange - BLOCK* allocate blk_1073741827_1003, > replicas=127.0.0.1:54934, 127.0.0.1:54930, 127.0.0.1:54943 for > /tmp/temp-157262781/tmp-1057655772/automaton-1.11-8.jar > ... 
> 2018-11-23 10:14:52,735 [PacketResponder: > BP-26001448-10.200.50.195-1542964474138:blk_1073741827_1003, > type=HAS_DOWNSTREAM_IN_PIPELINE, downstreams=2:[127.0.0.1:54930, > 127.0.0.1:54943]] INFO > org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace - src: > /127.0.0.1:54978, dest: /127.0.0.1:54934, bytes: 176285, op: HDFS_WRITE, > cliID: DFSClient_NONMAPREDUCE_-1959727442_1, offset: 0, srvID: > 108c4000-1ae0-402e-82cf-bf403629c0f7, blockid: > BP-26001448-10.200.50.195-1542964474138:blk_1073741827_1003, duration(ns): > 57162859 > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (PIG-5373) InterRecordReader might skip records if certain sync markers are used
[ https://issues.apache.org/jira/browse/PIG-5373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita updated PIG-5373: Attachment: PIG-5373.0.patch > InterRecordReader might skip records if certain sync markers are used > - > > Key: PIG-5373 > URL: https://issues.apache.org/jira/browse/PIG-5373 > Project: Pig > Issue Type: Bug >Affects Versions: 0.17.0 >Reporter: Adam Szita >Assignee: Adam Szita >Priority: Major > Attachments: PIG-5373.0.patch > > > Due to a bug in InterRecordReader#skipUntilMarkerOrSplitEndOrEOF(), it can > happen that sync markers are not identified while reading the interim binary > file used to hold data between jobs. > In such files, sync markers are placed upon writing, which later help during > reading the data. These are randomly generated, and it seems that in some > rare combinations of a marker and the data preceding it, the marker cannot be > found. This can result in reading through all the bytes (looking for the > marker), reaching the split end or EOF, and extracting no records at all. > This symptom is also observable from JobHistory stats: a job affected by this > issue will have tasks where HDFS_BYTES_READ or FILE_BYTES_READ is about equal > to the number of bytes in the split, but MAP_INPUT_RECORDS=0. > One such (test) example is this: > {code:java} > marker: [-128, -128, 4] , data: [127, -1, 2, -128, -128, -128, 4, 1, 2, > 3]{code} > Due to a bug, such markers whose prefix overlaps with the last data chunk are > not seen by the reader. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (PIG-5373) InterRecordReader might skip records if certain sync markers are used
[ https://issues.apache.org/jira/browse/PIG-5373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita updated PIG-5373: Status: Patch Available (was: Open) > InterRecordReader might skip records if certain sync markers are used > - > > Key: PIG-5373 > URL: https://issues.apache.org/jira/browse/PIG-5373 > Project: Pig > Issue Type: Bug >Affects Versions: 0.17.0 >Reporter: Adam Szita >Assignee: Adam Szita >Priority: Major > Attachments: PIG-5373.0.patch > > > Due to a bug in InterRecordReader#skipUntilMarkerOrSplitEndOrEOF(), it can > happen that sync markers are not identified while reading the interim binary > file used to hold data between jobs. > In such files, sync markers are placed upon writing, which later help during > reading the data. These are randomly generated, and it seems that in some > rare combinations of a marker and the data preceding it, the marker cannot be > found. This can result in reading through all the bytes (looking for the > marker), reaching the split end or EOF, and extracting no records at all. > This symptom is also observable from JobHistory stats: a job affected by this > issue will have tasks where HDFS_BYTES_READ or FILE_BYTES_READ is about equal > to the number of bytes in the split, but MAP_INPUT_RECORDS=0. > One such (test) example is this: > {code:java} > marker: [-128, -128, 4] , data: [127, -1, 2, -128, -128, -128, 4, 1, 2, > 3]{code} > Due to a bug, such markers whose prefix overlaps with the last data chunk are > not seen by the reader. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PIG-5373) InterRecordReader might skip records if certain sync markers are used
[ https://issues.apache.org/jira/browse/PIG-5373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725832#comment-16725832 ] Adam Szita commented on PIG-5373: - Attached [^PIG-5373.0.patch], which corrects the reading of sync markers by using a FIFO and comparing the FIFO content with the expected marker. A test case is attached, which verifies in a brute-force way that such prefix scenarios are handled well. [~nkollar], [~rohini], can you take a look please? > InterRecordReader might skip records if certain sync markers are used > - > > Key: PIG-5373 > URL: https://issues.apache.org/jira/browse/PIG-5373 > Project: Pig > Issue Type: Bug >Affects Versions: 0.17.0 >Reporter: Adam Szita >Assignee: Adam Szita >Priority: Major > Attachments: PIG-5373.0.patch > > > Due to a bug in InterRecordReader#skipUntilMarkerOrSplitEndOrEOF(), it can > happen that sync markers are not identified while reading the interim binary > file used to hold data between jobs. > In such files, sync markers are placed upon writing, which later help during > reading the data. These are randomly generated, and it seems that in some > rare combinations of a marker and the data preceding it, the marker cannot be > found. This can result in reading through all the bytes (looking for the > marker), reaching the split end or EOF, and extracting no records at all. > This symptom is also observable from JobHistory stats: a job affected by this > issue will have tasks where HDFS_BYTES_READ or FILE_BYTES_READ is about equal > to the number of bytes in the split, but MAP_INPUT_RECORDS=0. > One such (test) example is this: > {code:java} > marker: [-128, -128, 4] , data: [127, -1, 2, -128, -128, -128, 4, 1, 2, > 3]{code} > Due to a bug, such markers whose prefix overlaps with the last data chunk are > not seen by the reader. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
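The FIFO-based approach described in the comment above can be sketched roughly as follows. This is a minimal, hypothetical illustration (the class and method names are invented, not Pig's actual code): a rolling window of the last marker-length bytes is compared against the marker after every byte read, so a marker whose prefix overlaps the preceding data, like [-128, -128, 4] in the issue's example, is still found.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class MarkerScan {

    // Reads bytes until the window of the last marker.length bytes equals the
    // marker. Returns true if the marker was found before EOF. Because the
    // comparison is done on a rolling window rather than by advancing a match
    // index and resetting on mismatch, overlapping prefixes cannot be skipped.
    static boolean skipUntilMarker(InputStream in, byte[] marker) throws IOException {
        byte[] window = new byte[marker.length]; // circular buffer (FIFO)
        int head = 0;    // index of the oldest byte once the buffer is full
        int filled = 0;  // how many bytes have been buffered so far
        int b;
        while ((b = in.read()) != -1) {
            window[head] = (byte) b;           // overwrite the oldest byte
            head = (head + 1) % window.length; // head now points at the new oldest byte
            if (filled < window.length) {
                filled++;
            }
            if (filled == window.length && windowEquals(window, head, marker)) {
                return true;
            }
        }
        return false;
    }

    // Compares the circular buffer, read oldest-to-newest starting at head,
    // against the marker.
    private static boolean windowEquals(byte[] window, int head, byte[] marker) {
        for (int i = 0; i < marker.length; i++) {
            if (window[(head + i) % window.length] != marker[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        byte[] marker = {-128, -128, 4};
        byte[] data = {127, -1, 2, -128, -128, -128, 4, 1, 2, 3};
        // prints: true
        System.out.println(skipUntilMarker(new ByteArrayInputStream(data), marker));
    }
}
```

A naive matcher that advances a match position and resets it to zero on mismatch would consume the third -128 while expecting 4 and then fail to recognize the marker that starts at that byte; the rolling window avoids that entire class of bug at the cost of marker-length comparisons per byte.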
[jira] [Issue Comment Deleted] (PIG-5373) InterRecordReader might skip records if certain sync markers are used
[ https://issues.apache.org/jira/browse/PIG-5373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita updated PIG-5373: Comment: was deleted (was: Attached [^PIG-5373.0.patch], which corrects the reading of sync markers by using a FIFO and comparing the FIFO content with the expected marker. A test case is attached, which verifies in a brute-force way that such prefix scenarios are handled well. [~nkollar], [~rohini], can you take a look please?) > InterRecordReader might skip records if certain sync markers are used > - > > Key: PIG-5373 > URL: https://issues.apache.org/jira/browse/PIG-5373 > Project: Pig > Issue Type: Bug >Affects Versions: 0.17.0 >Reporter: Adam Szita >Assignee: Adam Szita >Priority: Major > Attachments: PIG-5373.0.patch > > > Due to a bug in InterRecordReader#skipUntilMarkerOrSplitEndOrEOF(), it can > happen that sync markers are not identified while reading the interim binary > file used to hold data between jobs. > In such files, sync markers are placed upon writing, which later help during > reading the data. These are randomly generated, and it seems that in some > rare combinations of a marker and the data preceding it, the marker cannot be > found. This can result in reading through all the bytes (looking for the > marker), reaching the split end or EOF, and extracting no records at all. > This symptom is also observable from JobHistory stats: a job affected by this > issue will have tasks where HDFS_BYTES_READ or FILE_BYTES_READ is about equal > to the number of bytes in the split, but MAP_INPUT_RECORDS=0. > One such (test) example is this: > {code:java} > marker: [-128, -128, 4] , data: [127, -1, 2, -128, -128, -128, 4, 1, 2, > 3]{code} > Due to a bug, such markers whose prefix overlaps with the last data chunk are > not seen by the reader. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PIG-5373) InterRecordReader might skip records if certain sync markers are used
[ https://issues.apache.org/jira/browse/PIG-5373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725831#comment-16725831 ] Adam Szita commented on PIG-5373: - Attached [^PIG-5373.0.patch], which corrects the reading of sync markers by using a FIFO and comparing the FIFO content with the expected marker. A test case is attached, which verifies in a brute-force way that such prefix scenarios are handled well. [~nkollar], [~rohini], can you take a look please? > InterRecordReader might skip records if certain sync markers are used > - > > Key: PIG-5373 > URL: https://issues.apache.org/jira/browse/PIG-5373 > Project: Pig > Issue Type: Bug >Affects Versions: 0.17.0 >Reporter: Adam Szita >Assignee: Adam Szita >Priority: Major > Attachments: PIG-5373.0.patch > > > Due to a bug in InterRecordReader#skipUntilMarkerOrSplitEndOrEOF(), it can > happen that sync markers are not identified while reading the interim binary > file used to hold data between jobs. > In such files, sync markers are placed upon writing, which later help during > reading the data. These are randomly generated, and it seems that in some > rare combinations of a marker and the data preceding it, the marker cannot be > found. This can result in reading through all the bytes (looking for the > marker), reaching the split end or EOF, and extracting no records at all. > This symptom is also observable from JobHistory stats: a job affected by this > issue will have tasks where HDFS_BYTES_READ or FILE_BYTES_READ is about equal > to the number of bytes in the split, but MAP_INPUT_RECORDS=0. > One such (test) example is this: > {code:java} > marker: [-128, -128, 4] , data: [127, -1, 2, -128, -128, -128, 4, 1, 2, > 3]{code} > Due to a bug, such markers whose prefix overlaps with the last data chunk are > not seen by the reader. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (PIG-5373) InterRecordReader might skip records if certain sync markers are used
[ https://issues.apache.org/jira/browse/PIG-5373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita updated PIG-5373: Description: Due to a bug in InterRecordReader#skipUntilMarkerOrSplitEndOrEOF(), it can happen that sync markers are not identified while reading the interim binary file used to hold data between jobs. In such files, sync markers are placed upon writing, which later help during reading the data. These are randomly generated, and it seems that in some rare combinations of a marker and the data preceding it, the marker cannot be found. This can result in reading through all the bytes (looking for the marker), reaching the split end or EOF, and extracting no records at all. This symptom is also observable from JobHistory stats: a job affected by this issue will have tasks where HDFS_BYTES_READ or FILE_BYTES_READ is about equal to the number of bytes in the split, but MAP_INPUT_RECORDS=0. One such (test) example is this: {code:java} marker: [-128, -128, 4] , data: [127, -1, 2, -128, -128, -128, 4, 1, 2, 3]{code} Due to a bug, such markers whose prefix overlaps with the last data chunk are not seen by the reader. was: Due to a bug in InterRecordReader#skipUntilMarkerOrSplitEndOrEOF(), it can happen that sync markers are not identified while reading the interim binary file used to hold data between jobs. In such files, sync markers are placed upon writing, which later help during reading the data. These are randomly generated, and it seems that in some rare combinations of a marker and the data preceding it, the marker cannot be found. This can result in reading through all the bytes (looking for the marker), reaching the split end or EOF, and extracting no records at all. 
This symptom is also observable from JobHistory stats: a job affected by this issue will have tasks where HDFS_BYTES_READ or FILE_BYTES_READ is about equal to the number of bytes in the split, but MAP_INPUT_RECORDS=0. > InterRecordReader might skip records if certain sync markers are used > - > > Key: PIG-5373 > URL: https://issues.apache.org/jira/browse/PIG-5373 > Project: Pig > Issue Type: Bug >Affects Versions: 0.17.0 >Reporter: Adam Szita >Assignee: Adam Szita >Priority: Major > > Due to a bug in InterRecordReader#skipUntilMarkerOrSplitEndOrEOF(), it can > happen that sync markers are not identified while reading the interim binary > file used to hold data between jobs. > In such files, sync markers are placed upon writing, which later help during > reading the data. These are randomly generated, and it seems that in some > rare combinations of a marker and the data preceding it, the marker cannot be > found. This can result in reading through all the bytes (looking for the > marker), reaching the split end or EOF, and extracting no records at all. > This symptom is also observable from JobHistory stats: a job affected by this > issue will have tasks where HDFS_BYTES_READ or FILE_BYTES_READ is about equal > to the number of bytes in the split, but MAP_INPUT_RECORDS=0. > One such (test) example is this: > {code:java} > marker: [-128, -128, 4] , data: [127, -1, 2, -128, -128, -128, 4, 1, 2, > 3]{code} > Due to a bug, such markers whose prefix overlaps with the last data chunk are > not seen by the reader. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (PIG-5373) InterRecordReader might skip records if certain sync markers are used
Adam Szita created PIG-5373: --- Summary: InterRecordReader might skip records if certain sync markers are used Key: PIG-5373 URL: https://issues.apache.org/jira/browse/PIG-5373 Project: Pig Issue Type: Bug Affects Versions: 0.17.0 Reporter: Adam Szita Assignee: Adam Szita Due to a bug in InterRecordReader#skipUntilMarkerOrSplitEndOrEOF(), it can happen that sync markers are not identified while reading the interim binary file used to hold data between jobs. In such files, sync markers are placed upon writing, which later help during reading the data. These are randomly generated, and it seems that in some rare combinations of a marker and the data preceding it, the marker cannot be found. This can result in reading through all the bytes (looking for the marker), reaching the split end or EOF, and extracting no records at all. This symptom is also observable from JobHistory stats: a job affected by this issue will have tasks where HDFS_BYTES_READ or FILE_BYTES_READ is about equal to the number of bytes in the split, but MAP_INPUT_RECORDS=0. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PIG-5371) Hdfs bytes written assertions fail in TestPigRunner
[ https://issues.apache.org/jira/browse/PIG-5371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16721419#comment-16721419 ] Adam Szita commented on PIG-5371: - Hi [~abstractdog], can you please elaborate on {quote}TestPigRunner - work, only on an internal maintenance line {quote} I am able to run TestPigRunner checked out from trunk as per: {code:java} ant clean jar ant test -Dtest=TestPigRunner{code} ..and it succeeds: {code:java} BUILD SUCCESSFUL Total time: 8 minutes 47 seconds{code} > Hdfs bytes written assertions fail in TestPigRunner > --- > > Key: PIG-5371 > URL: https://issues.apache.org/jira/browse/PIG-5371 > Project: Pig > Issue Type: Bug >Reporter: Laszlo Bodor >Assignee: Laszlo Bodor >Priority: Major > Attachments: PIG-5371.01.patch, simpleTest.out > > > Attached [^simpleTest.out]. It seems like the HDFS counter 'HDFS_BYTES_WRITTEN' > returns the byte count not only for the result of the Pig store operator, but > includes the size of the jar files as well. The problem is that this could > change very easily, so in my opinion the best option would be to remove these > assertions from TestPigRunner, as they just cause intermittent and/or > persistent failures. > The test class is for basic testing of PigRunner, and this is achieved well > enough without the asserts. > {code} > 2018-11-23 10:14:52,661 [IPC Server handler 5 on 54929] INFO > org.apache.hadoop.hdfs.StateChange - BLOCK* allocate blk_1073741827_1003, > replicas=127.0.0.1:54934, 127.0.0.1:54930, 127.0.0.1:54943 for > /tmp/temp-157262781/tmp-1057655772/automaton-1.11-8.jar > ... 
> 2018-11-23 10:14:52,735 [PacketResponder: > BP-26001448-10.200.50.195-1542964474138:blk_1073741827_1003, > type=HAS_DOWNSTREAM_IN_PIPELINE, downstreams=2:[127.0.0.1:54930, > 127.0.0.1:54943]] INFO > org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace - src: > /127.0.0.1:54978, dest: /127.0.0.1:54934, bytes: 176285, op: HDFS_WRITE, > cliID: DFSClient_NONMAPREDUCE_-1959727442_1, offset: 0, srvID: > 108c4000-1ae0-402e-82cf-bf403629c0f7, blockid: > BP-26001448-10.200.50.195-1542964474138:blk_1073741827_1003, duration(ns): > 57162859 > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (PIG-2557) CSVExcelStorage save : empty quotes "" becomes 4 quotes """". This should become a null field.
[ https://issues.apache.org/jira/browse/PIG-2557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita resolved PIG-2557. - Resolution: Duplicate Fix Version/s: 0.17.0 > CSVExcelStorage save : empty quotes "" becomes 4 quotes """". This should > become a null field. > --- > > Key: PIG-2557 > URL: https://issues.apache.org/jira/browse/PIG-2557 > Project: Pig > Issue Type: Bug > Components: piggybank >Affects Versions: 0.9.1 >Reporter: Peter Welch >Priority: Minor > Fix For: 0.17.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PIG-2557) CSVExcelStorage save : empty quotes "" becomes 4 quotes """". This should become a null field.
[ https://issues.apache.org/jira/browse/PIG-2557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629193#comment-16629193 ] Adam Szita commented on PIG-2557: - Since this is the same issue as PIG-5045, I'm resolving this as a duplicate > CSVExcelStorage save : empty quotes "" becomes 4 quotes """". This should > become a null field. > --- > > Key: PIG-2557 > URL: https://issues.apache.org/jira/browse/PIG-2557 > Project: Pig > Issue Type: Bug > Components: piggybank >Affects Versions: 0.9.1 >Reporter: Peter Welch >Priority: Minor > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (PIG-5358) Remove hive-contrib jar from lib directory
[ https://issues.apache.org/jira/browse/PIG-5358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita updated PIG-5358: Resolution: Fixed Fix Version/s: 0.18.0 Status: Resolved (was: Patch Available) > Remove hive-contrib jar from lib directory > -- > > Key: PIG-5358 > URL: https://issues.apache.org/jira/browse/PIG-5358 > Project: Pig > Issue Type: Improvement > Components: build >Reporter: Adam Szita >Assignee: Adam Szita >Priority: Minor > Fix For: 0.18.0 > > Attachments: PIG-5358.0.patch > > > As per HIVE-20020, the hive-contrib jar was moved out from under Hive's lib. We > 'export' some of our Hive dependencies into our lib folder too, and that > includes hive-contrib.jar, so in order to stay in sync with Hive we should remove > it too. > We don't depend on this jar at runtime, so there's no use in it being in Pig's > lib dir anyway. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PIG-5358) Remove hive-contrib jar from lib directory
[ https://issues.apache.org/jira/browse/PIG-5358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16619576#comment-16619576 ] Adam Szita commented on PIG-5358: - Committed to trunk, thanks for reviewing, Nandor! > Remove hive-contrib jar from lib directory > -- > > Key: PIG-5358 > URL: https://issues.apache.org/jira/browse/PIG-5358 > Project: Pig > Issue Type: Improvement > Components: build >Reporter: Adam Szita >Assignee: Adam Szita >Priority: Minor > Fix For: 0.18.0 > > Attachments: PIG-5358.0.patch > > > As per HIVE-20020, the hive-contrib jar was moved out from under Hive's lib. We > 'export' some of our Hive dependencies into our lib folder too, and that > includes hive-contrib.jar, so in order to stay in sync with Hive we should remove > it too. > We don't depend on this jar at runtime, so there's no use in it being in Pig's > lib dir anyway. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (PIG-5358) Remove hive-contrib jar from lib directory
[ https://issues.apache.org/jira/browse/PIG-5358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita updated PIG-5358: Status: Patch Available (was: In Progress) > Remove hive-contrib jar from lib directory > -- > > Key: PIG-5358 > URL: https://issues.apache.org/jira/browse/PIG-5358 > Project: Pig > Issue Type: Improvement > Components: build >Reporter: Adam Szita >Assignee: Adam Szita >Priority: Minor > Attachments: PIG-5358.0.patch > > > As per HIVE-20020, the hive-contrib jar was moved out from under Hive's lib. We > 'export' some of our Hive dependencies into our lib folder too, and that > includes hive-contrib.jar, so in order to stay in sync with Hive we should remove > it too. > We don't depend on this jar at runtime, so there's no use in it being in Pig's > lib dir anyway. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (PIG-5358) Remove hive-contrib jar from lib directory
[ https://issues.apache.org/jira/browse/PIG-5358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita updated PIG-5358: Attachment: PIG-5358.0.patch > Remove hive-contrib jar from lib directory > -- > > Key: PIG-5358 > URL: https://issues.apache.org/jira/browse/PIG-5358 > Project: Pig > Issue Type: Improvement > Components: build >Reporter: Adam Szita >Assignee: Adam Szita >Priority: Minor > Attachments: PIG-5358.0.patch > > > As per HIVE-20020, the hive-contrib jar was moved out from under Hive's lib. We > 'export' some of our Hive dependencies into our lib folder too, and that > includes hive-contrib.jar, so in order to stay in sync with Hive we should remove > it too. > We don't depend on this jar at runtime, so there's no use in it being in Pig's > lib dir anyway. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (PIG-5358) Remove hive-contrib jar from lib directory
Adam Szita created PIG-5358: --- Summary: Remove hive-contrib jar from lib directory Key: PIG-5358 URL: https://issues.apache.org/jira/browse/PIG-5358 Project: Pig Issue Type: Improvement Components: build Reporter: Adam Szita Assignee: Adam Szita As per HIVE-20020, the hive-contrib jar was moved out from under Hive's lib. We 'export' some of our Hive dependencies into our lib folder too, and that includes hive-contrib.jar, so in order to stay in sync with Hive we should remove it too. We don't depend on this jar at runtime, so there's no use in it being in Pig's lib dir anyway. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work started] (PIG-5358) Remove hive-contrib jar from lib directory
[ https://issues.apache.org/jira/browse/PIG-5358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on PIG-5358 started by Adam Szita. --- > Remove hive-contrib jar from lib directory > -- > > Key: PIG-5358 > URL: https://issues.apache.org/jira/browse/PIG-5358 > Project: Pig > Issue Type: Improvement > Components: build >Reporter: Adam Szita >Assignee: Adam Szita >Priority: Minor > > As per HIVE-20020, the hive-contrib jar was moved out from under Hive's lib. We > 'export' some of our Hive dependencies into our lib folder too, and that > includes hive-contrib.jar, so in order to stay in sync with Hive we should remove > it too. > We don't depend on this jar at runtime, so there's no use in it being in Pig's > lib dir anyway. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (PIG-5343) Upgrade developer build environment
[ https://issues.apache.org/jira/browse/PIG-5343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita updated PIG-5343: Resolution: Fixed Fix Version/s: 0.18.0 Status: Resolved (was: Patch Available) > Upgrade developer build environment > --- > > Key: PIG-5343 > URL: https://issues.apache.org/jira/browse/PIG-5343 > Project: Pig > Issue Type: Improvement > Components: build >Reporter: Niels Basjes >Assignee: Niels Basjes >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5343-1.patch, PIG-5343-2.patch, PIG-5343-3.patch > > > The docker image that can be used to setup the build environment still uses > Java 1.7 and is based on a very old version of Ubuntu. > Both of these should be updated. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PIG-5343) Upgrade developer build environment
[ https://issues.apache.org/jira/browse/PIG-5343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16607128#comment-16607128 ] Adam Szita commented on PIG-5343: - [~nielsbasjes], Thanks for the investigation on this one. +1 on [^PIG-5343-3.patch], it is now committed to trunk. > Upgrade developer build environment > --- > > Key: PIG-5343 > URL: https://issues.apache.org/jira/browse/PIG-5343 > Project: Pig > Issue Type: Improvement > Components: build >Reporter: Niels Basjes >Assignee: Niels Basjes >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5343-1.patch, PIG-5343-2.patch, PIG-5343-3.patch > > > The docker image that can be used to setup the build environment still uses > Java 1.7 and is based on a very old version of Ubuntu. > Both of these should be updated. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PIG-5340) Unable to compile files in PIG
[ https://issues.apache.org/jira/browse/PIG-5340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16601871#comment-16601871 ] Adam Szita commented on PIG-5340: - Me neither. I did the following: {code:java} git clone https://github.com/apache/pig.git cd pig git checkout tags/release-0.17.0 ant clean jar cd tutorial ant jar{code} After which I've got a BUILD SUCCESSFUL. I'm resolving this jira as not a problem - we can reopen in the very unlikely case that it indeed turns out to be an issue > Unable to compile files in PIG > -- > > Key: PIG-5340 > URL: https://issues.apache.org/jira/browse/PIG-5340 > Project: Pig > Issue Type: Bug >Reporter: Remil >Priority: Major > > hadoopuser@sherin-VirtualBox:/usr/local/pig/pig-0.17.0-src/tutorial$ sudo ant > jar > Buildfile: /usr/local/pig/pig-0.17.0-src/tutorial/build.xml > init: > compile: > [echo] *** Compiling Tutorial files *** > [javac] /usr/local/pig/pig-0.17.0-src/tutorial/build.xml:66: warning: > 'includeantruntime' was not set, defaulting to build.sysclasspath=last; set > to false for repeatable builds > [javac] Compiling 7 source files to > /usr/local/pig/pig-0.17.0-src/tutorial/build/classes > [javac] warning: [options] bootstrap class path not set in conjunction with > -source 1.5 > [javac] warning: [options] source value 1.5 is obsolete and will be removed > in a future release > [javac] warning: [options] target value 1.5 is obsolete and will be removed > in a future release > [javac] warning: [options] To suppress warnings about obsolete options, use > -Xlint:-options. 
> [javac] > /usr/local/pig/pig-0.17.0-src/tutorial/src/org/apache/pig/tutorial/ExtractHour.java:24: > error: cannot find symbol > [javac] import org.apache.pig.EvalFunc; > [javac] ^ > [javac] symbol: class EvalFunc > [javac] location: package org.apache.pig > [javac] > /usr/local/pig/pig-0.17.0-src/tutorial/src/org/apache/pig/tutorial/ExtractHour.java:25: > error: cannot find symbol > [javac] import org.apache.pig.FuncSpec; > [javac] ^ > [javac] symbol: class FuncSpec > [javac] location: package org.apache.pig > [javac] > /usr/local/pig/pig-0.17.0-src/tutorial/src/org/apache/pig/tutorial/ExtractHour.java:26: > error: package org.apache.pig.data does not exist > [javac] import org.apache.pig.data.Tuple; > [javac] ^ > [javac] > /usr/local/pig/pig-0.17.0-src/tutorial/src/org/apache/pig/tutorial/ExtractHour.java:27: > error: package org.apache.pig.data does not exist > [javac] import org.apache.pig.data.DataType; > [javac] ^ > [javac] > /usr/local/pig/pig-0.17.0-src/tutorial/src/org/apache/pig/tutorial/ExtractHour.java:28: > error: package org.apache.pig.impl.logicalLayer.schema does not exist > [javac] import org.apache.pig.impl.logicalLayer.schema.Schema; > [javac] ^ > [javac] > /usr/local/pig/pig-0.17.0-src/tutorial/src/org/apache/pig/tutorial/ExtractHour.java:29: > error: package org.apache.pig.impl.logicalLayer does not exist > [javac] import org.apache.pig.impl.logicalLayer.FrontendException; > [javac] ^ > [javac] > /usr/local/pig/pig-0.17.0-src/tutorial/src/org/apache/pig/tutorial/ExtractHour.java:35: > error: cannot find symbol > [javac] public class ExtractHour extends EvalFunc { > [javac] ^ > [javac] symbol: class EvalFunc > [javac] > /usr/local/pig/pig-0.17.0-src/tutorial/src/org/apache/pig/tutorial/ExtractHour.java:36: > error: cannot find symbol > [javac] public String exec(Tuple input) throws IOException { > [javac] ^ > [javac] symbol: class Tuple > [javac] location: class ExtractHour > [javac] > 
/usr/local/pig/pig-0.17.0-src/tutorial/src/org/apache/pig/tutorial/ExtractHour.java:54: > error: cannot find symbol > [javac] public Schema outputSchema(Schema input) { > [javac] ^ > [javac] symbol: class Schema > [javac] location: class ExtractHour > [javac] > /usr/local/pig/pig-0.17.0-src/tutorial/src/org/apache/pig/tutorial/ExtractHour.java:54: > error: cannot find symbol > [javac] public Schema outputSchema(Schema input) { > [javac] ^ > [javac] symbol: class Schema > [javac] location: class ExtractHour > [javac] > /usr/local/pig/pig-0.17.0-src/tutorial/src/org/apache/pig/tutorial/ExtractHour.java:63: > error: cannot find symbol > [javac] public List getArgToFuncMapping() throws FrontendException > { > [javac] ^ > [javac] symbol: class FuncSpec > [javac] location: class ExtractHour > [javac] > /usr/local/pig/pig-0.17.0-src/tutorial/src/org/apache/pig/tutorial/ExtractHour.java:63: > error: cannot find symbol > [javac] public List getArgToFuncMapping() throws FrontendException > { > [javac] ^ > [javac] symbol: class FrontendException > [javac] location: class ExtractHour > [javac] > /usr/local/pig/pig-0.17.0-src/tutorial/src/org/apache/pig/tutorial/NGramGenerator.java:26: > error: cannot find symbol > [javac] import
[jira] [Resolved] (PIG-5340) Unable to compile files in PIG
[ https://issues.apache.org/jira/browse/PIG-5340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita resolved PIG-5340. - Resolution: Not A Problem > Unable to compile files in PIG > -- > > Key: PIG-5340 > URL: https://issues.apache.org/jira/browse/PIG-5340 > Project: Pig > Issue Type: Bug >Reporter: Remil >Priority: Major > > hadoopuser@sherin-VirtualBox:/usr/local/pig/pig-0.17.0-src/tutorial$ sudo ant > jar > Buildfile: /usr/local/pig/pig-0.17.0-src/tutorial/build.xml > init: > compile: > [echo] *** Compiling Tutorial files *** > [javac] /usr/local/pig/pig-0.17.0-src/tutorial/build.xml:66: warning: > 'includeantruntime' was not set, defaulting to build.sysclasspath=last; set > to false for repeatable builds > [javac] Compiling 7 source files to > /usr/local/pig/pig-0.17.0-src/tutorial/build/classes > [javac] warning: [options] bootstrap class path not set in conjunction with > -source 1.5 > [javac] warning: [options] source value 1.5 is obsolete and will be removed > in a future release > [javac] warning: [options] target value 1.5 is obsolete and will be removed > in a future release > [javac] warning: [options] To suppress warnings about obsolete options, use > -Xlint:-options. 
> [javac] > /usr/local/pig/pig-0.17.0-src/tutorial/src/org/apache/pig/tutorial/ExtractHour.java:24: > error: cannot find symbol > [javac] import org.apache.pig.EvalFunc; > [javac] ^ > [javac] symbol: class EvalFunc > [javac] location: package org.apache.pig > [javac] > /usr/local/pig/pig-0.17.0-src/tutorial/src/org/apache/pig/tutorial/ExtractHour.java:25: > error: cannot find symbol > [javac] import org.apache.pig.FuncSpec; > [javac] ^ > [javac] symbol: class FuncSpec > [javac] location: package org.apache.pig > [javac] > /usr/local/pig/pig-0.17.0-src/tutorial/src/org/apache/pig/tutorial/ExtractHour.java:26: > error: package org.apache.pig.data does not exist > [javac] import org.apache.pig.data.Tuple; > [javac] ^ > [javac] > /usr/local/pig/pig-0.17.0-src/tutorial/src/org/apache/pig/tutorial/ExtractHour.java:27: > error: package org.apache.pig.data does not exist > [javac] import org.apache.pig.data.DataType; > [javac] ^ > [javac] > /usr/local/pig/pig-0.17.0-src/tutorial/src/org/apache/pig/tutorial/ExtractHour.java:28: > error: package org.apache.pig.impl.logicalLayer.schema does not exist > [javac] import org.apache.pig.impl.logicalLayer.schema.Schema; > [javac] ^ > [javac] > /usr/local/pig/pig-0.17.0-src/tutorial/src/org/apache/pig/tutorial/ExtractHour.java:29: > error: package org.apache.pig.impl.logicalLayer does not exist > [javac] import org.apache.pig.impl.logicalLayer.FrontendException; > [javac] ^ > [javac] > /usr/local/pig/pig-0.17.0-src/tutorial/src/org/apache/pig/tutorial/ExtractHour.java:35: > error: cannot find symbol > [javac] public class ExtractHour extends EvalFunc { > [javac] ^ > [javac] symbol: class EvalFunc > [javac] > /usr/local/pig/pig-0.17.0-src/tutorial/src/org/apache/pig/tutorial/ExtractHour.java:36: > error: cannot find symbol > [javac] public String exec(Tuple input) throws IOException { > [javac] ^ > [javac] symbol: class Tuple > [javac] location: class ExtractHour > [javac] > 
/usr/local/pig/pig-0.17.0-src/tutorial/src/org/apache/pig/tutorial/ExtractHour.java:54: > error: cannot find symbol > [javac] public Schema outputSchema(Schema input) { > [javac] ^ > [javac] symbol: class Schema > [javac] location: class ExtractHour > [javac] > /usr/local/pig/pig-0.17.0-src/tutorial/src/org/apache/pig/tutorial/ExtractHour.java:54: > error: cannot find symbol > [javac] public Schema outputSchema(Schema input) { > [javac] ^ > [javac] symbol: class Schema > [javac] location: class ExtractHour > [javac] > /usr/local/pig/pig-0.17.0-src/tutorial/src/org/apache/pig/tutorial/ExtractHour.java:63: > error: cannot find symbol > [javac] public List getArgToFuncMapping() throws FrontendException > { > [javac] ^ > [javac] symbol: class FuncSpec > [javac] location: class ExtractHour > [javac] > /usr/local/pig/pig-0.17.0-src/tutorial/src/org/apache/pig/tutorial/ExtractHour.java:63: > error: cannot find symbol > [javac] public List getArgToFuncMapping() throws FrontendException > { > [javac] ^ > [javac] symbol: class FrontendException > [javac] location: class ExtractHour > [javac] > /usr/local/pig/pig-0.17.0-src/tutorial/src/org/apache/pig/tutorial/NGramGenerator.java:26: > error: cannot find symbol > [javac] import org.apache.pig.EvalFunc; > [javac] ^ > [javac] symbol: class EvalFunc > [javac] location: package org.apache.pig > [javac] > /usr/local/pig/pig-0.17.0-src/tutorial/src/org/apache/pig/tutorial/NGramGenerator.java:27: > error: cannot find symbol > [javac] import org.apache.pig.FuncSpec; > [javac] ^ > [javac] symbol: class FuncSpec >
[jira] [Commented] (PIG-5191) Pig HBase 2.0.0 support
[ https://issues.apache.org/jira/browse/PIG-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596222#comment-16596222 ] Adam Szita commented on PIG-5191: - Looks good to me too. Committed to trunk. Thanks for the patch Nandor, and thanks for reviewing Rohini, Daniel. > Pig HBase 2.0.0 support > --- > > Key: PIG-5191 > URL: https://issues.apache.org/jira/browse/PIG-5191 > Project: Pig > Issue Type: Improvement >Reporter: Nandor Kollar >Assignee: Nandor Kollar >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5191_1.patch, PIG-5191_2.patch > > > Pig doesn't support HBase 2.0.0. Since the new HBase API introduces several > API changes, we should find a way to support both 1.x and 2.x HBase API. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (PIG-5191) Pig HBase 2.0.0 support
[ https://issues.apache.org/jira/browse/PIG-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita updated PIG-5191: Resolution: Fixed Status: Resolved (was: Patch Available) > Pig HBase 2.0.0 support > --- > > Key: PIG-5191 > URL: https://issues.apache.org/jira/browse/PIG-5191 > Project: Pig > Issue Type: Improvement >Reporter: Nandor Kollar >Assignee: Nandor Kollar >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5191_1.patch, PIG-5191_2.patch > > > Pig doesn't support HBase 2.0.0. Since the new HBase API introduces several > API changes, we should find a way to support both 1.x and 2.x HBase API. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PIG-5347) Add new target for generating dependency tree
[ https://issues.apache.org/jira/browse/PIG-5347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541676#comment-16541676 ] Adam Szita commented on PIG-5347: - [~satishsaley] thanks for looking into this. Yes the ivy:report is there, but currently that only works for the active ivy config (e.g. spark1 and spark2 related libs are not included in the report) It'd be very useful to have them all - right now when I look for something that spark pulls in, I have to take a look in ~/.ivy2/ and dig through xmls. So this might not be such an invalid Jira ticket after all. > Add new target for generating dependency tree > - > > Key: PIG-5347 > URL: https://issues.apache.org/jira/browse/PIG-5347 > Project: Pig > Issue Type: Bug >Reporter: Satish Subhashrao Saley >Assignee: Satish Subhashrao Saley >Priority: Major > > It would be really helpful in debugging dependency conflicts if we have some > easy way to get dependency tree. ivy:report - > http://ant.apache.org/ivy/history/latest-milestone/use/report.html task > generates html showing dependencies. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PIG-5344) Update Apache HTTPD LogParser to latest version
[ https://issues.apache.org/jira/browse/PIG-5344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16530078#comment-16530078 ] Adam Szita commented on PIG-5344: - +1, patch committed to trunk. Thanks for the patch [~nielsbasjes], and [~nkollar] for the review! > Update Apache HTTPD LogParser to latest version > --- > > Key: PIG-5344 > URL: https://issues.apache.org/jira/browse/PIG-5344 > Project: Pig > Issue Type: Improvement >Affects Versions: 0.18.0 >Reporter: Niels Basjes >Assignee: Niels Basjes >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5344-1.patch > > > Similar to PIG-4717 this is to simply upgrade the > [logparser|https://github.com/nielsbasjes/logparser] library. > I had to postpone this for a while because the latest version requires Java 8. > I will simply update the version of the library. > The new features are supported transparently. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (PIG-5344) Update Apache HTTPD LogParser to latest version
[ https://issues.apache.org/jira/browse/PIG-5344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita updated PIG-5344: Resolution: Fixed Fix Version/s: 0.18.0 Status: Resolved (was: Patch Available) > Update Apache HTTPD LogParser to latest version > --- > > Key: PIG-5344 > URL: https://issues.apache.org/jira/browse/PIG-5344 > Project: Pig > Issue Type: Improvement >Affects Versions: 0.18.0 >Reporter: Niels Basjes >Assignee: Niels Basjes >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5344-1.patch > > > Similar to PIG-4717 this is to simply upgrade the > [logparser|https://github.com/nielsbasjes/logparser] library. > I had to postpone this for a while because the latest version requires Java 8. > I will simply update the version of the library. > The new features are supported transparently. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PIG-5343) Upgrade developer build environment
[ https://issues.apache.org/jira/browse/PIG-5343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16530007#comment-16530007 ] Adam Szita commented on PIG-5343: - [~nielsbasjes], are the failing tests passing under java 7? > Upgrade developer build environment > --- > > Key: PIG-5343 > URL: https://issues.apache.org/jira/browse/PIG-5343 > Project: Pig > Issue Type: Improvement > Components: build >Reporter: Niels Basjes >Assignee: Niels Basjes >Priority: Major > Attachments: PIG-5343-1.patch, PIG-5343-2.patch, PIG-5343-3.patch > > > The docker image that can be used to setup the build environment still uses > Java 1.7 and is based on a very old version of Ubuntu. > Both of these should be updated. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (PIG-5122) data
[ https://issues.apache.org/jira/browse/PIG-5122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita updated PIG-5122: Fix Version/s: (was: site) > data > > > Key: PIG-5122 > URL: https://issues.apache.org/jira/browse/PIG-5122 > Project: Pig > Issue Type: Bug > Components: data >Affects Versions: 0.16.0 >Reporter: muhammad hamdani >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PIG-5343) Upgrade developer build environment
[ https://issues.apache.org/jira/browse/PIG-5343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16522044#comment-16522044 ] Adam Szita commented on PIG-5343: - In {{TestDriverPig.pm}} we might want to leave the MaxPermSize setting. E2E tests can be used to compare results of the same Pig script using a new and an old release of Pig, and some test clusters might have java7 installed on them. > Upgrade developer build environment > --- > > Key: PIG-5343 > URL: https://issues.apache.org/jira/browse/PIG-5343 > Project: Pig > Issue Type: Improvement > Components: build >Reporter: Niels Basjes >Assignee: Niels Basjes >Priority: Major > Attachments: PIG-5343-1.patch, PIG-5343-2.patch > > > The docker image that can be used to setup the build environment still uses > Java 1.7 and is based on a very old version of Ubuntu. > Both of these should be updated. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PIG-5341) PigStorage with -tagFile/-tagPath produces incorrect results with column pruning
[ https://issues.apache.org/jira/browse/PIG-5341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16501494#comment-16501494 ] Adam Szita commented on PIG-5341: - +1, thanks for fixing this Koji! > PigStorage with -tagFile/-tagPath produces incorrect results with column > pruning > > > Key: PIG-5341 > URL: https://issues.apache.org/jira/browse/PIG-5341 > Project: Pig > Issue Type: Bug >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Critical > Attachments: pig-5341-v01.patch > > > I don't know why we didn't see this till now. > {code} > A = load 'test.txt' using PigStorage('\t', '-tagFile') as > (filename:chararray, a0:int, a1:int, a2:int, a3:int); > B = FOREACH A GENERATE a0,a2; > dump B; > {code} > Input > {noformat} > knoguchi@pig > cat test.txt > 0 1 2 3 > 0 1 2 3 > 0 1 2 3 > {noformat} > Expected Results > {noformat} > (0,2) > (0,2) > (0,2) > {noformat} > Actual Results > {noformat} > (,1) > (,1) > (,1) > {noformat} > This is really bad... -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PIG-5338) Prevent deep copy of DataBag into Jython List
[ https://issues.apache.org/jira/browse/PIG-5338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16448317#comment-16448317 ] Adam Szita commented on PIG-5338: - Yeah those errors look to be related to this patch - perhaps some classloading related issue, different classpath on MR side vs what is tested locally. > Prevent deep copy of DataBag into Jython List > - > > Key: PIG-5338 > URL: https://issues.apache.org/jira/browse/PIG-5338 > Project: Pig > Issue Type: Improvement >Reporter: Greg Phillips >Assignee: Greg Phillips >Priority: Major > Attachments: PIG-5338.patch > > > Pig Python UDFs currently perform deep copies on Bags converting them into > Jython PyLists. This can cause Jython UDFs to run out of memory and fail. A > Jython DataBag which extends PyList could allow for iterative access to > DataBag elements, while only performing a deep copy when necessary. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PIG-5338) Prevent deep copy of DataBag into Jython List
[ https://issues.apache.org/jira/browse/PIG-5338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16445478#comment-16445478 ] Adam Szita commented on PIG-5338: - This looks like a good idea - although we'll also need to run (Scripting) e2e tests for verification. > Prevent deep copy of DataBag into Jython List > - > > Key: PIG-5338 > URL: https://issues.apache.org/jira/browse/PIG-5338 > Project: Pig > Issue Type: Improvement >Reporter: Greg Phillips >Assignee: Greg Phillips >Priority: Major > Attachments: PIG-5338.patch > > > Pig Python UDFs currently perform deep copies on Bags converting them into > Jython PyLists. This can cause Jython UDFs to run out of memory and fail. A > Jython DataBag which extends PyList could allow for iterative access to > DataBag elements, while only performing a deep copy when necessary. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
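[Editor's note] The idea discussed in PIG-5338 — serve bag elements through an iterator and only deep-copy when random access is truly needed — can be sketched in plain Java. This is a hedged illustration only: `LazyBagView` is a hypothetical stand-in, not Pig's real `DataBag` or the Jython bridge classes from the patch.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch of lazy bag access (not Pig's actual API).
final class LazyBagView<T> implements Iterable<T> {
    private final Iterable<T> bag;

    LazyBagView(Iterable<T> bag) { this.bag = bag; }

    // No copy happens here: each call hands back the underlying bag's
    // iterator, so a streaming consumer never materializes the whole
    // bag in memory — the failure mode the JIRA describes.
    @Override
    public Iterator<T> iterator() { return bag.iterator(); }

    // A deep copy is made only when list-style random access is
    // genuinely required by the caller.
    List<T> materialize() {
        List<T> copy = new ArrayList<>();
        for (T element : bag) {
            copy.add(element);
        }
        return copy;
    }
}
```

A UDF that only loops over the bag pays nothing up front; only a caller invoking `materialize()` triggers the copy, which mirrors the "deep copy only when necessary" goal stated in the issue.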
[jira] [Commented] (PIG-5253) Pig Hadoop 3 support
[ https://issues.apache.org/jira/browse/PIG-5253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16334305#comment-16334305 ] Adam Szita commented on PIG-5253: - {quote}If there is no change required for hadoop 2, we can just point it to compile from hadoop 2 shims directory. {quote} Wouldn't that be a source of confusion? Also if we keep the shims layer, what do we do with the maven classifiers in build.xml? I guess we would want to keep that structure as well (for future usage as said before) although I don't think we want to have a -h2 and -h3 jar with the very same content. > Pig Hadoop 3 support > > > Key: PIG-5253 > URL: https://issues.apache.org/jira/browse/PIG-5253 > Project: Pig > Issue Type: Improvement >Reporter: Nandor Kollar >Assignee: Nandor Kollar >Priority: Major > Fix For: 0.18.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PIG-5320) TestCubeOperator#testRollupBasic is flaky on Spark 2.2
[ https://issues.apache.org/jira/browse/PIG-5320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326329#comment-16326329 ] Adam Szita commented on PIG-5320: - +1 on [^PIG-5320_2.patch], and committed to trunk. Thanks Nandor! > TestCubeOperator#testRollupBasic is flaky on Spark 2.2 > -- > > Key: PIG-5320 > URL: https://issues.apache.org/jira/browse/PIG-5320 > Project: Pig > Issue Type: Bug > Components: spark >Reporter: Nandor Kollar >Assignee: Nandor Kollar >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5320_1.patch, PIG-5320_2.patch > > > TestCubeOperator#testRollupBasic occasionally fails with > {code} > org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1002: Unable to > store alias c > at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1779) > at org.apache.pig.PigServer.registerQuery(PigServer.java:708) > at > org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:1110) > at > org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:512) > at > org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230) > at org.apache.pig.PigServer.registerScript(PigServer.java:781) > at org.apache.pig.PigServer.registerScript(PigServer.java:858) > at org.apache.pig.PigServer.registerScript(PigServer.java:821) > at org.apache.pig.test.Util.registerMultiLineQuery(Util.java:972) > at > org.apache.pig.test.TestCubeOperator.testRollupBasic(TestCubeOperator.java:124) > Caused by: org.apache.pig.impl.plan.VisitorException: ERROR 0: fail to get > the rdds of this spark operator: > at > org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.visitSparkOp(JobGraphBuilder.java:115) > at > org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkOperator.visit(SparkOperator.java:140) > at > org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkOperator.visit(SparkOperator.java:37) > at > 
org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:87) > at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:46) > at > org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.launchPig(SparkLauncher.java:237) > at > org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:293) > at org.apache.pig.PigServer.launchPlan(PigServer.java:1475) > at > org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1460) > at org.apache.pig.PigServer.execute(PigServer.java:1449) > at org.apache.pig.PigServer.access$500(PigServer.java:119) > at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1774) > Caused by: java.lang.RuntimeException: Unexpected job execution status RUNNING > at > org.apache.pig.tools.pigstats.spark.SparkStatsUtil.isJobSuccess(SparkStatsUtil.java:138) > at > org.apache.pig.tools.pigstats.spark.SparkPigStats.addJobStats(SparkPigStats.java:75) > at > org.apache.pig.tools.pigstats.spark.SparkStatsUtil.waitForJobAddStats(SparkStatsUtil.java:59) > at > org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.sparkOperToRDD(JobGraphBuilder.java:225) > at > org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.visitSparkOp(JobGraphBuilder.java:112) > {code} > I think the problem is that in JobStatisticCollector#waitForJobToEnd > {{sparkListener.wait()}} is not inside a loop, like suggested in wait's > javadoc: > {code} > * As in the one argument version, interrupts and spurious wakeups are > * possible, and this method should always be used in a loop: > {code} > Thus due to a spurious wakeup, the wait might pass without a notify getting > called. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
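[Editor's note] The root cause identified above — a bare {{wait()}} that a spurious wakeup can slip past — is the textbook case for the wait-in-a-loop idiom quoted from the {{Object.wait}} javadoc. A minimal self-contained sketch of the fixed pattern follows; the class and field names are illustrative, not Pig's actual {{JobStatisticCollector}} internals.

```java
// Illustrative wait-in-a-loop pattern (names are hypothetical, not Pig's).
public final class JobEndWaiter {
    private final Object lock = new Object();
    private boolean jobEnded = false;

    public void waitForJobToEnd() throws InterruptedException {
        synchronized (lock) {
            // Guard against spurious wakeups: re-check the condition
            // every time wait() returns, exactly as the javadoc advises.
            while (!jobEnded) {
                lock.wait();
            }
        }
    }

    // Called by the listener when the job finishes.
    public void onJobEnd() {
        synchronized (lock) {
            jobEnded = true;
            lock.notifyAll();
        }
    }
}
```

Because the condition is a real flag checked in a loop, a notify that fires before the waiter reaches {{wait()}} is not lost, and a wakeup without a notify simply loops back to waiting — either way the "Unexpected job execution status RUNNING" race cannot occur.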
[jira] [Updated] (PIG-5320) TestCubeOperator#testRollupBasic is flaky on Spark 2.2
[ https://issues.apache.org/jira/browse/PIG-5320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita updated PIG-5320: Resolution: Fixed Fix Version/s: 0.18.0 Status: Resolved (was: Patch Available) > TestCubeOperator#testRollupBasic is flaky on Spark 2.2 > -- > > Key: PIG-5320 > URL: https://issues.apache.org/jira/browse/PIG-5320 > Project: Pig > Issue Type: Bug > Components: spark >Reporter: Nandor Kollar >Assignee: Nandor Kollar >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5320_1.patch, PIG-5320_2.patch > > > TestCubeOperator#testRollupBasic occasionally fails with > {code} > org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1002: Unable to > store alias c > at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1779) > at org.apache.pig.PigServer.registerQuery(PigServer.java:708) > at > org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:1110) > at > org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:512) > at > org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230) > at org.apache.pig.PigServer.registerScript(PigServer.java:781) > at org.apache.pig.PigServer.registerScript(PigServer.java:858) > at org.apache.pig.PigServer.registerScript(PigServer.java:821) > at org.apache.pig.test.Util.registerMultiLineQuery(Util.java:972) > at > org.apache.pig.test.TestCubeOperator.testRollupBasic(TestCubeOperator.java:124) > Caused by: org.apache.pig.impl.plan.VisitorException: ERROR 0: fail to get > the rdds of this spark operator: > at > org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.visitSparkOp(JobGraphBuilder.java:115) > at > org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkOperator.visit(SparkOperator.java:140) > at > org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkOperator.visit(SparkOperator.java:37) > at > org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:87) > at 
org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:46) > at > org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.launchPig(SparkLauncher.java:237) > at > org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:293) > at org.apache.pig.PigServer.launchPlan(PigServer.java:1475) > at > org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1460) > at org.apache.pig.PigServer.execute(PigServer.java:1449) > at org.apache.pig.PigServer.access$500(PigServer.java:119) > at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1774) > Caused by: java.lang.RuntimeException: Unexpected job execution status RUNNING > at > org.apache.pig.tools.pigstats.spark.SparkStatsUtil.isJobSuccess(SparkStatsUtil.java:138) > at > org.apache.pig.tools.pigstats.spark.SparkPigStats.addJobStats(SparkPigStats.java:75) > at > org.apache.pig.tools.pigstats.spark.SparkStatsUtil.waitForJobAddStats(SparkStatsUtil.java:59) > at > org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.sparkOperToRDD(JobGraphBuilder.java:225) > at > org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.visitSparkOp(JobGraphBuilder.java:112) > {code} > I think the problem is that in JobStatisticCollector#waitForJobToEnd > {{sparkListener.wait()}} is not inside a loop, like suggested in wait's > javadoc: > {code} > * As in the one argument version, interrupts and spurious wakeups are > * possible, and this method should always be used in a loop: > {code} > Thus due to a spurious wakeup, the wait might pass without a notify getting > called. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (PIG-5325) Schema disambiguation can't be turned off for nested schemas
[ https://issues.apache.org/jira/browse/PIG-5325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita updated PIG-5325: Resolution: Fixed Fix Version/s: 0.18.0 Status: Resolved (was: Patch Available) > Schema disambiguation can't be turned off for nested schemas > > > Key: PIG-5325 > URL: https://issues.apache.org/jira/browse/PIG-5325 > Project: Pig > Issue Type: Bug >Reporter: Adam Szita >Assignee: Adam Szita > Fix For: 0.18.0 > > Attachments: PIG-5325.0.patch > > > PIG-5110 introduced the feature to turn off automatic schema field alias > disambiguation, removing parent alias and the ':' char. It seems like this > doesn't work for nested schemas. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIG-5325) Schema disambiguation can't be turned off for nested schemas
[ https://issues.apache.org/jira/browse/PIG-5325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16318178#comment-16318178 ] Adam Szita commented on PIG-5325: - Patch committed to trunk, thanks for the review Rohini! > Schema disambiguation can't be turned off for nested schemas > > > Key: PIG-5325 > URL: https://issues.apache.org/jira/browse/PIG-5325 > Project: Pig > Issue Type: Bug >Reporter: Adam Szita >Assignee: Adam Szita > Fix For: 0.18.0 > > Attachments: PIG-5325.0.patch > > > PIG-5110 introduced the feature to turn off automatic schema field alias > disambiguation, removing parent alias and the ':' char. It seems like this > doesn't work for nested schemas. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (PIG-5325) Schema disambiguation can't be turned off for nested schemas
[ https://issues.apache.org/jira/browse/PIG-5325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita updated PIG-5325: Status: Patch Available (was: In Progress) > Schema disambiguation can't be turned off for nested schemas > > > Key: PIG-5325 > URL: https://issues.apache.org/jira/browse/PIG-5325 > Project: Pig > Issue Type: Bug >Reporter: Adam Szita >Assignee: Adam Szita > Attachments: PIG-5325.0.patch > > > PIG-5110 introduced the feature to turn off automatic schema field alias > disambiguation, removing parent alias and the ':' char. It seems like this > doesn't work for nested schemas. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIG-5325) Schema disambiguation can't be turned off for nested schemas
[ https://issues.apache.org/jira/browse/PIG-5325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16309875#comment-16309875 ] Adam Szita commented on PIG-5325: - Attached [^PIG-5325.0.patch] for fixing this. [~mikebush], [~rohini] can you take a look please? > Schema disambiguation can't be turned off for nested schemas > > > Key: PIG-5325 > URL: https://issues.apache.org/jira/browse/PIG-5325 > Project: Pig > Issue Type: Bug >Reporter: Adam Szita >Assignee: Adam Szita > Attachments: PIG-5325.0.patch > > > PIG-5110 introduced the feature to turn off automatic schema field alias > disambiguation, removing parent alias and the ':' char. It seems like this > doesn't work for nested schemas. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (PIG-5325) Schema disambiguation can't be turned off for nested schemas
[ https://issues.apache.org/jira/browse/PIG-5325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita updated PIG-5325: Attachment: PIG-5325.0.patch > Schema disambiguation can't be turned off for nested schemas > > > Key: PIG-5325 > URL: https://issues.apache.org/jira/browse/PIG-5325 > Project: Pig > Issue Type: Bug >Reporter: Adam Szita >Assignee: Adam Szita > Attachments: PIG-5325.0.patch > > > PIG-5110 introduced the feature to turn off automatic schema field alias > disambiguation, removing parent alias and the ':' char. It seems like this > doesn't work for nested schemas. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIG-5110) Removing schema alias and :: coming from parent relation
[ https://issues.apache.org/jira/browse/PIG-5110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16309727#comment-16309727 ] Adam Szita commented on PIG-5110: - Thanks for catching this [~mikebush], I'll address this problem in PIG-5325. > Removing schema alias and :: coming from parent relation > > > Key: PIG-5110 > URL: https://issues.apache.org/jira/browse/PIG-5110 > Project: Pig > Issue Type: New Feature >Reporter: Adam Szita >Assignee: Adam Szita > Fix For: 0.17.0 > > Attachments: PIG-5110.0.patch, PIG-5110.1.patch, PIG-5110.2.patch > > > Customers have asked for a feature to get rid of the schema alias prefixes. > CROSS, JOIN, FLATTEN, etc.. prepend the field name with the parent field > alias and :: > I would like to find a way to disable this feature. (The burden of making > sure not to have duplicate aliases - and hence the appropriate > FrontendException getting thrown - is on the user) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Work started] (PIG-5325) Schema disambiguation can't be turned off for nested schemas
[ https://issues.apache.org/jira/browse/PIG-5325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on PIG-5325 started by Adam Szita. --- > Schema disambiguation can't be turned off for nested schemas > > > Key: PIG-5325 > URL: https://issues.apache.org/jira/browse/PIG-5325 > Project: Pig > Issue Type: Bug >Reporter: Adam Szita >Assignee: Adam Szita > > PIG-5110 introduced the feature to turn off automatic schema field alias > disambiguation, removing parent alias and the ':' char. It seems like this > doesn't work for nested schemas. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (PIG-5325) Schema disambiguation can't be turned off for nested schemas
Adam Szita created PIG-5325: --- Summary: Schema disambiguation can't be turned off for nested schemas Key: PIG-5325 URL: https://issues.apache.org/jira/browse/PIG-5325 Project: Pig Issue Type: Bug Reporter: Adam Szita Assignee: Adam Szita PIG-5110 introduced the feature to turn off automatic schema field alias disambiguation, removing parent alias and the ':' char. It seems like this doesn't work for nested schemas. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIG-4764) Make Pig work with Hive 2.0
[ https://issues.apache.org/jira/browse/PIG-4764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16309473#comment-16309473 ] Adam Szita commented on PIG-4764: - I think this would be handy to have in 0.18 > Make Pig work with Hive 2.0 > --- > > Key: PIG-4764 > URL: https://issues.apache.org/jira/browse/PIG-4764 > Project: Pig > Issue Type: Improvement > Components: impl >Reporter: Daniel Dai >Assignee: Daniel Dai > Fix For: 0.18.0 > > Attachments: PIG-4764-0.patch, PIG-4764-1.patch, PIG-4764-2.patch, > PIG-4764-3.patch, PIG-4764-4.patch > > > There are a lot of changes especially around ORC in Hive 2.0. We need to make > Pig work with it. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIG-5320) TestCubeOperator#testRollupBasic is flaky on Spark 2.2
[ https://issues.apache.org/jira/browse/PIG-5320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16290835#comment-16290835 ] Adam Szita commented on PIG-5320: - [~nkollar] patch looks good, but can you elaborate on your reason for changing hash-based implementations to tree-based ones for the sets and maps used in this class? I would think that the number of jobs here would very rarely be high (if I think that most Pig jobs are started in batch mode with a script specified, so the only jobs here are what that one script generates) > TestCubeOperator#testRollupBasic is flaky on Spark 2.2 > -- > > Key: PIG-5320 > URL: https://issues.apache.org/jira/browse/PIG-5320 > Project: Pig > Issue Type: Bug > Components: spark >Reporter: Nandor Kollar >Assignee: Nandor Kollar > Attachments: PIG-5320_1.patch > > > TestCubeOperator#testRollupBasic occasionally fails with > {code} > org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1002: Unable to > store alias c > at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1779) > at org.apache.pig.PigServer.registerQuery(PigServer.java:708) > at > org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:1110) > at > org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:512) > at > org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230) > at org.apache.pig.PigServer.registerScript(PigServer.java:781) > at org.apache.pig.PigServer.registerScript(PigServer.java:858) > at org.apache.pig.PigServer.registerScript(PigServer.java:821) > at org.apache.pig.test.Util.registerMultiLineQuery(Util.java:972) > at > org.apache.pig.test.TestCubeOperator.testRollupBasic(TestCubeOperator.java:124) > Caused by: org.apache.pig.impl.plan.VisitorException: ERROR 0: fail to get > the rdds of this spark operator: > at > org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.visitSparkOp(JobGraphBuilder.java:115) > at > 
org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkOperator.visit(SparkOperator.java:140) > at > org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkOperator.visit(SparkOperator.java:37) > at > org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:87) > at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:46) > at > org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.launchPig(SparkLauncher.java:237) > at > org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:293) > at org.apache.pig.PigServer.launchPlan(PigServer.java:1475) > at > org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1460) > at org.apache.pig.PigServer.execute(PigServer.java:1449) > at org.apache.pig.PigServer.access$500(PigServer.java:119) > at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1774) > Caused by: java.lang.RuntimeException: Unexpected job execution status RUNNING > at > org.apache.pig.tools.pigstats.spark.SparkStatsUtil.isJobSuccess(SparkStatsUtil.java:138) > at > org.apache.pig.tools.pigstats.spark.SparkPigStats.addJobStats(SparkPigStats.java:75) > at > org.apache.pig.tools.pigstats.spark.SparkStatsUtil.waitForJobAddStats(SparkStatsUtil.java:59) > at > org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.sparkOperToRDD(JobGraphBuilder.java:225) > at > org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.visitSparkOp(JobGraphBuilder.java:112) > {code} > I think the problem is that in JobStatisticCollector#waitForJobToEnd > {{sparkListener.wait()}} is not inside a loop, like suggested in wait's > javadoc: > {code} > * As in the one argument version, interrupts and spurious wakeups are > * possible, and this method should always be used in a loop: > {code} > Thus due to a spurious wakeup, the wait might pass without a notify getting > called. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
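The wait-in-a-loop fix suggested above can be sketched as follows. This is a minimal illustration with hypothetical names (`JobEndWaiter`, `finishedJobs`), not Pig's actual `JobStatisticCollector` code:

```java
import java.util.HashSet;
import java.util.Set;

// Sketch of guarding Object.wait() with a condition loop, as wait()'s
// javadoc recommends; names here are illustrative, not Pig's real fields.
public class JobEndWaiter {
    private final Object lock = new Object();
    private final Set<Integer> finishedJobs = new HashSet<>();

    public void waitForJobToEnd(int jobId) throws InterruptedException {
        synchronized (lock) {
            // Re-check the condition after every wakeup: wait() can return
            // spuriously, or a notify may have been meant for another job.
            while (!finishedJobs.contains(jobId)) {
                lock.wait();
            }
        }
    }

    public void jobFinished(int jobId) {
        synchronized (lock) {
            finishedJobs.add(jobId);
            lock.notifyAll();
        }
    }
}
```

With a bare `lock.wait()` outside the loop, a spurious wakeup would let `waitForJobToEnd` return while the job is still RUNNING, which matches the observed "Unexpected job execution status RUNNING" failure.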
[jira] [Updated] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2
[ https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita updated PIG-5318: Resolution: Fixed Fix Version/s: 0.18.0 Status: Resolved (was: Patch Available) > Unit test failures on Pig on Spark with Spark 2.2 > - > > Key: PIG-5318 > URL: https://issues.apache.org/jira/browse/PIG-5318 > Project: Pig > Issue Type: Bug > Components: spark >Reporter: Nandor Kollar >Assignee: Nandor Kollar > Fix For: 0.18.0 > > Attachments: PIG-5318_1.patch, PIG-5318_2.patch, PIG-5318_3.patch, > PIG-5318_4.patch, PIG-5318_5.patch, PIG-5318_6.patch > > > There are several failing cases when executing the unit tests with Spark 2.2: > {code} > org.apache.pig.test.TestAssert#testNegativeWithoutFetch > org.apache.pig.test.TestAssert#testNegative > org.apache.pig.test.TestEvalPipeline2#testNonStandardDataWithoutFetch > org.apache.pig.test.TestScalarAliases#testScalarErrMultipleRowsInInput > org.apache.pig.test.TestStore#testCleanupOnFailureMultiStore > org.apache.pig.test.TestStoreInstances#testBackendStoreCommunication > org.apache.pig.test.TestStoreLocal#testCleanupOnFailureMultiStore > {code} > All of these are related to fixes/changes in Spark. > TestAssert, TestScalarAliases and TestEvalPipeline2 failures could be fixed > by asserting on the message of the exception's root cause, looks like on > Spark 2.2 the exception is wrapped into an additional layer. > TestStore and TestStoreLocal failure are also a test related problems: looks > like SPARK-7953 is fixed in Spark 2.2 > The root cause of TestStoreInstances is yet to be found out. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
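The "assert on the message of the exception's root cause" approach mentioned for the TestAssert, TestScalarAliases and TestEvalPipeline2 fixes can be sketched with a small helper. This is a hypothetical illustration, not the code from the patch:

```java
public class RootCause {
    // Walk the cause chain to the innermost throwable, so a test can assert
    // on the original message even when Spark 2.2 adds an extra wrapper layer.
    public static Throwable of(Throwable t) {
        while (t.getCause() != null && t.getCause() != t) {
            t = t.getCause();
        }
        return t;
    }
}
```

Asserting on `RootCause.of(e).getMessage()` instead of `e.getMessage()` keeps the test independent of how many wrapping layers the execution engine adds.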
[jira] [Commented] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2
[ https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16289058#comment-16289058 ] Adam Szita commented on PIG-5318: - [~nkollar], +1 for [^PIG-5318_6.patch], committed to trunk. I think we should also upgrade the Spark 2 minor version in Pig on Spark to 2.2. We don't want to maintain 1.6.1, 2.1.1, and 2.2.0 support at the same time; rather, we should keep one minor version per major. Created PIG-5321 to track the upgrade. > Unit test failures on Pig on Spark with Spark 2.2 > - > > Key: PIG-5318 > URL: https://issues.apache.org/jira/browse/PIG-5318 > Project: Pig > Issue Type: Bug > Components: spark >Reporter: Nandor Kollar >Assignee: Nandor Kollar > Attachments: PIG-5318_1.patch, PIG-5318_2.patch, PIG-5318_3.patch, > PIG-5318_4.patch, PIG-5318_5.patch, PIG-5318_6.patch > > > There are several failing cases when executing the unit tests with Spark 2.2: > {code} > org.apache.pig.test.TestAssert#testNegativeWithoutFetch > org.apache.pig.test.TestAssert#testNegative > org.apache.pig.test.TestEvalPipeline2#testNonStandardDataWithoutFetch > org.apache.pig.test.TestScalarAliases#testScalarErrMultipleRowsInInput > org.apache.pig.test.TestStore#testCleanupOnFailureMultiStore > org.apache.pig.test.TestStoreInstances#testBackendStoreCommunication > org.apache.pig.test.TestStoreLocal#testCleanupOnFailureMultiStore > {code} > All of these are related to fixes/changes in Spark. > TestAssert, TestScalarAliases and TestEvalPipeline2 failures could be fixed > by asserting on the message of the exception's root cause, looks like on > Spark 2.2 the exception is wrapped into an additional layer. > TestStore and TestStoreLocal failure are also a test related problems: looks > like SPARK-7953 is fixed in Spark 2.2 > The root cause of TestStoreInstances is yet to be found out. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (PIG-5321) Upgrade Spark 2 version to 2.2.0 for Pig on Spark
Adam Szita created PIG-5321: --- Summary: Upgrade Spark 2 version to 2.2.0 for Pig on Spark Key: PIG-5321 URL: https://issues.apache.org/jira/browse/PIG-5321 Project: Pig Issue Type: Improvement Components: spark Reporter: Adam Szita Right now we maintain support for 2 versions of Spark for PoS jobs: spark1.version=1.6.1 spark2.version=2.1.1 I believe we should move the latter forward to 2.2.0. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIG-5310) MergeJoin throwing NullPointer Exception
[ https://issues.apache.org/jira/browse/PIG-5310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16270572#comment-16270572 ] Adam Szita commented on PIG-5310: - +1 on [^PIG-5310-2.patch] > MergeJoin throwing NullPointer Exception > > > Key: PIG-5310 > URL: https://issues.apache.org/jira/browse/PIG-5310 > Project: Pig > Issue Type: Bug >Reporter: Satish Subhashrao Saley >Assignee: Satish Subhashrao Saley > Attachments: PIG-5310-1.patch, PIG-5310-2.patch > > > Merge join throws NullPointerException if left input's first key doesn't > exist in right input and if it is smaller than first key of right input. > For ex > |left|right| > |1|3| > |1|5| > |1| | > Error we get - > {code} > ERROR 2998: Unhandled internal error. Vertex failed, vertexName=scope-16, > vertexId=vertex_1509400259446_0001_1_02, diagnostics=[Task failed, > taskId=task_1509400259446_0001_1_02_00, diagnostics=[TaskAttempt 0 > failed, info=[Error: Error while running task ( failure ) : > attempt_1509400259446_0001_1_02_00_0:java.lang.NullPointerException > at java.lang.Integer.compareTo(Integer.java:1216) > at java.lang.Integer.compareTo(Integer.java:52) > at > org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POMergeJoin.getNextTuple(POMergeJoin.java:525) > at > org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:305) > at > org.apache.pig.backend.hadoop.executionengine.tez.plan.operator.POStoreTez.getNextTuple(POStoreTez.java:123) > at > org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.runPipeline(PigProcessor.java:416) > at > org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.run(PigProcessor.java:281) > at > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73) > at > 
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1945) > at > org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61) > at > org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37) > at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {code} > Here, the key used in join is an integer. Integer.compareTo(other) method > throws null pointer exception if comparison is made against null. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
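The failure mode described above, `Integer.compareTo` against a missing (null) key, can be reproduced and guarded in a few lines. The `nullsFirst` comparator below is one illustrative guard for the sketch, not necessarily what the PIG-5310 patch does:

```java
import java.util.Comparator;

public class NullSafeKeyCompare {
    // Orders null keys first instead of throwing; a plain
    // Integer.compareTo(null) raises NullPointerException as in the trace above.
    public static final Comparator<Integer> CMP =
            Comparator.nullsFirst(Comparator.naturalOrder());

    public static int compareKeys(Integer left, Integer right) {
        return CMP.compare(left, right);
    }
}
```

A null-aware comparison lets the merge join treat an absent right-side key as "smaller", so the left input can be advanced instead of crashing.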
[jira] [Updated] (PIG-5316) Initialize mapred.task.id property for PoS jobs
[ https://issues.apache.org/jira/browse/PIG-5316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita updated PIG-5316: Resolution: Fixed Status: Resolved (was: Patch Available) > Initialize mapred.task.id property for PoS jobs > --- > > Key: PIG-5316 > URL: https://issues.apache.org/jira/browse/PIG-5316 > Project: Pig > Issue Type: Improvement > Components: spark >Reporter: Adam Szita >Assignee: Nandor Kollar > Fix For: 0.18.0 > > Attachments: PIG-5316_1.patch, PIG-5316_2.patch > > > Some downstream systems may require the presence of {{mapred.task.id}} > property (e.g. HCatalog). This is currently not set when Pig On Spark jobs > are started. Let's initialise it. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIG-5316) Initialize mapred.task.id property for PoS jobs
[ https://issues.apache.org/jira/browse/PIG-5316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268913#comment-16268913 ] Adam Szita commented on PIG-5316: - Good catch Nandor, fix committed! > Initialize mapred.task.id property for PoS jobs > --- > > Key: PIG-5316 > URL: https://issues.apache.org/jira/browse/PIG-5316 > Project: Pig > Issue Type: Improvement > Components: spark >Reporter: Adam Szita >Assignee: Nandor Kollar > Fix For: 0.18.0 > > Attachments: PIG-5316_1.patch, PIG-5316_2.patch > > > Some downstream systems may require the presence of {{mapred.task.id}} > property (e.g. HCatalog). This is currently not set when Pig On Spark jobs > are started. Let's initialise it. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIG-5316) Initialize mapred.task.id property for PoS jobs
[ https://issues.apache.org/jira/browse/PIG-5316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268704#comment-16268704 ] Adam Szita commented on PIG-5316: - Committed to trunk, thanks Nandor and Xuefu! > Initialize mapred.task.id property for PoS jobs > --- > > Key: PIG-5316 > URL: https://issues.apache.org/jira/browse/PIG-5316 > Project: Pig > Issue Type: Improvement > Components: spark >Reporter: Adam Szita >Assignee: Nandor Kollar > Fix For: 0.18.0 > > Attachments: PIG-5316_1.patch > > > Some downstream systems may require the presence of {{mapred.task.id}} > property (e.g. HCatalog). This is currently not set when Pig On Spark jobs > are started. Let's initialise it. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (PIG-5316) Initialize mapred.task.id property for PoS jobs
[ https://issues.apache.org/jira/browse/PIG-5316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita updated PIG-5316: Resolution: Fixed Fix Version/s: 0.18.0 Status: Resolved (was: Patch Available) > Initialize mapred.task.id property for PoS jobs > --- > > Key: PIG-5316 > URL: https://issues.apache.org/jira/browse/PIG-5316 > Project: Pig > Issue Type: Improvement > Components: spark >Reporter: Adam Szita >Assignee: Nandor Kollar > Fix For: 0.18.0 > > Attachments: PIG-5316_1.patch > > > Some downstream systems may require the presence of {{mapred.task.id}} > property (e.g. HCatalog). This is currently not set when Pig On Spark jobs > are started. Let's initialise it. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIG-5316) Initialize mapred.task.id property for PoS jobs
[ https://issues.apache.org/jira/browse/PIG-5316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16263989#comment-16263989 ] Adam Szita commented on PIG-5316: - [~nkollar] +1 on the patch, unless objections by [~xuefuz] I'll commit tomorrow > Initialize mapred.task.id property for PoS jobs > --- > > Key: PIG-5316 > URL: https://issues.apache.org/jira/browse/PIG-5316 > Project: Pig > Issue Type: Improvement > Components: spark >Reporter: Adam Szita >Assignee: Nandor Kollar > Attachments: PIG-5316_1.patch > > > Some downstream systems may require the presence of {{mapred.task.id}} > property (e.g. HCatalog). This is currently not set when Pig On Spark jobs > are started. Let's initialise it. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (PIG-5316) Initialize mapred.task.id property for PoS jobs
Adam Szita created PIG-5316: --- Summary: Initialize mapred.task.id property for PoS jobs Key: PIG-5316 URL: https://issues.apache.org/jira/browse/PIG-5316 Project: Pig Issue Type: Improvement Components: spark Reporter: Adam Szita Assignee: Nandor Kollar Some downstream systems may require the presence of {{mapred.task.id}} property (e.g. HCatalog). This is currently not set when Pig On Spark jobs are started. Let's initialise it. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
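For illustration, a synthetic MapReduce-style attempt id could be generated and set on the job configuration like this. The id format and the helper names are assumptions for the sketch, not the committed patch:

```java
import java.util.Properties;

public class TaskIdInit {
    // Builds an id in the familiar "attempt_<timestamp>_<job>_m_<task>_<attempt>"
    // shape so consumers expecting mapred.task.id (e.g. HCatalog) find a value.
    public static String syntheticAttemptId(long jobTimestamp, int task) {
        return String.format("attempt_%d_0001_m_%06d_0", jobTimestamp, task);
    }

    public static void initialize(Properties jobConf, long jobTimestamp) {
        jobConf.setProperty("mapred.task.id", syntheticAttemptId(jobTimestamp, 0));
    }
}
```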
[jira] [Commented] (PIG-5302) Remove HttpClient dependency
[ https://issues.apache.org/jira/browse/PIG-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16251104#comment-16251104 ] Adam Szita commented on PIG-5302: - Committed to trunk. Thanks for the patch Nandor, and for the review Rohini. > Remove HttpClient dependency > > > Key: PIG-5302 > URL: https://issues.apache.org/jira/browse/PIG-5302 > Project: Pig > Issue Type: Improvement >Reporter: Nandor Kollar >Assignee: Nandor Kollar > Fix For: 0.18.0 > > Attachments: PIG-5302_1.patch, PIG-5302_2.patch, PIG-5302_3.patch, > PIG-5302_4.patch, ivy-report.css, org.apache.pig-pig-compile.html > > > Pig depends on Apache Commons HttpClient 3.1 which is an old version with > security problems > ([CVE-2015-5262|https://cve.mitre.org/cgi-bin/cvename.cgi?name=%20CVE-2015-5262]) > Also, Pig depends on Apache HttpComponents (it also needs update to newer > version due to similar reason), which is the successor of HttpClient, thus we > should remove HttpClient dependency, and update HttpComponents to 4.4+ -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (PIG-5302) Remove HttpClient dependency
[ https://issues.apache.org/jira/browse/PIG-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita updated PIG-5302: Resolution: Fixed Fix Version/s: 0.18.0 Status: Resolved (was: Patch Available) > Remove HttpClient dependency > > > Key: PIG-5302 > URL: https://issues.apache.org/jira/browse/PIG-5302 > Project: Pig > Issue Type: Improvement >Reporter: Nandor Kollar >Assignee: Nandor Kollar > Fix For: 0.18.0 > > Attachments: PIG-5302_1.patch, PIG-5302_2.patch, PIG-5302_3.patch, > PIG-5302_4.patch, ivy-report.css, org.apache.pig-pig-compile.html > > > Pig depends on Apache Commons HttpClient 3.1 which is an old version with > security problems > ([CVE-2015-5262|https://cve.mitre.org/cgi-bin/cvename.cgi?name=%20CVE-2015-5262]) > Also, Pig depends on Apache HttpComponents (it also needs update to newer > version due to similar reason), which is the successor of HttpClient, thus we > should remove HttpClient dependency, and update HttpComponents to 4.4+ -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (PIG-5302) Remove HttpClient dependency
[ https://issues.apache.org/jira/browse/PIG-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita updated PIG-5302: Issue Type: Improvement (was: Bug) > Remove HttpClient dependency > > > Key: PIG-5302 > URL: https://issues.apache.org/jira/browse/PIG-5302 > Project: Pig > Issue Type: Improvement >Reporter: Nandor Kollar >Assignee: Nandor Kollar > Attachments: PIG-5302_1.patch, PIG-5302_2.patch, PIG-5302_3.patch, > PIG-5302_4.patch, ivy-report.css, org.apache.pig-pig-compile.html > > > Pig depends on Apache Commons HttpClient 3.1 which is an old version with > security problems > ([CVE-2015-5262|https://cve.mitre.org/cgi-bin/cvename.cgi?name=%20CVE-2015-5262]) > Also, Pig depends on Apache HttpComponents (it also needs update to newer > version due to similar reason), which is the successor of HttpClient, thus we > should remove HttpClient dependency, and update HttpComponents to 4.4+ -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIG-5302) Remove HttpClient dependency
[ https://issues.apache.org/jira/browse/PIG-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16249526#comment-16249526 ] Adam Szita commented on PIG-5302: - +1 on latest patch. I successfully ran test-commit for verification. > Remove HttpClient dependency > > > Key: PIG-5302 > URL: https://issues.apache.org/jira/browse/PIG-5302 > Project: Pig > Issue Type: Bug >Reporter: Nandor Kollar >Assignee: Nandor Kollar > Attachments: PIG-5302_1.patch, PIG-5302_2.patch, PIG-5302_3.patch, > PIG-5302_4.patch, ivy-report.css, org.apache.pig-pig-compile.html > > > Pig depends on Apache Commons HttpClient 3.1 which is an old version with > security problems > ([CVE-2015-5262|https://cve.mitre.org/cgi-bin/cvename.cgi?name=%20CVE-2015-5262]) > Also, Pig depends on Apache HttpComponents (it also needs update to newer > version due to similar reason), which is the successor of HttpClient, thus we > should remove HttpClient dependency, and update HttpComponents to 4.4+ -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIG-5302) Remove HttpClient dependency
[ https://issues.apache.org/jira/browse/PIG-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16245403#comment-16245403 ] Adam Szita commented on PIG-5302: - There seem to be a lot of unused / old copy-pasted entries in our ivy file. Thanks for your efforts to clean this up, [^PIG-5302_3.patch] looks good to me; +1 provided all unit tests pass. > Remove HttpClient dependency > > > Key: PIG-5302 > URL: https://issues.apache.org/jira/browse/PIG-5302 > Project: Pig > Issue Type: Bug >Reporter: Nandor Kollar >Assignee: Nandor Kollar > Attachments: PIG-5302_1.patch, PIG-5302_2.patch, PIG-5302_3.patch, > ivy-report.css, org.apache.pig-pig-compile.html > > > Pig depends on Apache Commons HttpClient 3.1 which is an old version with > security problems > ([CVE-2015-5262|https://cve.mitre.org/cgi-bin/cvename.cgi?name=%20CVE-2015-5262]) > Also, Pig depends on Apache HttpComponents (it also needs update to newer > version due to similar reason), which is the successor of HttpClient, thus we > should remove HttpClient dependency, and update HttpComponents to 4.4+ -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (PIG-5305) Enable yarn-client mode execution of tests in Spark (1) mode
[ https://issues.apache.org/jira/browse/PIG-5305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita updated PIG-5305: Resolution: Fixed Fix Version/s: 0.18.0 Status: Resolved (was: Patch Available) > Enable yarn-client mode execution of tests in Spark (1) mode > > > Key: PIG-5305 > URL: https://issues.apache.org/jira/browse/PIG-5305 > Project: Pig > Issue Type: Sub-task > Components: spark >Reporter: Adam Szita >Assignee: Adam Szita > Fix For: 0.18.0 > > Attachments: PIG-5305.0.patch, PIG-5305.1.patch, PIG-5305.2.patch > > > See parent jira (PIG-5305) for problem description -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIG-5305) Enable yarn-client mode execution of tests in Spark (1) mode
[ https://issues.apache.org/jira/browse/PIG-5305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16194512#comment-16194512 ] Adam Szita commented on PIG-5305: - Thanks for the review [~kellyzly], latest patch is now committed to trunk. > Enable yarn-client mode execution of tests in Spark (1) mode > > > Key: PIG-5305 > URL: https://issues.apache.org/jira/browse/PIG-5305 > Project: Pig > Issue Type: Sub-task > Components: spark >Reporter: Adam Szita >Assignee: Adam Szita > Attachments: PIG-5305.0.patch, PIG-5305.1.patch, PIG-5305.2.patch > > > See parent jira (PIG-5305) for problem description -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIG-3864) ToDate(userstring, format, timezone) computes DateTime with strange handling of Daylight Saving Time with location based timezones
[ https://issues.apache.org/jira/browse/PIG-3864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16189577#comment-16189577 ] Adam Szita commented on PIG-3864: - +1 on [^PIG-3864-1.patch], it is a good fix [~daijy] > ToDate(userstring, format, timezone) computes DateTime with strange handling > of Daylight Saving Time with location based timezones > -- > > Key: PIG-3864 > URL: https://issues.apache.org/jira/browse/PIG-3864 > Project: Pig > Issue Type: Bug >Affects Versions: 0.12.0, 0.11.1 >Reporter: Frederic Schmaljohann >Assignee: Daniel Dai > Fix For: 0.18.0 > > Attachments: PIG-3864-1.patch > > > When using ToDate with a location based timezone (e.g. "Europe/Berlin") the > handling of the timezone offset is based on whether the timezone is currently > in daylight saving and not based on whether the timestamp is in daylight > saving time or not. > Example: > {noformat} > B = FOREACH A GENERATE ToDate('2014-02-02 18:00:00.000Z', '-MM-dd > HH:mm:ss.SSSZ', 'Europe/Berlin') AS Timestamp; > {noformat} > This yields > {noformat}2014-02-02 20:00:00.000+02{noformat} > when called during daylight saving in Europe/Berlin although I would expect > {noformat}2014-02-02 19:00:00.000+01{noformat} > During standard time In Europe/Berlin, the above call yields > {noformat}2014-02-02 19:00:00.000+01{noformat} > In Europe/Berlin DST started on March 30th, 2014. > This seems pretty strange to me. If it is on purpose it should at least be > noted in the documentation. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
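The expected behavior, picking the offset in force at the parsed instant rather than at call time, can be checked with the JDK's java.time. This is shown purely as an illustration of instant-based offset selection, not the Joda-Time API Pig actually uses:

```java
import java.time.OffsetDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

public class BerlinOffsetDemo {
    // Parses a UTC timestamp and applies Europe/Berlin using the zone rules
    // in force at *that instant*, so February 2014 gets +01:00 regardless of
    // whether the machine is currently in daylight saving time.
    public static ZonedDateTime toBerlin(String utcTimestamp) {
        DateTimeFormatter fmt =
                DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSSX");
        return OffsetDateTime.parse(utcTimestamp, fmt)
                .atZoneSameInstant(ZoneId.of("Europe/Berlin"));
    }
}
```

Here `toBerlin("2014-02-02 18:00:00.000Z")` yields 19:00 at +01:00, matching the expectation in the ticket, while a July timestamp correctly gets +02:00.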
[jira] [Commented] (PIG-5302) Remove HttpClient dependency
[ https://issues.apache.org/jira/browse/PIG-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16189514#comment-16189514 ] Adam Szita commented on PIG-5302: - [~nkollar] do all unit tests pass with this change? > Remove HttpClient dependency > > > Key: PIG-5302 > URL: https://issues.apache.org/jira/browse/PIG-5302 > Project: Pig > Issue Type: Bug >Reporter: Nandor Kollar >Assignee: Nandor Kollar > Attachments: PIG-5302_1.patch, PIG-5302_2.patch > > > Pig depends on Apache Commons HttpClient 3.1 which is an old version with > security problems > ([CVE-2015-5262|https://cve.mitre.org/cgi-bin/cvename.cgi?name=%20CVE-2015-5262]) > Also, Pig depends on Apache HttpComponents (it also needs update to newer > version due to similar reason), which is the successor of HttpClient, thus we > should remove HttpClient dependency, and update HttpComponents to 4.4+ -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIG-5305) Enable yarn-client mode execution of tests in Spark (1) mode
[ https://issues.apache.org/jira/browse/PIG-5305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16176186#comment-16176186 ] Adam Szita commented on PIG-5305: - [~kellyzly] do you think this is ready for commit now? > Enable yarn-client mode execution of tests in Spark (1) mode > > > Key: PIG-5305 > URL: https://issues.apache.org/jira/browse/PIG-5305 > Project: Pig > Issue Type: Sub-task > Components: spark >Reporter: Adam Szita >Assignee: Adam Szita > Attachments: PIG-5305.0.patch, PIG-5305.1.patch, PIG-5305.2.patch > > > See parent jira (PIG-5305) for problem description -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIG-5305) Enable yarn-client mode execution of tests in Spark (1) mode
[ https://issues.apache.org/jira/browse/PIG-5305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16174837#comment-16174837 ] Adam Szita commented on PIG-5305: - [~kellyzly] yes, {{src.exclude.dir}} was probably just left there, and had no use since the removal of Hadoop 1 support. Then Spark 2 support came with PIG-5157, and as you correctly point out, resetting src.exclude.dir does influence the {{jar}} target. The reason we didn't see this before is that nobody used the {{test-tez}} target; in the Apache Jenkins job we use {{test-core-mrtez}}, which runs all MR and then all Tez unit tests. > Enable yarn-client mode execution of tests in Spark (1) mode > > > Key: PIG-5305 > URL: https://issues.apache.org/jira/browse/PIG-5305 > Project: Pig > Issue Type: Sub-task > Components: spark >Reporter: Adam Szita >Assignee: Adam Szita > Attachments: PIG-5305.0.patch, PIG-5305.1.patch, PIG-5305.2.patch > > > See parent jira (PIG-5305) for problem description -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (PIG-5298) Verify if org.mortbay.jetty is removable
[ https://issues.apache.org/jira/browse/PIG-5298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita updated PIG-5298: Resolution: Fixed Fix Version/s: 0.18.0 Status: Resolved (was: Patch Available) > Verify if org.mortbay.jetty is removable > > > Key: PIG-5298 > URL: https://issues.apache.org/jira/browse/PIG-5298 > Project: Pig > Issue Type: Improvement >Affects Versions: 0.17.0 >Reporter: Adam Szita >Assignee: Nandor Kollar > Fix For: 0.18.0 > > Attachments: PIG-5298_1.patch, PIG-5298_2.patch, PIG-5298_3.patch > > > Although we pull in jetty libraries in ivy Pig does not depend on > org.mortbay.jetty explicitly. The only exception I see is in Piggybank where > I think this can be swapped by javax.el-api and log4j. > We should investigate (check build, run unit tests across all exec modes) and > remove if it turns out to be unnecessary. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIG-5298) Verify if org.mortbay.jetty is removable
[ https://issues.apache.org/jira/browse/PIG-5298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16174768#comment-16174768 ] Adam Szita commented on PIG-5298: - +1, [^PIG-5298_3.patch] committed to trunk. Thanks for taking care of this Nandor > Verify if org.mortbay.jetty is removable > > > Key: PIG-5298 > URL: https://issues.apache.org/jira/browse/PIG-5298 > Project: Pig > Issue Type: Improvement >Affects Versions: 0.17.0 >Reporter: Adam Szita >Assignee: Nandor Kollar > Attachments: PIG-5298_1.patch, PIG-5298_2.patch, PIG-5298_3.patch > > > Although we pull in jetty libraries in ivy Pig does not depend on > org.mortbay.jetty explicitly. The only exception I see is in Piggybank where > I think this can be swapped by javax.el-api and log4j. > We should investigate (check build, run unit tests across all exec modes) and > remove if it turns out to be unnecessary. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIG-5305) Enable yarn-client mode execution of tests in Spark (1) mode
[ https://issues.apache.org/jira/browse/PIG-5305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16171478#comment-16171478 ] Adam Szita commented on PIG-5305: - [~kellyzly] 1: removed the dependency from test-tez. I also checked: test-tez had not been running properly since the Spark 2 support commit, because {{setTezEnv}} was clearing the excluded sources property. I fixed this in my latest patch as well. 2: There were quite a few failures at first; that's why I had to add a SparkContext reset feature to SparkLauncher. With the latest patch it shouldn't have any failures. > Enable yarn-client mode execution of tests in Spark (1) mode > > > Key: PIG-5305 > URL: https://issues.apache.org/jira/browse/PIG-5305 > Project: Pig > Issue Type: Sub-task > Components: spark >Reporter: Adam Szita >Assignee: Adam Szita > Attachments: PIG-5305.0.patch, PIG-5305.1.patch, PIG-5305.2.patch > > > See parent jira (PIG-5305) for problem description -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (PIG-5305) Enable yarn-client mode execution of tests in Spark (1) mode
[ https://issues.apache.org/jira/browse/PIG-5305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita updated PIG-5305: Attachment: PIG-5305.2.patch > Enable yarn-client mode execution of tests in Spark (1) mode > > > Key: PIG-5305 > URL: https://issues.apache.org/jira/browse/PIG-5305 > Project: Pig > Issue Type: Sub-task > Components: spark >Reporter: Adam Szita >Assignee: Adam Szita > Attachments: PIG-5305.0.patch, PIG-5305.1.patch, PIG-5305.2.patch > > > See parent jira (PIG-5305) for problem description -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (PIG-5305) Enable yarn-client mode execution of tests in Spark (1) mode
[ https://issues.apache.org/jira/browse/PIG-5305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita updated PIG-5305: Status: Patch Available (was: Open) > Enable yarn-client mode execution of tests in Spark (1) mode > > > Key: PIG-5305 > URL: https://issues.apache.org/jira/browse/PIG-5305 > Project: Pig > Issue Type: Sub-task > Components: spark >Reporter: Adam Szita >Assignee: Adam Szita > Attachments: PIG-5305.0.patch, PIG-5305.1.patch > > > See parent jira (PIG-5305) for problem description -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIG-5305) Enable yarn-client mode execution of tests in Spark (1) mode
[ https://issues.apache.org/jira/browse/PIG-5305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16169815#comment-16169815 ] Adam Szita commented on PIG-5305: - Thanks for the comments [~kellyzly]. Attached [^PIG-5305.1.patch]. 1. Correct, test-core-mrtez indeed doesn't need jar-simple, I removed that. However, I'd like to keep the pigtest-jar target calls in test-related targets. For example, if someone launches {{ant clean test -Dtest.exec.type=spark}}, we have to keep it on the {{test-core}} target as well. 2. Added comment as requested. > Enable yarn-client mode execution of tests in Spark (1) mode > > > Key: PIG-5305 > URL: https://issues.apache.org/jira/browse/PIG-5305 > Project: Pig > Issue Type: Sub-task > Components: spark >Reporter: Adam Szita >Assignee: Adam Szita > Attachments: PIG-5305.0.patch, PIG-5305.1.patch > > > See parent jira (PIG-5305) for problem description -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (PIG-5305) Enable yarn-client mode execution of tests in Spark (1) mode
[ https://issues.apache.org/jira/browse/PIG-5305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita updated PIG-5305: Attachment: PIG-5305.1.patch > Enable yarn-client mode execution of tests in Spark (1) mode > > > Key: PIG-5305 > URL: https://issues.apache.org/jira/browse/PIG-5305 > Project: Pig > Issue Type: Sub-task > Components: spark >Reporter: Adam Szita >Assignee: Adam Szita > Attachments: PIG-5305.0.patch, PIG-5305.1.patch > > > See parent jira (PIG-5305) for problem description -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (PIG-5305) Enable yarn-client mode execution of tests in Spark (1) mode
[ https://issues.apache.org/jira/browse/PIG-5305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita updated PIG-5305: Attachment: (was: PIG-5305.1.patch) > Enable yarn-client mode execution of tests in Spark (1) mode > > > Key: PIG-5305 > URL: https://issues.apache.org/jira/browse/PIG-5305 > Project: Pig > Issue Type: Sub-task > Components: spark >Reporter: Adam Szita >Assignee: Adam Szita > Attachments: PIG-5305.0.patch > > > See parent jira (PIG-5305) for problem description
[jira] [Updated] (PIG-5305) Enable yarn-client mode execution of tests in Spark (1) mode
[ https://issues.apache.org/jira/browse/PIG-5305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita updated PIG-5305: Attachment: PIG-5305.1.patch > Enable yarn-client mode execution of tests in Spark (1) mode > > > Key: PIG-5305 > URL: https://issues.apache.org/jira/browse/PIG-5305 > Project: Pig > Issue Type: Sub-task > Components: spark >Reporter: Adam Szita >Assignee: Adam Szita > Attachments: PIG-5305.0.patch, PIG-5305.1.patch > > > See parent jira (PIG-5305) for problem description
[jira] [Commented] (PIG-5298) Verify if org.mortbay.jetty is removable
[ https://issues.apache.org/jira/browse/PIG-5298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16167730#comment-16167730 ] Adam Szita commented on PIG-5298: - [~nkollar], if the only reason we currently pull in jetty is to make use of its EL implementation (which, as you say, is in the newer versions basically borrowed from Glassfish), then the logical thing to do IMHO would be to use just the Glassfish EL: no jetty and no tomcat. I believe we should always try to reduce the number and size of dependency libraries to only those that we actually make use of. > Verify if org.mortbay.jetty is removable > > > Key: PIG-5298 > URL: https://issues.apache.org/jira/browse/PIG-5298 > Project: Pig > Issue Type: Improvement >Affects Versions: 0.17.0 >Reporter: Adam Szita >Assignee: Nandor Kollar > Attachments: PIG-5298_1.patch > > > Although we pull in jetty libraries in ivy, Pig does not depend on > org.mortbay.jetty explicitly. The only exception I see is in Piggybank, where > I think this can be swapped for javax.el-api and log4j. > We should investigate (check build, run unit tests across all exec modes) and > remove it if it turns out to be unnecessary.
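As a sketch of the suggestion above, the jetty pull could be replaced in ivy.xml with a direct Glassfish EL dependency. The coordinates, revision property, and conf mapping below are illustrative assumptions, not the actual values from any attached patch:

```xml
<!-- Illustrative ivy.xml change: drop the jetty modules that were only
     needed for EL and depend on the Glassfish EL implementation directly.
     Coordinates and revision here are assumed, not taken from the patch. -->
<dependency org="org.glassfish" name="javax.el"
            rev="${glassfish-el.version}" conf="compile->master"/>
```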
[jira] [Commented] (PIG-5305) Enable yarn-client mode execution of tests in Spark (1) mode
[ https://issues.apache.org/jira/browse/PIG-5305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16166543#comment-16166543 ] Adam Szita commented on PIG-5305: - Attached [^PIG-5305.0.patch] to enable running tests in yarn-client mode for Spark execution. Main changes: * build.xml: added a target to build a jar with all test classes. This is required so that we can pass this test jar to the SparkContext, which then distributes it among the Spark executors; also set the SPARK_MASTER env var to "yarn-client" * SparkLauncher: added a feature to re-initialize the SparkContext when switching between cluster- and local-mode PigServers, and to set the ChildFirstURLClassLoader only in cluster mode [~kellyzly], can you please take a look? > Enable yarn-client mode execution of tests in Spark (1) mode > > > Key: PIG-5305 > URL: https://issues.apache.org/jira/browse/PIG-5305 > Project: Pig > Issue Type: Sub-task > Components: spark >Reporter: Adam Szita >Assignee: Adam Szita > Attachments: PIG-5305.0.patch > > > See parent jira (PIG-5305) for problem description
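The build.xml side of the change might be wired roughly as follows. This is a hedged sketch: only the SPARK_MASTER variable, its "yarn-client" value, and the idea of a test-classes jar come from the comment; the target name, jar path, and junit attributes are hypothetical.

```xml
<!-- Hypothetical sketch: point the tests at YARN and make the test jar
     available. Only SPARK_MASTER=yarn-client and the test-jar idea come
     from the comment above; everything else is assumed. -->
<target name="test-spark-yarn" depends="pigtest-jar">
  <junit fork="yes">
    <env key="SPARK_MASTER" value="yarn-client"/>
    <!-- the jar built by pigtest-jar is put on the classpath and later
         registered on the SparkContext for distribution to executors -->
    <classpath>
      <pathelement location="${build.dir}/pig-test.jar"/>
    </classpath>
  </junit>
</target>
```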
[jira] [Updated] (PIG-5305) Enable yarn-client mode execution of tests in Spark (1) mode
[ https://issues.apache.org/jira/browse/PIG-5305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita updated PIG-5305: Description: See parent jira (PIG-5305) for problem description > Enable yarn-client mode execution of tests in Spark (1) mode > > > Key: PIG-5305 > URL: https://issues.apache.org/jira/browse/PIG-5305 > Project: Pig > Issue Type: Sub-task > Components: spark >Reporter: Adam Szita >Assignee: Adam Szita > Attachments: PIG-5305.0.patch > > > See parent jira (PIG-5305) for problem description
[jira] [Updated] (PIG-5305) Enable yarn-client mode execution of tests in Spark (1) mode
[ https://issues.apache.org/jira/browse/PIG-5305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita updated PIG-5305: Attachment: PIG-5305.0.patch > Enable yarn-client mode execution of tests in Spark (1) mode > > > Key: PIG-5305 > URL: https://issues.apache.org/jira/browse/PIG-5305 > Project: Pig > Issue Type: Sub-task > Components: spark >Reporter: Adam Szita >Assignee: Adam Szita > Attachments: PIG-5305.0.patch > >
[jira] [Updated] (PIG-5297) Yarn-client mode doesn't work with Spark 2
[ https://issues.apache.org/jira/browse/PIG-5297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita updated PIG-5297: Attachment: PIG-5297.0.patch > Yarn-client mode doesn't work with Spark 2 > -- > > Key: PIG-5297 > URL: https://issues.apache.org/jira/browse/PIG-5297 > Project: Pig > Issue Type: Sub-task > Components: spark >Reporter: Adam Szita >Assignee: Adam Szita > > When running tests that were built with Spark 2 in yarn-client mode, I'm > getting the following exception: > {code} > Caused by: java.lang.IllegalStateException: Library directory > './pig/assembly/target/scala-2.11/jars' does not exist; make sure Spark > is built. > at > org.apache.spark.launcher.CommandBuilderUtils.checkState(CommandBuilderUtils.java:248) > at > org.apache.spark.launcher.CommandBuilderUtils.findJarsDir(CommandBuilderUtils.java:368) > at > org.apache.spark.launcher.YarnCommandBuilderUtils$.findJarsDir(YarnCommandBuilderUtils.scala:38) > at > org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:558) > at > org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:882) > {code} > After overcoming this with symlinks and setting SPARK_HOME, I hit another > issue: > {code} > Caused by: java.lang.NoSuchMethodError: > io.netty.channel.DefaultFileRegion.<init>(Ljava/io/File;JJ)V > at > org.apache.spark.network.buffer.FileSegmentManagedBuffer.convertToNetty(FileSegmentManagedBuffer.java:133) > at > org.apache.spark.network.protocol.MessageEncoder.encode(MessageEncoder.java:58) > at > org.apache.spark.network.protocol.MessageEncoder.encode(MessageEncoder.java:33) > at > io.netty.handler.codec.MessageToMessageEncoder.write(MessageToMessageEncoder.java:89) > {code} > I believe this is an incompatibility between the netty-all versions required > by Hadoop and Spark.
[jira] [Updated] (PIG-5297) Yarn-client mode doesn't work with Spark 2
[ https://issues.apache.org/jira/browse/PIG-5297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita updated PIG-5297: Attachment: (was: PIG-5297.0.patch) > Yarn-client mode doesn't work with Spark 2 > -- > > Key: PIG-5297 > URL: https://issues.apache.org/jira/browse/PIG-5297 > Project: Pig > Issue Type: Sub-task > Components: spark >Reporter: Adam Szita >Assignee: Adam Szita > > When running tests that were built with Spark 2 in yarn-client mode, I'm > getting the following exception: > {code} > Caused by: java.lang.IllegalStateException: Library directory > './pig/assembly/target/scala-2.11/jars' does not exist; make sure Spark > is built. > at > org.apache.spark.launcher.CommandBuilderUtils.checkState(CommandBuilderUtils.java:248) > at > org.apache.spark.launcher.CommandBuilderUtils.findJarsDir(CommandBuilderUtils.java:368) > at > org.apache.spark.launcher.YarnCommandBuilderUtils$.findJarsDir(YarnCommandBuilderUtils.scala:38) > at > org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:558) > at > org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:882) > {code} > After overcoming this with symlinks and setting SPARK_HOME, I hit another > issue: > {code} > Caused by: java.lang.NoSuchMethodError: > io.netty.channel.DefaultFileRegion.<init>(Ljava/io/File;JJ)V > at > org.apache.spark.network.buffer.FileSegmentManagedBuffer.convertToNetty(FileSegmentManagedBuffer.java:133) > at > org.apache.spark.network.protocol.MessageEncoder.encode(MessageEncoder.java:58) > at > org.apache.spark.network.protocol.MessageEncoder.encode(MessageEncoder.java:33) > at > io.netty.handler.codec.MessageToMessageEncoder.write(MessageToMessageEncoder.java:89) > {code} > I believe this is an incompatibility between the netty-all versions required > by Hadoop and Spark.
[jira] [Created] (PIG-5305) Enable yarn-client mode execution of tests in Spark (1) mode
Adam Szita created PIG-5305: --- Summary: Enable yarn-client mode execution of tests in Spark (1) mode Key: PIG-5305 URL: https://issues.apache.org/jira/browse/PIG-5305 Project: Pig Issue Type: Sub-task Components: spark Reporter: Adam Szita Assignee: Adam Szita
[jira] [Commented] (PIG-5298) Verify if org.mortbay.jetty is removable
[ https://issues.apache.org/jira/browse/PIG-5298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16161025#comment-16161025 ] Adam Szita commented on PIG-5298: - Thanks for looking into this [~nkollar]. A few comments: * please list an entry for the tomcat version (9.0.0.M26) in libraries.properties instead of ivy.xml * list the tomcat dependencies in the pom templates for pig and piggybank > Verify if org.mortbay.jetty is removable > > > Key: PIG-5298 > URL: https://issues.apache.org/jira/browse/PIG-5298 > Project: Pig > Issue Type: Improvement >Affects Versions: 0.17.0 >Reporter: Adam Szita >Assignee: Nandor Kollar > Attachments: PIG-5298_1.patch > > > Although we pull in jetty libraries in ivy, Pig does not depend on > org.mortbay.jetty explicitly. The only exception I see is in Piggybank, where > I think this can be swapped for javax.el-api and log4j. > We should investigate (check build, run unit tests across all exec modes) and > remove it if it turns out to be unnecessary.
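The convention requested in the review above could be sketched like this. Only the version 9.0.0.M26 and the file names libraries.properties and ivy.xml come from the comment; the property name and the tomcat module coordinates are assumptions:

```xml
<!-- ivy.xml: reference a property instead of a hard-coded revision.
     The entry in libraries.properties would be something like:
         tomcat.version=9.0.0.M26
     The org/name coordinates and the property name below are assumed. -->
<dependency org="org.apache.tomcat" name="tomcat-jasper-el"
            rev="${tomcat.version}" conf="compile->master"/>
```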