[jira] [Commented] (SPARK-22253) Python packaging using JDK 8 fails when run using Spark 2.2
[ https://issues.apache.org/jira/browse/SPARK-22253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16200730#comment-16200730 ]

Swaapnika Guntaka commented on SPARK-22253:
-------------------------------------------

After the above error I see a bunch of lost tasks due to the above exception, and then I see the following at the end. Is this caused by the exception above, or vice versa?

17/10/11 10:58:46 INFO DAGScheduler: ShuffleMapStage 0 (groupBy at /lib/ComputeUtils.py:46) failed in 6.670 s due to Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most ...
	at java.io.DataInputStream.readInt(DataInputStream.java:392)
	at org.apache.spark.api.python.PythonWorkerFactory.startDaemon(PythonWorkerFactory.scala:166)
	at org.apache.spark.api.python.PythonWorkerFactory.createThroughDaemon(PythonWorkerFactory.scala:89)
	at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:65)
	at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:117)
	at org.apache.spark.api.python.PythonRunner.compute(PythonRDD.scala:128)
	at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:63)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
	at org.apache.spark.api.python.PairwiseRDD.compute(PythonRDD.scala:395)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
	at org.apache.spark.scheduler.Task.run(Task.scala:108)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:744)

> Python packaging using JDK 8 fails when run using Spark 2.2
> -----------------------------------------------------------
>
>                 Key: SPARK-22253
>                 URL: https://issues.apache.org/jira/browse/SPARK-22253
>             Project: Spark
>          Issue Type: Question
>          Components: PySpark, Spark Submit
>    Affects Versions: 2.2.0
>            Reporter: Swaapnika Guntaka
>
> Python packaging fails with a Java {{EOFException}} when run using spark-submit. The Python packaging is done using JDK 8.
> SPARK-1911 [https://issues.apache.org/jira/browse/SPARK-1911] looks similar to this. Does that issue still exist?
>
> Recent failure: Lost task 3.3 in stage 0.0 (TID 36, 10.15.163.25, executor 0): java.io.EOFException
> 	at java.io.DataInputStream.readInt(DataInputStream.java:392)
> 	at org.apache.spark.api.python.PythonWorkerFactory.startDaemon(PythonWorkerFactory.scala:166)
> 	at org.apache.spark.api.python.PythonWorkerFactory.createThroughDaemon(PythonWorkerFactory.scala:89)
> 	at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:65)
> 	at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:117)
> 	at org.apache.spark.api.python.PythonRunner.compute(PythonRDD.scala:128)
> 	at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:63)
> 	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
> 	at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
> 	at org.apache.spark.api.python.PairwiseRDD.compute(PythonRDD.scala:395)
> 	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
> 	at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
> 	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
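A quick first check for this class of failure is to compare the JDK that built the assembly jar against the JVM and Python running on the cluster. Maven records the building JDK in the jar manifest's Build-Jdk entry. A minimal sketch of that check; the jar here is a tiny stand-in built on the spot so the snippet is self-contained (in practice, point `jar_path` at the real Spark assembly jar, and the manifest contents will differ):

```python
import os
import tempfile
import zipfile

# Build a tiny stand-in "assembly jar" so the sketch runs end to end;
# in practice, set jar_path to the real Spark assembly jar instead.
tmp = tempfile.mkdtemp()
jar_path = os.path.join(tmp, "spark-assembly.jar")
with zipfile.ZipFile(jar_path, "w") as z:
    z.writestr("META-INF/MANIFEST.MF",
               "Manifest-Version: 1.0\nBuild-Jdk: 1.8.0_144\n")

# Maven writes a Build-Jdk entry into META-INF/MANIFEST.MF; comparing
# it with the cluster's runtime JVM is a cheap sanity check before
# digging into executor logs.
with zipfile.ZipFile(jar_path) as jar:
    manifest = jar.read("META-INF/MANIFEST.MF").decode("utf-8", "replace")

for line in manifest.splitlines():
    if line.startswith("Build-Jdk"):
        print(line)  # -> Build-Jdk: 1.8.0_144 for the stand-in jar
```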
[jira] [Updated] (SPARK-22253) Python packaging using JDK 8 fails when run using Spark 2.2
[ https://issues.apache.org/jira/browse/SPARK-22253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Swaapnika Guntaka updated SPARK-22253:
--------------------------------------
    Description: 
Python packaging fails with a Java {{EOFException}} when run using spark-submit. The Python packaging is done using JDK 8.

SPARK-1911 [https://issues.apache.org/jira/browse/SPARK-1911] looks similar to this. Does that issue still exist?

Recent failure: Lost task 3.3 in stage 0.0 (TID 36, 10.15.163.25, executor 0): java.io.EOFException
	at java.io.DataInputStream.readInt(DataInputStream.java:392)
	at org.apache.spark.api.python.PythonWorkerFactory.startDaemon(PythonWorkerFactory.scala:166)
	at org.apache.spark.api.python.PythonWorkerFactory.createThroughDaemon(PythonWorkerFactory.scala:89)
	at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:65)
	at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:117)
	at org.apache.spark.api.python.PythonRunner.compute(PythonRDD.scala:128)
	at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:63)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
	at org.apache.spark.api.python.PairwiseRDD.compute(PythonRDD.scala:395)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)

  was:
Python packaging fails with a Java {{EOFException}} when run using spark-submit. The Python packaging is done using JDK 8. SPARK-1911 [https://issues.apache.org/jira/browse/SPARK-1911] looks similar to this. Does that issue still exist?
> Python packaging using JDK 8 fails when run using Spark 2.2
> -----------------------------------------------------------
>
>                 Key: SPARK-22253
>                 URL: https://issues.apache.org/jira/browse/SPARK-22253
[jira] [Updated] (SPARK-22253) Python packaging using JDK 8 fails when run using Spark 2.2
[ https://issues.apache.org/jira/browse/SPARK-22253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Swaapnika Guntaka updated SPARK-22253:
--------------------------------------
    Shepherd: Sean Owen

> Python packaging using JDK 8 fails when run using Spark 2.2
> -----------------------------------------------------------
>
>                 Key: SPARK-22253
>                 URL: https://issues.apache.org/jira/browse/SPARK-22253
[jira] [Created] (SPARK-22253) Python packaging using JDK 8 fails when run using Spark 2.2
Swaapnika Guntaka created SPARK-22253:
--------------------------------------

             Summary: Python packaging using JDK 8 fails when run using Spark 2.2
                 Key: SPARK-22253
                 URL: https://issues.apache.org/jira/browse/SPARK-22253
             Project: Spark
          Issue Type: Question
          Components: PySpark, Spark Submit
    Affects Versions: 2.2.0
            Reporter: Swaapnika Guntaka

Python packaging fails with a Java {{EOFException}} when run using spark-submit. The Python packaging is done using JDK 8.

SPARK-1911 [https://issues.apache.org/jira/browse/SPARK-1911] looks similar to this. Does that issue still exist?
[jira] [Commented] (SPARK-1911) Warn users if their assembly jars are not built with Java 6
[ https://issues.apache.org/jira/browse/SPARK-1911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16199666#comment-16199666 ]

Swaapnika Guntaka commented on SPARK-1911:
------------------------------------------

Does this issue still exist with Spark 2.2?

> Warn users if their assembly jars are not built with Java 6
> -----------------------------------------------------------
>
>                 Key: SPARK-1911
>                 URL: https://issues.apache.org/jira/browse/SPARK-1911
>             Project: Spark
>          Issue Type: Bug
>          Components: Documentation
>    Affects Versions: 1.1.0
>            Reporter: Andrew Or
>            Assignee: Sean Owen
>             Fix For: 1.2.2, 1.3.0
>
> The root cause of the problem is detailed in: https://issues.apache.org/jira/browse/SPARK-1520.
> In short, an assembly jar built with Java 7+ is not always accessible by Python or other versions of Java (especially Java 6). If the assembly jar is not built on the cluster itself, this problem may manifest itself in strange exceptions that are not trivial to debug. This is an issue especially for PySpark on YARN, which relies on the python files included within the assembly jar.
> Currently we warn users only in make-distribution.sh, but most users build the jars directly. At the very least we need to emphasize this in the docs (currently missing entirely). The next step is to add a warning prompt in the mvn scripts whenever Java 7+ is detected.