[jira] [Commented] (SPARK-22253) Python packaging using JDK 8 fails when run using Spark 2.2

2017-10-11 Thread Swaapnika Guntaka (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16200730#comment-16200730
 ] 

Swaapnika Guntaka commented on SPARK-22253:
---

After the EOFException above, I see a number of lost tasks caused by that same exception, and then the following at the end. Is this a consequence of the exception above, or the other way around?

17/10/11 10:58:46 INFO DAGScheduler: ShuffleMapStage 0 (groupBy at /lib/ComputeUtils.py:46) failed in 6.670 s due to Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most $
at java.io.DataInputStream.readInt(DataInputStream.java:392)
at org.apache.spark.api.python.PythonWorkerFactory.startDaemon(PythonWorkerFactory.scala:166)
at org.apache.spark.api.python.PythonWorkerFactory.createThroughDaemon(PythonWorkerFactory.scala:89)
at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:65)
at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:117)
at org.apache.spark.api.python.PythonRunner.compute(PythonRDD.scala:128)
at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:63)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.api.python.PairwiseRDD.compute(PythonRDD.scala:395)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
at org.apache.spark.scheduler.Task.run(Task.scala:108)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:744)
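For context on the stack trace: PythonWorkerFactory.startDaemon launches the pyspark daemon process and then blocks on DataInputStream.readInt to read the 4-byte port number the daemon writes back; if the daemon process dies before writing those bytes (for example because of a Python or JDK mismatch), the read hits end-of-stream and throws java.io.EOFException. A minimal sketch of that read protocol in Python (the function name is illustrative, not Spark's API):

```python
import io
import struct

def read_int(stream):
    """Read one big-endian 4-byte int, like java.io.DataInputStream.readInt.

    Raises EOFError if the stream ends before 4 bytes arrive -- the Java
    side raises java.io.EOFException in exactly this situation.
    """
    data = stream.read(4)
    if len(data) < 4:
        raise EOFError("stream ended before 4 bytes were read")
    return struct.unpack(">i", data)[0]

# A healthy daemon writes its port as 4 big-endian bytes:
healthy = io.BytesIO(struct.pack(">i", 40123))
print(read_int(healthy))

# A daemon that crashed at startup leaves an empty stream,
# and the read fails the same way readInt does in the trace above:
try:
    read_int(io.BytesIO(b""))
except EOFError as e:
    print("EOF:", e)
```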

> Python packaging using JDK 8 fails when run using Spark 2.2
> ---
>
> Key: SPARK-22253
> URL: https://issues.apache.org/jira/browse/SPARK-22253
> Project: Spark
>  Issue Type: Question
>  Components: PySpark, Spark Submit
>Affects Versions: 2.2.0
>Reporter: Swaapnika Guntaka
>
> Python packaging fails with a java.io.EOFException when run using spark-submit.
> Python packaging is done using JDK 8.
> SPARK-1911 [https://issues.apache.org/jira/browse/SPARK-1911] looks similar to this. Does this issue still exist?
> Recent failure: Lost task 3.3 in stage 0.0 (TID 36, 10.15.163.25, executor 0): java.io.EOFException
> at java.io.DataInputStream.readInt(DataInputStream.java:392)
> at org.apache.spark.api.python.PythonWorkerFactory.startDaemon(PythonWorkerFactory.scala:166)
> at org.apache.spark.api.python.PythonWorkerFactory.createThroughDaemon(PythonWorkerFactory.scala:89)
> at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:65)
> at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:117)
> at org.apache.spark.api.python.PythonRunner.compute(PythonRDD.scala:128)
> at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:63)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
> at org.apache.spark.api.python.PairwiseRDD.compute(PythonRDD.scala:395)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
> at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-22253) Python packaging using JDK 8 fails when run using Spark 2.2

2017-10-11 Thread Swaapnika Guntaka (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Swaapnika Guntaka updated SPARK-22253:
--
Description: 
Python packaging fails with a java.io.EOFException when run using spark-submit.
Python packaging is done using JDK 8.
SPARK-1911 [https://issues.apache.org/jira/browse/SPARK-1911] looks similar to this. Does this issue still exist?

Recent failure: Lost task 3.3 in stage 0.0 (TID 36, 10.15.163.25, executor 0): java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:392)
at org.apache.spark.api.python.PythonWorkerFactory.startDaemon(PythonWorkerFactory.scala:166)
at org.apache.spark.api.python.PythonWorkerFactory.createThroughDaemon(PythonWorkerFactory.scala:89)
at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:65)
at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:117)
at org.apache.spark.api.python.PythonRunner.compute(PythonRDD.scala:128)
at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:63)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.api.python.PairwiseRDD.compute(PythonRDD.scala:395)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)

  was:
Python packaging fails with a java.io.EOFException when run using spark-submit.
Python packaging is done using JDK 8.
SPARK-1911 [https://issues.apache.org/jira/browse/SPARK-1911] looks similar to this. Does this issue still exist?


> Python packaging using JDK 8 fails when run using Spark 2.2
> ---
>
> Key: SPARK-22253
> URL: https://issues.apache.org/jira/browse/SPARK-22253
> Project: Spark
>  Issue Type: Question
>  Components: PySpark, Spark Submit
>Affects Versions: 2.2.0
>Reporter: Swaapnika Guntaka
>
> Python packaging fails with a java.io.EOFException when run using spark-submit.
> Python packaging is done using JDK 8.
> SPARK-1911 [https://issues.apache.org/jira/browse/SPARK-1911] looks similar to this. Does this issue still exist?
> Recent failure: Lost task 3.3 in stage 0.0 (TID 36, 10.15.163.25, executor 0): java.io.EOFException
> at java.io.DataInputStream.readInt(DataInputStream.java:392)
> at org.apache.spark.api.python.PythonWorkerFactory.startDaemon(PythonWorkerFactory.scala:166)
> at org.apache.spark.api.python.PythonWorkerFactory.createThroughDaemon(PythonWorkerFactory.scala:89)
> at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:65)
> at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:117)
> at org.apache.spark.api.python.PythonRunner.compute(PythonRDD.scala:128)
> at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:63)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
> at org.apache.spark.api.python.PairwiseRDD.compute(PythonRDD.scala:395)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
> at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)






[jira] [Updated] (SPARK-22253) Python packaging using JDK 8 fails when run using Spark 2.2

2017-10-11 Thread Swaapnika Guntaka (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Swaapnika Guntaka updated SPARK-22253:
--
Shepherd: Sean Owen

> Python packaging using JDK 8 fails when run using Spark 2.2
> ---
>
> Key: SPARK-22253
> URL: https://issues.apache.org/jira/browse/SPARK-22253
> Project: Spark
>  Issue Type: Question
>  Components: PySpark, Spark Submit
>Affects Versions: 2.2.0
>Reporter: Swaapnika Guntaka
>
> Python packaging fails with a java.io.EOFException when run using spark-submit.
> Python packaging is done using JDK 8.
> SPARK-1911 [https://issues.apache.org/jira/browse/SPARK-1911] looks similar to this. Does this issue still exist?






[jira] [Created] (SPARK-22253) Python packaging using JDK 8 fails when run using Spark 2.2

2017-10-11 Thread Swaapnika Guntaka (JIRA)
Swaapnika Guntaka created SPARK-22253:
-

 Summary: Python packaging using JDK 8 fails when run using Spark 2.2
 Key: SPARK-22253
 URL: https://issues.apache.org/jira/browse/SPARK-22253
 Project: Spark
  Issue Type: Question
  Components: PySpark, Spark Submit
Affects Versions: 2.2.0
Reporter: Swaapnika Guntaka


Python packaging fails with a java.io.EOFException when run using spark-submit.
Python packaging is done using JDK 8.
SPARK-1911 [https://issues.apache.org/jira/browse/SPARK-1911] looks similar to this. Does this issue still exist?






[jira] [Commented] (SPARK-1911) Warn users if their assembly jars are not built with Java 6

2017-10-10 Thread Swaapnika Guntaka (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16199666#comment-16199666
 ] 

Swaapnika Guntaka commented on SPARK-1911:
--

Does this issue still exist with Spark 2.2?

> Warn users if their assembly jars are not built with Java 6
> ---
>
> Key: SPARK-1911
> URL: https://issues.apache.org/jira/browse/SPARK-1911
> Project: Spark
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.1.0
>Reporter: Andrew Or
>Assignee: Sean Owen
> Fix For: 1.2.2, 1.3.0
>
>
> The root cause of the problem is detailed in: 
> https://issues.apache.org/jira/browse/SPARK-1520.
> In short, an assembly jar built with Java 7+ is not always accessible by 
> Python or other versions of Java (especially Java 6). If the assembly jar is 
> not built on the cluster itself, this problem may manifest itself in strange 
> exceptions that are not trivial to debug. This is an issue especially for 
> PySpark on YARN, which relies on the python files included within the 
> assembly jar.
> Currently we warn users only in make-distribution.sh, but most users build 
> the jars directly. At the very least we need to emphasize this in the docs 
> (currently missing entirely). The next step is to add a warning prompt in the 
> mvn scripts whenever Java 7+ is detected.
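The version mismatch described above can be checked directly: every compiled `.class` file starts with the magic number 0xCAFEBABE followed by 2-byte minor and major version fields, where major 50 = Java 6, 51 = Java 7, 52 = Java 8. A minimal Python sketch that reads the major version of one entry inside a jar (the jar path and entry name in the comment are hypothetical):

```python
import struct
import zipfile

def class_major_version(jar, entry):
    """Return the class-file major version of one entry in a jar.

    50 = Java 6, 51 = Java 7, 52 = Java 8.
    """
    with zipfile.ZipFile(jar) as zf:
        header = zf.read(entry)[:8]
    # Class-file header: 4-byte magic, 2-byte minor, 2-byte major (big-endian).
    magic, minor, major = struct.unpack(">IHH", header)
    if magic != 0xCAFEBABE:
        raise ValueError("%s is not a class file" % entry)
    return major

# Example call (paths are hypothetical):
# class_major_version("spark-assembly.jar", "org/apache/spark/SparkContext.class")
```

If the reported major version is higher than what the runtime JVM supports, the JVM refuses to load the class, which is the kind of hard-to-debug failure this issue warns about.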


