The exception in Python means that the worker tried to read a command from the JVM but reached the end of the socket (the socket had been closed). So it's possible that another exception happened in the JVM first.
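For context, the read that fails in your traceback is the worker pulling a length-prefixed command off its socket to the JVM. A minimal sketch of that read_int pattern, reconstructed from the traceback rather than copied from pyspark/serializers.py (so treat the details as assumptions):

    import struct

    def read_int(stream):
        # Each message from the JVM is preceded by a 4-byte big-endian length.
        data = stream.read(4)
        if not data:
            # An empty read means the JVM side closed the socket before
            # sending anything -- hence the bare EOFError in the traceback.
            raise EOFError
        return struct.unpack("!i", data)[0]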
Could you change the log4j log level, then check whether there is any problem inside the JVM? (A sketch of the usual log4j.properties change is below the quoted message.)

Davies

On Wed, Jul 30, 2014 at 9:12 AM, Nicholas Chammas
<nicholas.cham...@gmail.com> wrote:
> Any clues? This looks like a bug, but I can't report it without more
> precise information.
>
>
> On Tue, Jul 29, 2014 at 9:56 PM, Nick Chammas
> <nicholas.cham...@gmail.com> wrote:
>>
>> I’m in the PySpark shell and I’m trying to do this:
>>
>> a = sc.textFile('s3n://path-to-handful-of-very-large-files-totalling-1tb/*.json',
>>                 minPartitions=sc.defaultParallelism * 3).cache()
>> a.map(lambda x: len(x)).max()
>>
>> My job dies with the following:
>>
>> 14/07/30 01:46:28 WARN TaskSetManager: Loss was due to
>> org.apache.spark.api.python.PythonException
>> org.apache.spark.api.python.PythonException: Traceback (most recent call last):
>>   File "/root/spark/python/pyspark/worker.py", line 73, in main
>>     command = pickleSer._read_with_length(infile)
>>   File "/root/spark/python/pyspark/serializers.py", line 142, in _read_with_length
>>     length = read_int(stream)
>>   File "/root/spark/python/pyspark/serializers.py", line 337, in read_int
>>     raise EOFError
>> EOFError
>>
>>     at org.apache.spark.api.python.PythonRDD$$anon$1.read(PythonRDD.scala:115)
>>     at org.apache.spark.api.python.PythonRDD$$anon$1.<init>(PythonRDD.scala:145)
>>     at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:78)
>>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
>>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
>>     at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111)
>>     at org.apache.spark.scheduler.Task.run(Task.scala:51)
>>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:183)
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>     at java.lang.Thread.run(Thread.java:745)
>> 14/07/30 01:46:29 ERROR TaskSchedulerImpl: Lost executor 19 on
>> ip-10-190-171-217.ec2.internal: remote Akka client disassociated
>>
>> How do I debug this? I’m using 1.0.2-rc1 deployed to EC2.
>>
>> Nick
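As mentioned above, the quickest way to surface the JVM-side error is to lower the log4j level on the cluster. A minimal sketch, assuming the stock conf/log4j.properties layout from the Spark distribution (copy log4j.properties.template if the file does not exist; the "console" appender name comes from that template and may differ in your setup):

    # conf/log4j.properties -- lower the root level from INFO to DEBUG so the
    # executor logs capture whatever JVM exception closed the worker's socket.
    log4j.rootCategory=DEBUG, console

After changing it on each node, re-run the job and check the executor stderr logs for the failure that precedes the "remote Akka client disassociated" message.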