It seems your driver is getting flooded by that many executors fetching
dependencies at once, and the reads are timing out. There are configuration
options like spark.akka.timeout etc. that you could try tuning. More
information is available here:
http://spark.apache.org/docs/latest/configuration.html
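Here is a minimal sketch of bumping those timeouts when the context is
created (property names are from the Spark 1.x configuration page; the
300-second values are placeholders you would tune for your cluster, and the
app name is just an example):

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("many-executors-app")                      // example name
      .set("spark.akka.timeout", "300")                      // driver/executor Akka timeout, seconds
      .set("spark.files.fetchTimeout", "300")                // timeout for executors fetching files added via addFile/addJar, seconds
      .set("spark.core.connection.ack.wait.timeout", "300")  // ack wait timeout, seconds

    val sc = new SparkContext(conf)

You can also pass the same properties on spark-submit with --conf instead of
setting them in code.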

Thanks
Best Regards

On Mon, Mar 23, 2015 at 9:46 AM, Tianshuo Deng <td...@twitter.com.invalid>
wrote:

> Hi, spark users.
>
> When running a Spark application with lots of executors (300+), I see the
> following failures:
>
> java.net.SocketTimeoutException: Read timed out
>     at java.net.SocketInputStream.socketRead0(Native Method)
>     at java.net.SocketInputStream.read(SocketInputStream.java:152)
>     at java.net.SocketInputStream.read(SocketInputStream.java:122)
>     at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
>     at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
>     at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
>     at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:690)
>     at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633)
>     at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1324)
>     at org.apache.spark.util.Utils$.doFetchFile(Utils.scala:583)
>     at org.apache.spark.util.Utils$.fetchFile(Utils.scala:421)
>     at org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$6.apply(Executor.scala:356)
>     at org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$6.apply(Executor.scala:353)
>     at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
>     at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
>     at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
>     at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226)
>     at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
>     at scala.collection.mutable.HashMap.foreach(HashMap.scala:98)
>     at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
>     at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$updateDependencies(Executor.scala:353)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:181)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
>
> When I reduce the number of executors, the Spark app runs fine. From the
> stack trace, it looks like multiple executors downloading dependencies from
> the driver at the same time is causing the reads to time out?
>
> Has anyone experienced similar issues, or does anyone have suggestions?
>
> Thanks
