Re: SocketTimeout only when launching lots of executors

2015-03-23 Thread Akhil Das
It seems your driver is getting flooded by requests from that many executors and
hence it times out. There are configuration options such as
spark.akka.timeout that you could try tuning. More information
is available here:
http://spark.apache.org/docs/latest/configuration.html
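For example, something along these lines might help (a rough sketch, untested; the
exact property names and defaults depend on your Spark version, and the timeout
values here are just placeholders):

    import org.apache.spark.{SparkConf, SparkContext}

    // Sketch: raise the timeouts that govern driver<->executor communication
    // and file fetching. The property names below are the Spark 1.x-era ones;
    // check the configuration page for your version.
    val conf = new SparkConf()
      .setAppName("many-executors-app")        // hypothetical app name
      .set("spark.akka.timeout", "300")        // Akka communication timeout (seconds)
      .set("spark.network.timeout", "300s")    // general network timeout
      .set("spark.files.fetchTimeout", "300s") // timeout for fetching files added via addFile/addJar
    val sc = new SparkContext(conf)

The same properties can also be passed on the command line via
spark-submit --conf, e.g. --conf spark.akka.timeout=300.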

Thanks
Best Regards

On Mon, Mar 23, 2015 at 9:46 AM, Tianshuo Deng td...@twitter.com.invalid
wrote:

 Hi, spark users.

 When running a Spark application with lots of executors (300+), I see the
 following failures:

 java.net.SocketTimeoutException: Read timed out
   at java.net.SocketInputStream.socketRead0(Native Method)
   at java.net.SocketInputStream.read(SocketInputStream.java:152)
   at java.net.SocketInputStream.read(SocketInputStream.java:122)
   at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
   at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
   at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
   at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:690)
   at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633)
   at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1324)
   at org.apache.spark.util.Utils$.doFetchFile(Utils.scala:583)
   at org.apache.spark.util.Utils$.fetchFile(Utils.scala:421)
   at org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$6.apply(Executor.scala:356)
   at org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$6.apply(Executor.scala:353)
   at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
   at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
   at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
   at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226)
   at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
   at scala.collection.mutable.HashMap.foreach(HashMap.scala:98)
   at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
   at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$updateDependencies(Executor.scala:353)
   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:181)
   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:745)

 When I reduce the number of executors, the Spark app runs fine. From the
 stack trace, it looks like multiple executors requesting dependency
 downloads at the same time is causing the driver to time out?

 Has anyone experienced similar issues or have any suggestions?

 Thanks
 -
 To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
 For additional commands, e-mail: user-h...@spark.apache.org




SocketTimeout only when launching lots of executors

2015-03-22 Thread Tianshuo Deng
Hi, spark users.

When running a Spark application with lots of executors (300+), I see the
following failures:

java.net.SocketTimeoutException: Read timed out
  at java.net.SocketInputStream.socketRead0(Native Method)
  at java.net.SocketInputStream.read(SocketInputStream.java:152)
  at java.net.SocketInputStream.read(SocketInputStream.java:122)
  at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
  at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
  at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
  at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:690)
  at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633)
  at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1324)
  at org.apache.spark.util.Utils$.doFetchFile(Utils.scala:583)
  at org.apache.spark.util.Utils$.fetchFile(Utils.scala:421)
  at org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$6.apply(Executor.scala:356)
  at org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$6.apply(Executor.scala:353)
  at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
  at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
  at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
  at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226)
  at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
  at scala.collection.mutable.HashMap.foreach(HashMap.scala:98)
  at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
  at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$updateDependencies(Executor.scala:353)
  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:181)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run(Thread.java:745)

When I reduce the number of executors, the Spark app runs fine. From the stack
trace, it looks like multiple executors requesting dependency downloads at the
same time is causing the driver to time out?

Has anyone experienced similar issues or have any suggestions?

Thanks
-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org