Thanks. My case does not seem to be caused by GC: CPU usage is pretty low, and both YGC and FGC seem to behave quite normally. Hmm, weird.

Best Regards,
Raymond Liu

From: Aaron Davidson [mailto:[email protected]]
Sent: Saturday, November 02, 2013 12:07 AM
To: [email protected]
Subject: Re: Executor could not connect to Driver?

I've seen this happen before due to the driver doing long GCs when the driver machine was heavily memory-constrained. For this particular issue, simply freeing up memory used by other applications fixed the problem.

On Fri, Nov 1, 2013 at 12:14 AM, Liu, Raymond <[email protected]> wrote:

Hi

I am encountering an issue where the executor actor cannot connect to the Driver actor, and I cannot figure out the reason.

Say the Driver actor is listening on :35838

root@sr434:~# netstat -lpv
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address    Foreign Address  State   PID/Program name
tcp        0      0 *:50075          *:*              LISTEN  18242/java
tcp        0      0 *:50020          *:*              LISTEN  18242/java
tcp        0      0 *:ssh            *:*              LISTEN  1325/sshd
tcp        0      0 *:50010          *:*              LISTEN  18242/java
tcp6       0      0 sr434:35838      [::]:*           LISTEN  9420/java
tcp6       0      0 [::]:40390       [::]:*           LISTEN  9420/java
tcp6       0      0 [::]:4040        [::]:*           LISTEN  9420/java
tcp6       0      0 [::]:8040        [::]:*           LISTEN  28324/java
tcp6       0      0 [::]:60712       [::]:*           LISTEN  28324/java
tcp6       0      0 [::]:8042        [::]:*           LISTEN  28324/java
tcp6       0      0 [::]:34028       [::]:*           LISTEN  9420/java
tcp6       0      0 [::]:ssh         [::]:*           LISTEN  1325/sshd
tcp6       0      0 [::]:45528       [::]:*           LISTEN  9420/java
tcp6       0      0 [::]:13562       [::]:*           LISTEN  28324/java

while the executor reports errors as below:

13/11/01 13:16:43 INFO executor.CoarseGrainedExecutorBackend: Connecting to driver: akka://spark@sr434:35838/user/CoarseGrainedScheduler
13/11/01 13:16:43 ERROR executor.CoarseGrainedExecutorBackend: Driver terminated or disconnected! Shutting down.

Any idea?

Best Regards,
Raymond Liu
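
In the netstat output above, port 35838 is bound to the sr434 address rather than to the wildcard address, so one thing that might be worth ruling out is whether the executor host resolves sr434 to an address it can actually reach. The following is only a minimal sketch of such a check, assuming Python 3 is available on the executor host; the helper name `probe` is hypothetical and not part of Spark. It attempts a plain TCP connect to every address the hostname resolves to and says nothing about the Akka layer itself.

```python
import socket


def probe(host, port, timeout=3.0):
    """Try a TCP connect to every address that `host` resolves to.

    Returns a list of (address, ok, detail) tuples. `probe` is a
    hypothetical helper for illustration, not a Spark API.
    """
    results = []
    try:
        # Resolve the name the same way a client would (IPv4 and IPv6).
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return [(host, False, "resolution failed: %s" % exc)]

    for family, socktype, proto, _canonname, sockaddr in infos:
        with socket.socket(family, socktype, proto) as sock:
            sock.settimeout(timeout)
            try:
                sock.connect(sockaddr)
                results.append((sockaddr[0], True, "connected"))
            except OSError as exc:
                # Covers refused connections, timeouts, unreachable hosts.
                results.append((sockaddr[0], False, str(exc)))
    return results
```

Calling `probe("sr434", 35838)` on the executor host while the driver is running would list each resolved address with whether the connect succeeded. If some address fails while another succeeds, name resolution on that host may be the place to look next.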
