Evan, could you also share more logs for the error? You could paste them here or on Pastebin.
Also check the ZooKeeper logs in case you find anything there.

- Thanks, via mobile, excuse brevity.

On Dec 22, 2015 6:01 PM, "Dirceu Semighini Filho" <dirceu.semigh...@gmail.com> wrote:

> Hi Yash,
> I've experienced this behavior here when the process freezes in a worker.
> In my case this mainly happens when the worker memory is full and the
> Java GC isn't able to free memory for the process.
> Try searching for an OutOfMemoryError in your worker logs.
>
> Regards,
> Dirceu
>
> 2015-12-22 10:26 GMT-02:00 yaoxiaohua <yaoxiao...@outlook.com>:
>
>> Thanks for your reply.
>>
>> I found this in spark-env.sh:
>>
>> SPARK_JAVA_OPTS="$SPARK_JAVA_OPTS -Dspark.akka.askTimeout=300
>> -Dspark.ui.retainedStages=1000 -Dspark.eventLog.enabled=true
>> -Dspark.eventLog.dir=hdfs://sparkcluster/user/spark_history_logs
>> -Dspark.shuffle.spill=false -Dspark.shuffle.manager=hash
>> -Dspark.yarn.max.executor.failures=99999 -Dspark.worker.timeout=300"
>>
>> The only log I can find is this one:
>>
>> INFO ClientCnxn: Client session timed out, have not heard from server in
>> 40015ms for sessionid 0x351c416297a145a, closing socket connection and
>> attempting reconnect
>>
>> It was logged before the spark2 master process shut down.
>>
>> I don’t see any ZooKeeper timeout setting.
>>
>> Best
>>
>> *From:* Yash Sharma [mailto:yash...@gmail.com]
>> *Sent:* December 22, 2015 19:55
>> *To:* yaoxiaohua
>> *Cc:* user@spark.apache.org
>> *Subject:* Re: Client session timed out, have not heard from server in
>>
>> Hi Evan,
>> SPARK-9629 refers to connection issues with ZooKeeper. Could you check
>> whether it's working fine in your setup?
>>
>> Also, please share any other error logs you might be getting.
>>
>> - Thanks, via mobile, excuse brevity.
>>
>> On Dec 22, 2015 5:00 PM, "yaoxiaohua" <yaoxiao...@outlook.com> wrote:
>>
>> Hi,
>>
>> I have encountered a similar problem on Spark 1.4.
>>
>> Master2 runs for some days, then throws a timeout exception and shuts down.
>>
>> I found a bug report:
>>
>> https://issues.apache.org/jira/browse/SPARK-9629
>>
>> INFO ClientCnxn: Client session timed out, have not heard from server in
>> 40015ms for sessionid 0x351c416297a145a, closing socket connection and
>> attempting reconnect
>>
>> Could you tell me what you did about this?
>>
>> Best Regards,
>>
>> Evan
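
The two checks suggested in this thread (searching the worker logs for out-of-memory errors, and looking for the ZooKeeper "Client session timed out" message) could be sketched as a small log scan. This is only a rough sketch and not something posted by anyone in the thread: the function names `scan_log_lines` and `scan_log_file` and the two patterns are assumptions made for illustration, based on the log line quoted above.

```python
import re
from collections import Counter

# Illustrative patterns (assumptions), modeled on the messages quoted in this thread.
PATTERNS = {
    "out_of_memory": re.compile(r"OutOfMemoryError", re.IGNORECASE),
    "zk_session_timeout": re.compile(
        r"Client session timed out, have not heard from server in (\d+)ms"
    ),
}


def scan_log_lines(lines):
    """Count matching lines per pattern and collect ZooKeeper timeout durations in ms."""
    counts = Counter()
    timeouts_ms = []
    for line in lines:
        for name, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                counts[name] += 1
                if name == "zk_session_timeout":
                    timeouts_ms.append(int(match.group(1)))
    return counts, timeouts_ms


def scan_log_file(path):
    """Hypothetical helper: scan one log file, tolerating undecodable bytes."""
    with open(path, errors="replace") as handle:
        return scan_log_lines(handle)
```

Running `scan_log_file` over each master and worker log would show whether out-of-memory errors appear at all and how long the client went without hearing from the ZooKeeper server before each session timeout.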