Others on here and I have run into situations where tasks get stuck in the last stage and the entire job just hangs, with no exceptions or any other errors. Most likely it's a deadlock somewhere inside Spark.
Sent from my Verizon Wireless 4G LTE smartphone

-------- Original message --------
From: Andy Max <andyma...@gmail.com>
Date: 02/11/2016 2:44 PM (GMT-05:00)
To: Darren Govoni <dar...@ontrenet.com>
Cc: user@spark.apache.org
Subject: Re: Spark workers disconnecting on 1.5.2

No, ours are running in Docker containers spread across a few physical servers. Databricks runs their service on AWS. I wonder if they are seeing this issue.

What other problems are you seeing with 1.5.2? We are seeing some executors get stuck in the loading state.

On Thu, Feb 11, 2016 at 11:16 AM, Darren Govoni <dar...@ontrenet.com> wrote:

I see this too. Might explain some other serious problems we're having with 1.5.2. Is your cluster in AWS?

-------- Original message --------
From: Andy Max <andyma...@gmail.com>
Date: 02/11/2016 2:12 PM (GMT-05:00)
To: user@spark.apache.org
Subject: Spark workers disconnecting on 1.5.2

I launched a 4-node Spark 1.5.2 cluster. There was no activity for a day or so. Now I've noticed that a few of the workers are disconnected. I don't see this issue on Spark 1.4 or Spark 1.3. Would appreciate any pointers. Thx