Hi, I am deploying Spark 1.6.0 in yarn-client mode on our YARN cluster. Everything works fine, except that the first job is extremely slow because of an executor heartbeat RPC timeout:
WARN netty.NettyRpcEndpointRef: Error sending message [message = Heartbeat

I think this might be related to our cluster's network/firewall configuration, because the issue disappears if I deploy Spark in yarn-cluster mode. However, I am still wondering why the first job can continue after this timeout, and why the later jobs run without any issues.

Thanks,
Zhong