Usually this is just a sign that one of the executors quit unexpectedly,
which explains the dead executors you see in the UI. The next step is
usually to go look at those executors' logs and see if they give any reason
for the termination. If you end up seeing an abrupt truncation of the log,
that usually means the operating system's out-of-memory (OOM) killer shut
down the process.
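If you have shell access to the worker nodes, you can confirm OOM-killer involvement from the kernel log. A hedged sketch (the exact message wording varies by kernel version and distribution):

```shell
# On the worker node that hosted the dead executor, search the kernel
# log for OOM-killer activity. The messages usually mention
# "Out of memory" and the killed process name (e.g. java).
dmesg -T | grep -i -E 'out of memory|killed process'

# On systemd machines, journalctl keeps kernel logs across reboots:
journalctl -k | grep -i -E 'out of memory|killed process'
```

If the timestamp of the kill lines up with the truncation of the executor log, that's your culprit.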

At that point it means that, although you set the RAM to a very high level,
the operating system was unable to service a malloc call when it mattered.
This means you probably need to run with a smaller heap size, because there
wasn't enough working RAM to handle the heap you requested.
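One way to do that is to shrink the per-executor heap and leave explicit headroom for off-heap allocations. A sketch with spark-submit (the flags are standard Spark options, but the values and the application jar name are illustrative, not tuned recommendations):

```shell
# Illustrative only: reduce the JVM heap per executor and reserve
# explicit overhead for off-heap memory (JVM metadata, native buffers),
# so total per-executor memory stays within what the OS can actually serve.
# "your-app.jar" is a placeholder for your application.
spark-submit \
  --executor-cores 3 \
  --executor-memory 3g \
  --conf spark.executor.memoryOverhead=1g \
  your-app.jar
```

The point is that heap plus overhead per executor, summed across executors on a worker, must fit comfortably under the worker's physical RAM, or the OOM killer will step in.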

If the log ends with some other kind of exception, then you need to look
into why that exception occurred.

On Fri, Jul 24, 2020, 7:42 AM Amit Sharma <resolve...@gmail.com> wrote:

> Hi All, sometimes I get this error in the Spark logs. I notice a few
> executors are shown as dead in the executor tab when this error occurs,
> although my job succeeds. Please help me find the root cause of this
> issue. I have 3 workers with 30 cores and 64 GB RAM each. My job uses 3
> cores per executor, a total of 63 cores, and 4 GB RAM per executor.
>
> Remote RPC client disassociated. Likely due to containers exceeding
> thresholds, or network issues. Check driver logs for WARN messages
>