GitHub user mukulmurthy opened a pull request:
https://github.com/apache/spark/pull/22473
[SPARK-25449][CORE] Heartbeat shouldn't include accumulators for zero metrics
## What changes were proposed in this pull request?
Heartbeats should not include accumulators for metrics whose value is zero.
Executors send a heartbeat to the driver every 10 seconds. Each heartbeat
carries task metrics and is usually a few KB in size. For large jobs with
many tasks, however, a heartbeat can grow to tens of MB, which causes tasks
to die with heartbeat failures. This change mitigates the problem by not
sending zero-valued metrics to the driver.
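
The idea can be illustrated with a minimal sketch. It is written in Python
rather than Spark's Scala and is not the code in this patch: `Accumulator`,
`is_zero`, and `build_heartbeat` are hypothetical names used only for
illustration. The sketch also assumes that a task whose accumulators are all
zero still appears in the payload with an empty list.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Accumulator:
    """Hypothetical stand-in for a task metric accumulator (not Spark's API)."""
    name: str
    value: int = 0

    @property
    def is_zero(self) -> bool:
        # A metric that was never updated still holds its initial value of 0.
        return self.value == 0


def build_heartbeat(
    task_accums: Dict[int, List[Accumulator]]
) -> Dict[int, List[Accumulator]]:
    """Return the heartbeat payload with zero-valued accumulators dropped.

    Every task ID is kept (assumption), but only accumulators that carry a
    non-zero value are included for it.
    """
    return {
        task_id: [acc for acc in accums if not acc.is_zero]
        for task_id, accums in task_accums.items()
    }
```

With many tasks and mostly untouched metrics, the filtered payload holds far
fewer entries than the unfiltered one, which is the size reduction this
change is after.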
## How was this patch tested?
Added tests (see the second commit in the list below).
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/mukulmurthy/oss-spark 25449-heartbeat
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/22473.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #22473
----
commit 3e0d9536512300d27201e1d5cc4d9b5755a47871
Author: Mukul Murthy <mukul.murthy@...>
Date: 2018-09-17T21:55:21Z
Don't send zero accumulators for metrics in heartbeat
commit 3cf88a4ab34064074d42f5daa3a448e8f9def649
Author: Mukul Murthy <mukul.murthy@...>
Date: 2018-09-19T18:40:47Z
add tests
----