[ 
https://issues.apache.org/jira/browse/SPARK-26329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16718017#comment-16718017
 ] 

Imran Rashid commented on SPARK-26329:
--------------------------------------

[~wypoon] and I were just chatting about this, and we realized this is even 
more complicated.  We're currently logging the aggregated metrics with a stage 
end event -- but what if a heartbeat, which includes a peak metric for stage 1, 
comes after stage 1 completes?  Now we won't log the updated metric anywhere.  
Also, even in the current approach, I realized we will ignore metrics from 
speculative tasks which complete after the stage completes (maybe that is fine).

One option would be to *also* send these metrics at the task end, so we know 
we'd always get the value at the task end.  We'd still want to send them in the 
heartbeat, so that you get the metric values if the executor dies.

We need to think through the options here a bit more ...

> ExecutorMetrics should poll faster than heartbeats
> --------------------------------------------------
>
>                 Key: SPARK-26329
>                 URL: https://issues.apache.org/jira/browse/SPARK-26329
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core, Web UI
>    Affects Versions: 3.0.0
>            Reporter: Imran Rashid
>            Priority: Major
>
> We should allow faster polling of the executor memory metrics (SPARK-23429 / 
> SPARK-23206) without requiring a faster heartbeat rate.  We've seen the 
> memory usage of executors pike over 1 GB in less than a second, but 
> heartbeats are only every 10 seconds (by default).  Spark needs to enable 
> fast polling to capture these peaks, without causing too much strain on the 
> system.
> In the current implementation, the metrics are polled along with the 
> heartbeat, but this leads to a slow rate of polling metrics by default.  If 
> users were to increase the rate of the heartbeat, they risk overloading the 
> driver on a large cluster, with too many messages and too much work to 
> aggregate the metrics.  But, the executor could poll the metrics more 
> frequently, and still only send the *max* since the last heartbeat for each 
> metric.  This keeps the load on the driver the same, and only introduces a 
> small overhead on the executor to grab the metrics and keep the max.
> The downside of this approach is that we still need to wait for the next 
> heartbeat for the driver to be aware of the new peak.   If the executor dies 
> or is killed before then, then we won't find out.  A potential future 
> enhancement would be to send an update *anytime* there is an increase by some 
> percentage, but we'll leave that out for now.
> Another possibility would be to change the metrics themselves to track peaks 
> for us, so we don't have to fine-tune the polling rate.  For example, some 
> jvm metrics provide a usage threshold, and notification: 
> https://docs.oracle.com/javase/7/docs/api/java/lang/management/MemoryPoolMXBean.html#UsageThreshold
> But, that is not available on all metrics.  This proposal gives us a generic 
> way to get a more accurate peak memory usage for *all* metrics.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

Reply via email to