[ 
https://issues.apache.org/jira/browse/SPARK-25285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luca Canali updated SPARK-25285:
--------------------------------
    Description: 
The motivation for these additional metrics is to help with troubleshooting and 
monitoring task execution on a cluster. Currently available metrics include 
executor threadpool metrics for completed tasks and for active tasks. Adding a 
threadpool tasStarted metric makes it possible, for example, to estimate the 
(approximate) number of failed tasks by computing the difference: threads 
started – (active threads + completed tasks and/or successfully finished tasks).
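
The difference described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name, the parameter names, and the sample values are hypothetical and are not part of Spark or of this PR.

```python
def estimate_failed_tasks(started, active, done):
    """Approximate failed-task count: started - (active + done).

    `done` stands for whichever value is available: the threadpool's
    completed-task count or the count of successfully finished tasks.
    """
    # The inputs are sampled at slightly different moments, so the raw
    # difference can briefly go negative; clamp it to zero.
    return max(0, started - (active + done))


# Hypothetical snapshot of one executor's metric values.
snapshot = {"started": 120, "active": 8, "done": 109}
print(estimate_failed_tasks(**snapshot))  # 120 - (8 + 109) = 3
```

Because the three values are read independently, the result is best treated as an estimate rather than an exact count.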
 The proposed metric finishedTasks is also intended for this type of 
troubleshooting. The difference between finishedTasks and 
threadpool.completeTasks is that the latter is a (dropwizard library) gauge 
taken from the threadpool, while the former is a (dropwizard) counter computed 
in the [[Executor]] class, when a task successfully finishes, together with 
several other task metrics counters.
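
The gauge/counter distinction can be illustrated with a minimal, library-free Python sketch. The classes below are hypothetical stand-ins, not the dropwizard API and not Spark code, and the sketch assumes for illustration that the pool's own count advances for every task that ends, whether or not it succeeded.

```python
class Gauge:
    """Reports whatever a callback returns at the moment it is read."""

    def __init__(self, read_fn):
        self._read_fn = read_fn

    def value(self):
        return self._read_fn()


class Counter:
    """Holds its own count, advanced explicitly by the instrumented code."""

    def __init__(self):
        self._count = 0

    def inc(self, n=1):
        self._count += n

    def count(self):
        return self._count


class FakePool:
    """Hypothetical stand-in for a thread pool that tracks ended tasks."""

    def __init__(self):
        self.completed = 0


def run_tasks(pool, finished_counter, outcomes):
    for succeeded in outcomes:
        pool.completed += 1         # assumption: the pool counts every ended task
        if succeeded:
            finished_counter.inc()  # incremented only on a successful finish


pool = FakePool()
complete_tasks = Gauge(lambda: pool.completed)  # reads the pool's state on demand
finished_tasks = Counter()                      # updated by the task-running code
run_tasks(pool, finished_tasks, [True, True, False, True])
print(complete_tasks.value(), finished_tasks.count())  # 4 3
```

The gauge has no state of its own and only reflects what the pool reports, while the counter changes only where the task-running code chooses to increment it.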
 Note that there are similarities with some of the metrics introduced in 
SPARK-24398. However, there are key differences: this PR concerns the executor 
source, so it provides metric values per executor, and the metric values do 
not need to pass through the listener bus in this case.

  was:
The motivation for these additional metrics is to help in troubleshooting 
situations when tasks fail, are killed and/or restarted. Currently available 
metrics include executor threadpool metrics for completed tasks and for active 
tasks. Adding a threadpool tasStarted metric makes it possible, for example, 
to estimate the (approximate) number of failed tasks by computing the 
difference: threads started – (active threads + completed tasks and/or 
successfully completed tasks).

The proposed metric successfulTasks is also intended for this type of 
troubleshooting. The difference between successfulTasks and 
threadpool.completeTasks is that the latter is a (dropwizard library) gauge 
taken from the threadpool, while the former is a (dropwizard) counter computed 
in the [[Executor]] class, when a task successfully completes, together with 
several other task metrics counters.


> Add executor task metrics to track the number of tasks started and of tasks 
> successfully completed
> --------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-25285
>                 URL: https://issues.apache.org/jira/browse/SPARK-25285
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 2.4.0
>            Reporter: Luca Canali
>            Priority: Minor
>
> The motivation for these additional metrics is to help with troubleshooting 
> and monitoring task execution on a cluster. Currently available metrics 
> include executor threadpool metrics for completed tasks and for active tasks. 
> Adding a threadpool tasStarted metric makes it possible, for example, to 
> estimate the (approximate) number of failed tasks by computing the 
> difference: threads started – (active threads + completed tasks and/or 
> successfully finished tasks).
>  The proposed metric finishedTasks is also intended for this type of 
> troubleshooting. The difference between finishedTasks and 
> threadpool.completeTasks is that the latter is a (dropwizard library) gauge 
> taken from the threadpool, while the former is a (dropwizard) counter 
> computed in the [[Executor]] class, when a task successfully finishes, 
> together with several other task metrics counters.
>  Note that there are similarities with some of the metrics introduced in 
> SPARK-24398. However, there are key differences: this PR concerns the 
> executor source, so it provides metric values per executor, and the metric 
> values do not need to pass through the listener bus in this case.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
