Li Haoyi created SPARK-22547:
--------------------------------

             Summary: Don't include executor ID in metrics name
                 Key: SPARK-22547
                 URL: https://issues.apache.org/jira/browse/SPARK-22547
             Project: Spark
          Issue Type: Improvement
          Components: Spark Core
    Affects Versions: 2.2.0
            Reporter: Li Haoyi


Spark's metrics system prefixes all metrics collected from executors with the 
executor ID. 

* https://github.com/apache/spark/blob/fccb337f9d1e44a83cfcc00ce33eae1fad367695/core/src/main/scala/org/apache/spark/metrics/MetricsSystem.scala#L136

This behavior causes two problems: 

* it's not possible to aggregate over executors (since the metric name is 
different for each executor) 
* upstream metrics systems like Ganglia or Prometheus are put under high load 
because of the number of time series to store.

Removing the `executorId` from the name of the metric we register would 
solve both of the above problems.
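
To make the effect concrete, here is a toy model in Python (not Spark's actual 
code). The helper `build_registry_name`, the app ID `app-1`, and the source 
name `executor.threadpool.activeTasks` are all made up for illustration, and 
the sketch assumes a backend that sums samples sharing the same metric name.

```python
from collections import defaultdict


def build_registry_name(app_id, executor_id, source_name, include_executor_id=True):
    """Hypothetical helper: join the name parts with dots, optionally
    including the executor ID as a prefix component."""
    parts = [app_id]
    if include_executor_id:
        parts.append(executor_id)
    parts.append(source_name)
    return ".".join(parts)


def collect(samples, include_executor_id):
    """Group (executor_id, source_name, value) samples by metric name.

    Assumption: the backend sums values that arrive under the same name.
    Returns a dict mapping metric name to the summed value.
    """
    series = defaultdict(int)
    for executor_id, source_name, value in samples:
        name = build_registry_name("app-1", executor_id, source_name, include_executor_id)
        series[name] += value
    return dict(series)


# 100 executors, each reporting 2 active tasks (illustrative numbers).
samples = [(str(i), "executor.threadpool.activeTasks", 2) for i in range(100)]

with_id = collect(samples, include_executor_id=True)
without_id = collect(samples, include_executor_id=False)

print(len(with_id))     # one time series per executor
print(len(without_id))  # a single series shared by all executors
print(without_id["app-1.executor.threadpool.activeTasks"])
```

With the executor ID in the name, the number of series grows linearly with 
the number of executors; without it, the samples land under one name and can 
be aggregated directly.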



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

