yuecong commented on issue #26060: [SPARK-29400][CORE] Improve 
PrometheusResource to use labels
URL: https://github.com/apache/spark/pull/26060#issuecomment-539804591
 
 
   @dongjoon-hyun Thanks for fixing this. 
   I have several questions on this.
   
   1. Short-lived metrics
   Since Prometheus uses a pull model, how do you recommend people use these 
metrics for executors that get shut down immediately? Also, how will this 
work for short-lived Spark applications (e.g. shorter than one Prometheus 
scrape interval, which is usually 30s)?
   See this [blog](https://www.metricfire.com/prometheus-tutorials/prometheus-monitoring-101) 
about short-lived metrics for Prometheus.
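
To make the concern concrete, here is a rough sketch (not code from this PR) that counts how many scrapes would land inside a target's lifetime. It assumes scrapes fire at fixed ticks (0s, 30s, 60s, ...) and ignores scrape timeouts and service-discovery delay; `scrapes_seen` and its parameters are hypothetical names used only for illustration.

```python
import math


def scrapes_seen(start_s: float, lifetime_s: float, interval_s: float = 30.0) -> int:
    """Count scrape ticks that fall while the target is alive.

    Assumption: the server scrapes at t = 0, interval_s, 2 * interval_s, ...
    and the target is reachable for the closed window
    [start_s, start_s + lifetime_s].
    """
    end_s = start_s + lifetime_s
    first_tick = math.ceil(start_s / interval_s)   # first tick at or after start
    last_tick = math.floor(end_s / interval_s)     # last tick at or before end
    return max(0, last_tick - first_tick + 1)


# An executor that lives for 10s between two ticks is never scraped.
missed = scrapes_seen(start_s=5, lifetime_s=10)
# The same 10s lifetime is scraped once if it happens to straddle a tick.
caught = scrapes_seen(start_s=25, lifetime_s=10)
```

Under these assumptions, whether a target shorter than one interval shows up at all depends only on when it happens to start, which is why I am asking about a recommendation for this case.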
   
   2. Cardinality
   It looks like you are using app_id as one of the labels, which will 
increase the cardinality of the Prometheus metrics. For more information about 
Prometheus's cardinality issue, see 
[here](https://www.robustperception.io/cardinality-is-key) as well as this 
[doc](https://prometheus.io/docs/practices/naming/#labels).
   
   Suppose a user has a central Prometheus server scrape their Spark 
applications with this PR. Each new Spark application adds N metrics per 
worker, and with M workers on average that is roughly N × M new time series 
per application. This will put a heavy load on a traditional Prometheus 
server. There are several solutions ([M3](https://eng.uber.com/m3/), 
[Cortex](https://www.cncf.io/blog/2018/12/18/cortex-a-multi-tenant-horizontally-scalable-prometheus-as-a-service/),
 [Thanos](https://improbable.io/blog/thanos-prometheus-at-scale)) that address 
this issue, but we should make the cardinality implications clear to users of 
these metrics.
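
As a back-of-the-envelope sketch of the growth described above (not a measurement), the snippet below multiplies out the series count. It assumes each metric yields one time series per worker and that app_id is a label, so no series is shared between applications; the function name and all numbers are made up for illustration.

```python
def total_series(apps: int, metrics_per_worker: int, workers_per_app: int) -> int:
    """Estimate distinct time series created across all applications.

    Assumption: every (app_id, worker, metric) combination is its own
    series, so each new application adds metrics_per_worker * workers_per_app.
    """
    return apps * metrics_per_worker * workers_per_app


# Hypothetical numbers: N = 20 metrics, M = 50 workers, 1000 applications.
per_app = total_series(apps=1, metrics_per_worker=20, workers_per_app=50)
overall = total_series(apps=1000, metrics_per_worker=20, workers_per_app=50)
print(per_app, overall)
```

With these made-up inputs, one application adds 1,000 series and a thousand applications add a million, none of which are reused once the application finishes.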
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
