[GitHub] [spark] dongjoon-hyun edited a comment on issue #26060: [SPARK-29400][CORE] Improve PrometheusResource to use labels
dongjoon-hyun edited a comment on issue #26060: [SPARK-29400][CORE] Improve PrometheusResource to use labels
URL: https://github.com/apache/spark/pull/26060#issuecomment-539820403

Nope. Why do you collect all of them? That is up to your configuration. Going back to the beginning, I fully understand your cluster's underlying issues. However, none of them blocks Apache Spark from supporting `Prometheus` metrics natively.

1. First, you can use the previously existing solution if you have one. (`spark.ui.prometheus.enabled` is also `false` by default.)
2. Second, your claims are too general. Not every user has that kind of gigantic cluster. Although a few big customers have some, there are also many satellite small-size clusters. I'm not sure about your metrics. Could you share with us the size of your clusters, the number of apps, and the number of metrics? Does it run on Apache Spark?

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

With regards,
Apache Git Services

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
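For context, a minimal sketch of the opt-in configuration discussed above (the property name comes from the comment itself; the UI port and endpoint path are assumptions based on a typical Spark 3.0 driver deployment and may differ in yours):

```properties
# spark-defaults.conf -- opt-in, since spark.ui.prometheus.enabled is false by default
spark.ui.prometheus.enabled  true

# Executor metrics are then served in Prometheus format from the driver's UI port, e.g.
# http://<driver-host>:4040/metrics/executors/prometheus
```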
[GitHub] [spark] dongjoon-hyun edited a comment on issue #26060: [SPARK-29400][CORE] Improve PrometheusResource to use labels
dongjoon-hyun edited a comment on issue #26060: [SPARK-29400][CORE] Improve PrometheusResource to use labels
URL: https://github.com/apache/spark/pull/26060#issuecomment-539811514

This PR doesn't collect new metrics; it only exposes the existing ones. So, the following is not about this PR. If you have a concern about the Apache Spark *driver*, you can file a new issue for that.

> If driver keeps all the metrics for all the spark applications running using the driver,

Second, on the `Prometheus` side, you can use the `Prometheus` TTL feature, @yuecong. Have you tried that?
[GitHub] [spark] dongjoon-hyun edited a comment on issue #26060: [SPARK-29400][CORE] Improve PrometheusResource to use labels
dongjoon-hyun edited a comment on issue #26060: [SPARK-29400][CORE] Improve PrometheusResource to use labels
URL: https://github.com/apache/spark/pull/26060#issuecomment-53980

Hi, @yuecong. Thank you for the review.

1. That was true in the old Prometheus plugin. So, Apache Spark 3.0.0 exposes this Prometheus metric on the driver port instead of the executor port. I mean you are referring to the `executor` instead of the `driver`. Do you have a short-lived Spark driver which dies within `30s`?

> As Prometheus uses pull model, how do you recommend people to use these metrics for some executors who get shut down immediately? Also how this will work for some short-lived(e.g. shorter than one Prometheus scrape interval, usually it is 30s) spark application?

2. Please see this PR's description. The metric name is **unique** with cardinality 1 by using labels: `metrics_executor_rddBlocks_Count{application_id="app-20191008151625-"`

> It looks like you are using app_id as one of the app_id, which will increase the cardinality for Prometheus metrics.

I don't think you mean the `Prometheus` dimension feature is high-cardinality.
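As a sketch of the pull-model point raised in the quoted question, a Prometheus scrape job for such a driver endpoint could look like the following (the hostname is hypothetical, and the `metrics_path` assumes a Spark 3.0-style driver endpoint; `scrape_interval` is the knob the 30s discussion refers to):

```yaml
scrape_configs:
  - job_name: "spark-driver"
    scrape_interval: 30s                             # the default interval mentioned in the review
    metrics_path: "/metrics/executors/prometheus"    # assumed driver-side endpoint
    static_configs:
      - targets: ["spark-driver.example.com:4040"]   # hypothetical driver host:ui-port
```

Because the per-application information lives in labels such as `application_id`, each metric family stays a single series per executor rather than minting a new metric name per application.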