dongjoon-hyun opened a new pull request #25769: [SPARK-29032][CORE] Add PrometheusServlet to monitor Master/Worker/Driver
URL: https://github.com/apache/spark/pull/25769

### What changes were proposed in this pull request?

This PR aims to simplify `Prometheus` support by adding `PrometheusServlet`. The main use cases are `K8s` and `Spark Standalone` cluster environments.

### Why are the changes needed?

Prometheus.io is a CNCF project used widely with K8s.
- https://github.com/prometheus/prometheus

For `Master/Worker/Driver`, the combination of `Spark JMX Sink` and `Prometheus JMX Converter` is used in many cases. This PR adds native Prometheus export for a better UX.

### Does this PR introduce any user-facing change?

Yes. New web interfaces are added along with the existing JSON API.

| | JSON End Point | Prometheus End Point |
| ------- | ------------------------------------------- | ---------------------------------- |
| Master | /metrics/master/json/ | /metrics/master/prometheus/ |
| Master | /metrics/applications/json/ | /metrics/applications/prometheus/ |
| Worker | /metrics/json/ | /metrics/prometheus/ |
| Driver | /metrics/json/ | /metrics/prometheus/ |

```
$ bin/spark-shell
...
Spark context Web UI available at http://localhost:4040
...
```

```
$ curl --silent http://localhost:4040/metrics/prometheus/ | head -n5
metrics_local_1568101220707_driver_BlockManager_disk_diskSpaceUsed_MB_Value 0
metrics_local_1568101220707_driver_BlockManager_memory_maxMem_MB_Value 366
metrics_local_1568101220707_driver_BlockManager_memory_maxOffHeapMem_MB_Value 0
metrics_local_1568101220707_driver_BlockManager_memory_maxOnHeapMem_MB_Value 366
metrics_local_1568101220707_driver_BlockManager_memory_memUsed_MB_Value 0
```

### How was this patch tested?

Pass Jenkins with the updated UTs, and manually connect to the new endpoints with `curl`. Alternatively, run `prometheus --config.file=config.yaml` with the following configuration and check the metrics in the Prometheus UI.
**config.yaml**
```yaml
global:
  scrape_interval: 5s
  evaluation_interval: 15s
  external_labels:
    monitor: 'codelab-monitor'

rule_files:

scrape_configs:
  - job_name: 'spark-master'
    metrics_path: '/metrics/master/prometheus/'
    static_configs:
      - targets: ['localhost:8080']
  - job_name: 'spark-applications'
    metrics_path: '/metrics/applications/prometheus/'
    static_configs:
      - targets: ['localhost:8080']
  - job_name: 'spark-worker'
    metrics_path: '/metrics/prometheus/'
    static_configs:
      - targets: ['localhost:8081']
  - job_name: 'spark-driver'
    metrics_path: '/metrics/prometheus/'
    static_configs:
      - targets: ['localhost:4040']
```
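For a quick check without running Prometheus, the text returned by the driver endpoint can also be parsed by hand. The following is a minimal sketch, not part of this PR: `parse_metrics` is a hypothetical helper, and it assumes every sample line has the simple unlabeled `name value` form shown in the `curl` output above (labels containing spaces are not handled).

```python
from typing import Dict

# Sample lines copied from the curl output in this PR description.
SAMPLE = """\
metrics_local_1568101220707_driver_BlockManager_disk_diskSpaceUsed_MB_Value 0
metrics_local_1568101220707_driver_BlockManager_memory_maxMem_MB_Value 366
metrics_local_1568101220707_driver_BlockManager_memory_maxOffHeapMem_MB_Value 0
metrics_local_1568101220707_driver_BlockManager_memory_maxOnHeapMem_MB_Value 366
metrics_local_1568101220707_driver_BlockManager_memory_memUsed_MB_Value 0
"""


def parse_metrics(text: str) -> Dict[str, float]:
    """Parse unlabeled `name value` lines into a dict (hypothetical helper)."""
    metrics: Dict[str, float] = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines such as "# HELP" / "# TYPE".
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) < 2:
            continue
        name, value = parts[0], parts[1]
        try:
            metrics[name] = float(value)
        except ValueError:
            # Ignore lines whose second field is not numeric.
            continue
    return metrics


if __name__ == "__main__":
    for name, value in parse_metrics(SAMPLE).items():
        print(name, value)
```

To run it against a live driver, the text from `http://localhost:4040/metrics/prometheus/` could be passed to `parse_metrics` in place of `SAMPLE`.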