[
https://issues.apache.org/jira/browse/SPARK-25929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17034167#comment-17034167
]
Paulius Dambrauskas commented on SPARK-25929:
---------------------------------------------
Now that Spark 3.0 has native support for Prometheus
[https://github.com/apache/spark/pull/25769],
this issue seems even more relevant. The current Prometheus implementation
exposes metrics in this format:
{code:java}
metrics_app_20190911211130_0000_driver_BlockManager_disk_diskSpaceUsed_MB_Value 0
metrics_app_20190911211130_0000_driver_BlockManager_memory_maxMem_MB_Value 732
metrics_app_20190911211130_0000_driver_BlockManager_memory_maxOffHeapMem_MB_Value 0
metrics_app_20190911211130_0000_driver_BlockManager_memory_maxOnHeapMem_MB_Value 732
metrics_app_20190911211130_0000_driver_BlockManager_memory_memUsed_MB_Value 0
{code}
Because the metric name contains the application ID, using these metrics
without relabeling them gets quite complicated.
To build meaningful graphs, you have to write your queries with wildcards
in the metric names. Example:
{code:java}
rate({__name__=~"metrics_.*StreamingMetrics_streaming_totalProcessedRecords_Value"}[2m])
{code}
This becomes a problem when you run a large number of Spark jobs, because
every application adds a new set of metric names, and Prometheus is not
designed to work like that. Application IDs and other dynamic values
should go into metric labels.
What would be possible solutions to this problem?
The current workaround in the Prometheus case is metric relabeling.
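To illustrate what moving the dynamic parts into labels would look like, here is a minimal Python sketch that rewrites one exposition line. It is not Spark or Prometheus code: the name layout (metrics_<appId>_<executorId>_<metric>) is only inferred from the sample output above, and the function name to_labeled and the label names app_id and executor_id are hypothetical.

```python
import re

# Assumed layout, inferred from the sample output above:
#   metrics_<appId>_<"driver" or executor number>_<rest of the metric name>
# The label names app_id / executor_id are hypothetical, chosen for illustration.
_NAME_RE = re.compile(
    r"^metrics_(?P<app_id>app_\d+_\d+)_(?P<executor_id>driver|\d+)_(?P<metric>.+)$"
)


def to_labeled(line: str) -> str:
    """Move the application ID and executor ID from the metric name into labels."""
    name, _, value = line.strip().partition(" ")
    match = _NAME_RE.match(name)
    if match is None:
        # Leave lines that do not follow the assumed layout untouched.
        return line
    app_id = match.group("app_id")
    executor_id = match.group("executor_id")
    metric = match.group("metric")
    labels = f'app_id="{app_id}",executor_id="{executor_id}"'
    return f"metrics_{metric}{{{labels}}} {value}"


if __name__ == "__main__":
    sample = "metrics_app_20190911211130_0000_driver_BlockManager_memory_maxMem_MB_Value 732"
    print(to_labeled(sample))
```

With names in this shape, the query above would no longer need a regex on __name__; it could select metrics_StreamingMetrics_streaming_totalProcessedRecords_Value directly and filter or aggregate by the app_id label.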
> Support metrics with tags
> -------------------------
>
> Key: SPARK-25929
> URL: https://issues.apache.org/jira/browse/SPARK-25929
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core
> Affects Versions: 2.3.2
> Reporter: John Zhuge
> Priority: Major
>
> For better integration with DBs that support tags/labels, e.g., InfluxDB,
> Prometheus, Atlas, etc.
> We should continue to support the current Graphite-style metrics.
> Dropwizard Metrics v5 supports tags. It has been in RC status since Feb.
> Currently
> `[5.0.0-rc2|https://github.com/dropwizard/metrics/releases/tag/v5.0.0-rc2]`
> is in Maven.