Aaron Tokhy created SPARK-12514:
-----------------------------------
Summary: Spark MetricsSystem can fill disks/cause OOMs when using
GangliaSink
Key: SPARK-12514
URL: https://issues.apache.org/jira/browse/SPARK-12514
Project: Spark
Issue Type: Bug
Components: Spark Core
Affects Versions: 1.5.2
Reporter: Aaron Tokhy
Priority: Minor
The MetricsSystem implementation in Spark generates a unique set of metric names
for each Spark application that has been submitted (to a YARN cluster, for
example). This can be problematic for certain metrics environments, such as
Ganglia.
This creates metric names that look like the following (for each submitted
application):
application_1450753701508_0001.driver.ExecutorAllocationManager.executors.numberAllExecutors
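The naming pattern above can be sketched as follows. The helper below is purely illustrative of how the application ID ends up as a prefix on every metric; it is not Spark's actual buildRegistryName implementation:

```java
// Illustrative sketch (not Spark's code): each metric name is composed by
// joining the per-application ID, the instance (e.g. "driver"), and the
// source's own metric name with dots. Because the application ID changes
// for every submission, every component of the cluster's metric namespace
// is regenerated per application.
public class MetricNameDemo {
    static String buildRegistryName(String appId, String instance, String sourceName) {
        // Dot-joined, mirroring the Codahale/Dropwizard MetricRegistry.name convention.
        return String.join(".", appId, instance, sourceName);
    }

    public static void main(String[] args) {
        String name = buildRegistryName(
            "application_1450753701508_0001",
            "driver",
            "ExecutorAllocationManager.executors.numberAllExecutors");
        System.out.println(name);
        // -> application_1450753701508_0001.driver.ExecutorAllocationManager.executors.numberAllExecutors
    }
}
```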
On Spark clusters where thousands of applications are submitted, the accumulated
metrics will eventually cause the Ganglia monitoring daemons (gmond) to reach
their memory limits, or the meta daemon (gmetad) to run out of disk space. This
is because some existing metrics systems do not expect new metric names to be
generated over the lifetime of a cluster.
Ganglia as a Spark metrics sink is one example of where the current
implementation runs into problems. Each application introduces a new set of
metrics in gmond and gmetad: the gmond aggregator's memory usage bloats over
time, and gmetad creates a new set of round-robin database (RRD) files for
every application's metrics. These RRD files are permanent and are never
deleted, so each new set of metrics leaves behind files that are never cleaned
up.
So the MetricsSystem may need to account for metrics sinks that cannot tolerate
an unbounded number of new metric names, and buildRegistryName would have to
behave differently in this case.
https://github.com/apache/spark/blob/d83c2f9f0b08d6d5d369d9fae04cdb15448e7f0d/core/src/main/scala/org/apache/spark/metrics/MetricsSystem.scala#L126
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]