Kostya Golikov created SPARK-18115:
--------------------------------------
Summary: Custom metrics Sink/Source prevent Executor from starting
Key: SPARK-18115
URL: https://issues.apache.org/jira/browse/SPARK-18115
Project: Spark
Issue Type: Bug
Components: Spark Core
Affects Versions: 1.6.0
Reporter: Kostya Golikov
Even though there is semi-official support for custom metrics, in practice
specifying either a custom sink or a custom source will lead to NoClassDefFound
errors on the executor side (while working fine on the driver side).
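For reference, custom sinks and sources are registered through conf/metrics.properties; the class names below are hypothetical placeholders for user code shipped in the application jar:

```properties
# register a user-defined sink and source for all instances
*.sink.custom.class=com.example.metrics.CustomSink
*.source.custom.class=com.example.metrics.CustomSource
```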
Initialization proceeds as follows:
1. CoarseGrainedExecutorBackend [prepares a SparkEnv for the
executor|https://github.com/apache/spark/blob/6ee28423ad1b2e6089b82af64a31d77d3552bb38/core/src/main/scala/org/apache/spark/executor/CoarseGrainedExecutorBackend.scala#L223]
2. SparkEnv [initializes
MetricsSystem|https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/SparkEnv.scala#L338-L351].
In the executor case it also starts it immediately
3. On [`.start()` MetricsSystem parses the configuration files and creates
instances of the sinks and
sources|https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/metrics/MetricsSystem.scala#L101-L102].
This is where the issue actually happens: it tries to instantiate classes
that are not available yet, because [jars and files are downloaded later, in
Executor|https://github.com/apache/spark/blob/6ee28423ad1b2e6089b82af64a31d77d3552bb38/core/src/main/scala/org/apache/spark/executor/Executor.scala#L257]
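The failure mode in step 3 can be sketched in isolation: reflectively loading a class that is not yet on the classpath fails, which is exactly what happens when MetricsSystem instantiates a user-defined sink before the user jar is fetched. The class name below is a hypothetical stand-in for user code:

```java
// Demonstrates the failure mode: MetricsSystem.start() instantiates
// sink/source classes reflectively. If the user jar has not been
// fetched yet, the class is absent from the executor's classpath
// and reflection fails.
public class ReflectiveLoadDemo {
    // Hypothetical class name standing in for a user-defined sink.
    static final String USER_SINK_CLASS = "com.example.metrics.CustomSink";

    static String tryLoad(String className) {
        try {
            Class.forName(className);
            return "loaded " + className;
        } catch (ClassNotFoundException e) {
            return "load failed: " + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryLoad(USER_SINK_CLASS));
    }
}
```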
One possible solution is to NOT start MetricsSystem this early, just as the
driver does, and instead postpone it until the jar with user-defined code has
been fetched and is available on the classpath.
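The proposed reordering can be sketched as follows; the method names are illustrative, not Spark's actual API:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the proposed ordering fix: fetch user jars first, then
// start the metrics system, so that reflective instantiation of
// user-defined sinks/sources can succeed.
public class DeferredStartSketch {
    final List<String> log = new ArrayList<>();

    void fetchUserJars() { log.add("jars fetched"); }     // classpath ready
    void startMetrics()  { log.add("metrics started"); }  // safe to reflect

    List<String> initExecutor() {
        fetchUserJars();   // download user jars/files first
        startMetrics();    // only then parse config and instantiate sinks
        return log;
    }
}
```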