LucaCanali commented on a change in pull request #24901: [SPARK-28091][CORE]
Extend Spark metrics system with user-defined metrics using executor plugins
URL: https://github.com/apache/spark/pull/24901#discussion_r319376590
##########
File path: core/src/main/java/org/apache/spark/ExecutorPlugin.java
##########
@@ -47,6 +48,17 @@
*/
default void init() {}
+ /**
+ * Initialize the executor plugins used to extend the Spark/Dropwizard metrics system.
+ *
+ * <p>Each executor will, during its initialization, invoke this method on each
+ * plugin provided in the spark.executor.metrics.plugins configuration.</p>
+ *
+ * <p>Plugins should register the data sources using the Dropwizard/codahale API</p>
+ *
+ */
+ default void init(MetricRegistry sourceMetricsRegistry) {}
Review comment:
Thanks @vanzin for looking at this. I'd be interested to hear about your
use case for this (executor plugins for extending the metrics system).
BTW, I'll take the occasion to add that over the summer we used this code a
few times for workload and performance measurements/tests and found it quite
useful, in particular for measuring I/O access time with some custom plugins
we wrote ( https://github.com/cerndb/SparkExecutorPlugins ) + custom I/O
instrumentation for S3, HDFS. I have also been thinking about adding some
additional instrumentation for CPU counters or network metrics, but have not
yet worked on that.
I agree that using one config for "normal" plugins and for metrics plugins
would reduce complexity and in general be preferable. I'd appreciate a few
more details on your proposed changes. I guess a very simple way to merge the
two plugin types would be to just pass sourceMetricsRegistry to the init code
of all plugins. This would be a breaking change from 2.4, but maybe
acceptable for Spark 3.0? I guess there are just a few people using executor
plugins in their current form? /cc @squito
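To make the merge idea concrete, here is a minimal sketch of what a single
plugin interface could look like if every plugin's init received the registry.
It is only an illustration under assumptions, not the code in this PR:
SketchRegistry is a hypothetical stand-in for Dropwizard's MetricRegistry (so
the example compiles without extra dependencies), and MergedExecutorPlugin,
PlainPlugin, ReadTimePlugin and the metric name are made up for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

// Hypothetical stand-in for com.codahale.metrics.MetricRegistry, kept tiny so
// the sketch is self-contained. It only stores named gauges.
final class SketchRegistry {
    private final Map<String, Supplier<Long>> gauges = new LinkedHashMap<>();

    void register(String name, Supplier<Long> gauge) {
        if (gauges.putIfAbsent(name, gauge) != null) {
            throw new IllegalArgumentException("A metric named " + name + " already exists");
        }
    }

    boolean contains(String name) {
        return gauges.containsKey(name);
    }

    long read(String name) {
        return gauges.get(name).get();
    }
}

// Hypothetical merged interface: one init signature for all plugins.
interface MergedExecutorPlugin {
    default void init(SketchRegistry sourceMetricsRegistry) {}

    default void shutdown() {}
}

// A "normal" plugin: it receives the registry but simply ignores it.
final class PlainPlugin implements MergedExecutorPlugin {
    boolean initialized = false;

    @Override
    public void init(SketchRegistry sourceMetricsRegistry) {
        initialized = true;
    }
}

// A metrics plugin: it registers a gauge backed by a counter it updates itself.
final class ReadTimePlugin implements MergedExecutorPlugin {
    final AtomicLong readTimeNanos = new AtomicLong();

    @Override
    public void init(SketchRegistry sourceMetricsRegistry) {
        sourceMetricsRegistry.register("hypotheticalReadTimeNanos", readTimeNanos::get);
    }
}

final class MergedPluginSketch {
    public static void main(String[] args) {
        SketchRegistry registry = new SketchRegistry();
        ReadTimePlugin metrics = new ReadTimePlugin();
        MergedExecutorPlugin[] plugins = {new PlainPlugin(), metrics};

        // The executor would call the same init on every configured plugin.
        for (MergedExecutorPlugin plugin : plugins) {
            plugin.init(registry);
        }

        metrics.readTimeNanos.addAndGet(1500L);
        System.out.println(registry.read("hypotheticalReadTimeNanos")); // prints 1500
    }
}
```

With this shape, plugins that do not care about metrics just leave the
parameter unused, so a single configuration entry could cover both kinds.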