Working on the new metrics framework (https://issues.apache.org/jira/browse/HADOOP-6728 and its sub-tasks), I'd like to invite some feedback from the community on how to evolve Hadoop metrics. Here is a quick summary of changes:

1. Simplified metrics instrumentation
   - Declarative annotations for common use cases
   - Narrower interfaces for advanced use cases
   - Automatic JMX access to all metrics
   - User code reduction in all cases
2. Extensive configuration and filtering options
3. Fix broken design/implementation for parallel/multi-backend use cases
4. More efficient (both space and time) design and implementation

These benefits come at the cost of breaking backward compatibility, both in end-user (ops) visible config file changes and in API changes for plugin developers.

It seems to me that we have the following paths (targeting 0.22) to transition to the new metrics framework. (No major issues were found in testing at Y's scale (a few thousand nodes), and the framework is being deployed to production.)

1. Remove the current o.a.h.metrics package completely and upgrade all metrics in Hadoop to use the new framework.
   - The current metrics APIs are marked as "Public/Evolving".
   - There are a few uses (notably HBase) outside Hadoop core (common, hdfs, mapreduce).
   - Upgrade/fix needs will be immediately apparent.
   - A simple band-aid would be an old hadoop metrics jar for external projects.
2. Deprecate the current metrics package and upgrade all metrics in Hadoop to use the new framework.
   - Hadoop core would need to use new config files.
   - External packages that depend on the old metrics framework continue working with old config files, without the benefits of the new framework.
   - Potential config/API confusion and support issues.
3. Deprecate the current metrics package and make all metrics work with both frameworks.
   - Framework switchable via a config variable at startup time.
   - External dependencies can work without modification.
   - More dev work to support the dual-framework setup.
   - More potential config/API confusion and support issues.
3a. Same as 3, except with an option to run both frameworks at the same time, with obviously more runtime overhead.

I prefer the first path, obviously.
OTOH, we could be convinced to take one of the other paths if there are really compelling reasons.

__Luke