prashantwason commented on code in PR #18179:
URL: https://github.com/apache/hudi/pull/18179#discussion_r2914897732


##########
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/common/HoodieSparkEngineContext.java:
##########
@@ -86,6 +89,12 @@ public class HoodieSparkEngineContext extends 
HoodieEngineContext {
   private final SQLContext sqlContext;
   private final Map<HoodieDataCacheKey, List<Integer>> cachedRddIds = new 
HashMap<>();
 
+  /**
+   * Map of all distributed registries created via getMetricRegistry().
+   * This map is passed to Spark executors to make the registries available 
there.
+   */
+  private static final Map<String, Registry> DISTRIBUTED_REGISTRY_MAP = new 
ConcurrentHashMap<>();

Review Comment:
   The \`DISTRIBUTED_REGISTRY_MAP\` is static and lives for the lifetime of the 
JVM. This is intentional — registries are long-lived singletons that correspond 
to Spark accumulators (which are also tied to the SparkContext lifetime). Since 
Hudi tables are typically long-lived in a Spark application, the number of 
entries is bounded by the number of distinct (tableName, registryName) pairs, 
which is small (currently 2 registries per table: \`HoodieWrapperFileSystem\` 
and \`HoodieWrapperFileSystemMetaFolder\`).
   
   If we need cleanup for cases where tables are dropped/removed, we could add 
a \`removeMetricRegistry()\` method in a follow-up, but for now the memory 
footprint is negligible.



##########
hudi-io/src/main/java/org/apache/hudi/common/metrics/Registry.java:
##########
@@ -67,10 +117,27 @@ static Registry getRegistry(String registryName, String 
clazz) {
    * @return {@link Map} of metrics name and value
    */
   static Map<String, Long> getAllMetrics(boolean flush, boolean 
prefixWithRegistryName) {
+    return getAllMetrics(flush, prefixWithRegistryName, Option.empty());
+  }
+
+  /**
+   * Get all registered metrics.
+   *
+   * If a Registry did not have a prefix in its name, the commonPrefix is 
pre-pended to its name.
+   *
+   * @param flush clear all metrics after this operation.
+   * @param prefixWithRegistryName prefix each metric name with the registry 
name.
+   * @param commonPrefix prefix to use if the registry name does not have a 
prefix itself.
+   * @return {@link Map} of metrics name and value
+   */
+  static Map<String, Long> getAllMetrics(boolean flush, boolean 
prefixWithRegistryName, Option<String> commonPrefix) {

Review Comment:
   Sure, happy to sync up. Let me know when works for you.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to