cshuo commented on code in PR #18690: URL: https://github.com/apache/hudi/pull/18690#discussion_r3199870872
########## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/metrics/FlinkMetricsUtils.java: ########## @@ -0,0 +1,124 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hudi.metrics; + +import org.apache.flink.metrics.MetricGroup; +import org.apache.hudi.metadata.HoodieBackedTableMetadata; + +import java.util.HashMap; +import java.util.Map; +import java.util.concurrent.atomic.AtomicReference; +import java.util.function.Supplier; + +/** + * Utils for Flink metrics registration. + */ +public class FlinkMetricsUtils { + + /** + * Registers metadata table metrics with Flink for the first time. + * Use the three-argument overload when the metadata table may be reloaded, to avoid + * duplicate gauge registration. + */ + public static void registerMetadataTableMetrics(HoodieBackedTableMetadata metadataTable, MetricGroup metricGroup) { + registerMetadataTableMetrics(metadataTable, metricGroup, null); + } + + /** + * Registers or updates metadata table metrics in Flink's metric group. + * + * <p>On the first call ({@code handles} is {@code null} or empty) each metric name is + * registered once with Flink as a gauge whose value is read through an + * {@link AtomicReference}. On subsequent calls the references are swapped to point at + * the new metric objects from the reloaded metadata table, so Flink never sees a + * duplicate registration. + * + * @param metadataTable the (possibly reloaded) metadata table + * @param metricGroup Flink metric group to register gauges into + * @param handles map returned by a previous call, or {@code null} for the first call + * @return updated handles map; pass it back on the next invocation + */ + public static Map<String, AtomicReference<Supplier<Object>>> registerMetadataTableMetrics( + HoodieBackedTableMetadata metadataTable, + MetricGroup metricGroup, + Map<String, AtomicReference<Supplier<Object>>> handles) { + + Map<String, AtomicReference<Supplier<Object>>> result = handles != null ? handles : new HashMap<>(); + if (metadataTable == null) { + return result; + } + + metadataTable.getMetrics().ifPresent(m -> { Review Comment: If the metrics are already exposed in hudi metrics, can the metrics data directly used in monitoring tool? ########## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/partitioner/index/RecordLevelIndexBackend.java: ########## @@ -118,6 +125,7 @@ public void onCheckpointComplete(Correspondent correspondent, long completedChec recordIndexCache.markAsEvictable(inflightInstants.keySet().stream().min(Long::compareTo).orElse(completedCheckpointID)); this.metaClient.reloadActiveTimeline(); reloadMetadataTable(); + registerMetricsInternal(); Review Comment: Could this be simplified by registering the Flink gauges only once and having each gauge read through a `Supplier<HoodieBackedTableMetadata>`? Then `RecordLevelIndexBackend` would only need to update `this.metadataTable` on reload; the existing gauges would naturally read from the latest table without carrying metricsHandles or calling `registerMetricsInternal()` after every checkpoint. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
