hudi-agent commented on code in PR #18803:
URL: https://github.com/apache/hudi/pull/18803#discussion_r3281567418
##########
hudi-client/hudi-java-client/src/test/java/org/apache/hudi/testutils/TestHoodieMetadataBase.java:
##########
@@ -308,6 +308,7 @@ protected HoodieWriteConfig.Builder
getWriteConfigBuilder(HoodieFailedWritesClea
.enable(useFileListingMetadata)
.withMetadataIndexColumnStats(false)
.enableMetrics(enableMetrics)
+ .enableDetailedMetadataMetrics(enableMetrics)
Review Comment:
🤖 There's a parallel `TestHoodieMetadataBase` under
`hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/client/functional/TestHoodieMetadataBase.java`
(line ~346) that also wires `.enableMetrics(enableMetrics)` but isn't updated
to set `.enableDetailedMetadataMetrics(...)`. As a result,
`TestHoodieBackedMetadata.testMetadataMetrics()` (which asserts on
`STAT_COUNT_BASE_FILES`, `STAT_TOTAL_BASE_FILE_SIZE`, etc.) will fail since
`updateSizeMetrics` is now skipped. Could you mirror the same enable in the
Spark base?
<sub><i>- AI-generated; verify before applying. React 👍/👎 to flag
quality.</i></sub>
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java:
##########
@@ -1959,7 +1959,9 @@ protected void commitInternal(String instantTime,
Map<String, HoodieData<HoodieR
metadataMetaClient.reloadActiveTimeline();
// Update total size of the metadata and count of base/log files
- metrics.ifPresent(m -> m.updateSizeMetrics(metadataMetaClient, metadata,
dataMetaClient.getTableConfig().getMetadataPartitions()));
+ if (metrics.isPresent() &&
metrics.get().shouldEnableDetailedMetadataMetrics()) {
Review Comment:
🤖 nit: since we're already dealing with an `Optional`, could you use
`metrics.filter(m -> m.shouldEnableDetailedMetadataMetrics()).ifPresent(m ->
m.updateSizeMetrics(...))` here? It avoids the double `get()` and keeps the
same idiomatic style as the original `ifPresent` call.
<sub><i>- AI-generated; verify before applying. React 👍/👎 to flag
quality.</i></sub>
##########
hudi-common/src/main/java/org/apache/hudi/common/config/HoodieMetadataConfig.java:
##########
@@ -683,6 +683,13 @@ public final class HoodieMetadataConfig extends
HoodieConfig {
+ "with the actual record count stored in the metadata table. This
validation runs in a distributed manner "
+ "using the compute engine. Disabled by default as it adds overhead
to the initialization process.");
+ public static final ConfigProperty<Boolean> ENABLE_DETAILED_METRICS =
ConfigProperty
+ .key(METADATA_PREFIX + ".enable.detailed.metrics")
+ .defaultValue(false)
+ .markAdvanced()
Review Comment:
🤖 `sinceVersion("0.14.2")` looks off — nearby configs use 1.1.0/1.2.0 and
the current dev version is 1.3.0-SNAPSHOT. Could you bump this to whatever
release this is targeting?
<sub><i>- AI-generated; verify before applying. React 👍/👎 to flag
quality.</i></sub>
##########
hudi-common/src/main/java/org/apache/hudi/metadata/HoodieMetadataMetrics.java:
##########
@@ -101,6 +103,10 @@ public Map<String, String> getStats(boolean detailed,
HoodieTableMetaClient meta
}
}
+ public boolean shouldEnableDetailedMetadataMetrics() {
Review Comment:
🤖 nit: `shouldEnableDetailedMetadataMetrics()` reads more like a
decision-making/policy method than a simple boolean getter. Since the flag is
already stored, `isDetailedMetricsEnabled()` would be more idiomatic — and
since this is already in `HoodieMetadataMetrics`, dropping the redundant
`MetadataMetrics` suffix keeps it concise.
<sub><i>- AI-generated; verify before applying. React 👍/👎 to flag
quality.</i></sub>
##########
hudi-common/src/main/java/org/apache/hudi/metadata/HoodieMetadataMetrics.java:
##########
@@ -85,10 +85,12 @@ public class HoodieMetadataMetrics implements Serializable {
private final transient MetricRegistry metricsRegistry;
private final transient Metrics metrics;
+ private final boolean enabledDetailedMetadataMetrics;
Review Comment:
🤖 nit: `enabledDetailedMetadataMetrics` reads as past-participle ("was
enabled") rather than a clear boolean predicate. Something like
`detailedMetricsEnabled` would be more conventional for a boolean field and
align with typical Java naming (e.g. `metricsEnabled`, `detailedEnabled`).
<sub><i>- AI-generated; verify before applying. React 👍/👎 to flag
quality.</i></sub>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]