wombatu-kun commented on code in PR #18826:
URL: https://github.com/apache/hudi/pull/18826#discussion_r3411494734
##########
hudi-spark-datasource/hudi-spark/src/test/java/org/apache/hudi/functional/TestHoodieBackedMetadata.java:
##########
@@ -4379,6 +4379,104 @@ private void changeTableVersion(HoodieTableVersion version) throws IOException {
}
}
+ /**
+ * Validates that RLI initialization estimates file group count from base file footer metadata
+ * (instead of materializing and counting records) when min != max file group count.
+ */
+ @ParameterizedTest
+ @EnumSource(HoodieTableType.class)
+ public void testRecordIndexFileGroupEstimation(HoodieTableType tableType) throws Exception {
Review Comment:
testRecordIndexWithFixedFileGroupCount cannot exercise
estimateRecordCountFromBaseFiles at all: with min == max, initialization takes
the fixed-count branch, which bypasses the estimator, so the test only proves
the bypass.

For testRecordIndexFileGroupEstimation, the record-level min/max default to
(1, 10), and 200 small records under the default record.index.max.filegroup.size
land at fileGroupCount = 1, so the [1, 10] assertion holds regardless of the
footer count. To make the count drive the result, set a small
withRecordIndexMaxFileGroupSizeBytes (a few KB) and assert that fileGroupCount
is > 1 and < max, so that a regression to 0 or a wrong count fails the test.
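
To illustrate the arithmetic, here is a rough sketch under a simplified model: the file group count is the estimated total bytes divided by the max file group size, rounded up and clamped to [min, max]. This is not Hudi's actual estimator. The class name, the 48-byte average record size, and the 1 GiB "large" size are hypothetical values chosen only for illustration.

```java
// Simplified, hypothetical model of file group sizing; NOT Hudi's implementation.
public class FileGroupEstimateSketch {

  // Assumed model: ceil(recordCount * avgRecordSizeBytes / maxFileGroupSizeBytes),
  // clamped to the inclusive range [minCount, maxCount].
  static int estimate(long recordCount, long avgRecordSizeBytes,
                      long maxFileGroupSizeBytes, int minCount, int maxCount) {
    long totalBytes = recordCount * avgRecordSizeBytes;
    // Integer ceiling division; yields 0 when totalBytes is 0.
    long needed = (totalBytes + maxFileGroupSizeBytes - 1) / maxFileGroupSizeBytes;
    return (int) Math.max(minCount, Math.min(maxCount, needed));
  }

  public static void main(String[] args) {
    long avgRecordSize = 48L;          // illustrative assumption
    long largeSize = 1024L * 1024 * 1024; // illustrative "large default" (1 GiB)
    long smallSize = 2L * 1024;        // "a few KB", as suggested in the comment

    // With a large max size, a correct count (200) and a regressed count (0)
    // clamp to the same value, so a range assertion cannot tell them apart.
    System.out.println("large, count=200: " + estimate(200, avgRecordSize, largeSize, 1, 10));
    System.out.println("large, count=0:   " + estimate(0, avgRecordSize, largeSize, 1, 10));

    // With a small max size, the record count determines the result.
    System.out.println("small, count=200: " + estimate(200, avgRecordSize, smallSize, 1, 10));
    System.out.println("small, count=0:   " + estimate(0, avgRecordSize, smallSize, 1, 10));
  }
}
```

In this model both large-size cases return the same value, while the small-size cases differ, which is why an assertion of strictly greater than 1 and strictly less than max would catch a regressed count.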