vinothchandar commented on a change in pull request #2494:
URL: https://github.com/apache/hudi/pull/2494#discussion_r566564182
##########
File path:
hudi-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadata.java
##########
@@ -188,41 +196,51 @@ private synchronized void openFileSliceIfNeeded() throws
IOException {
// Load the schema
Schema schema =
HoodieAvroUtils.addMetadataFields(HoodieMetadataRecord.getClassSchema());
- logRecordScanner = new
HoodieMetadataMergedLogRecordScanner(metaClient.getFs(), metadataBasePath,
- logFilePaths, schema, latestMetaInstantTimestamp,
MAX_MEMORY_SIZE_IN_BYTES, BUFFER_SIZE,
+ HoodieMetadataMergedLogRecordScanner logRecordScanner = new
HoodieMetadataMergedLogRecordScanner(metaClient.getFs(),
+ metadataBasePath, logFilePaths, schema,
latestMetaInstantTimestamp, MAX_MEMORY_SIZE_IN_BYTES, BUFFER_SIZE,
spillableMapDirectory, null);
LOG.info("Opened metadata log files from " + logFilePaths + " at instant "
+ latestInstantTime
+ "(dataset instant=" + latestInstantTime + ", metadata instant=" +
latestMetaInstantTimestamp + ")");
metrics.ifPresent(metrics ->
metrics.updateMetrics(HoodieMetadataMetrics.SCAN_STR, timer.endTimer()));
+
+ if (metadataConfig.enableReuse()) {
+ // cache for later reuse
+ cachedBaseFileReader = baseFileReader;
+ cachedLogRecordScanner = logRecordScanner;
+ }
+
+ return Pair.of(baseFileReader, logRecordScanner);
}
- private void closeIfNeeded() {
+ private void closeIfNeeded(Pair<HoodieFileReader,
HoodieMetadataMergedLogRecordScanner> readers) {
Review comment:
I prefer the previous approach for readability. i.e have it just close
the member variables. If you disagree, may be we can chat about why you think
this is better for reading. I did face this issue when I originally read the
code here
##########
File path:
hudi-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadata.java
##########
@@ -188,41 +196,51 @@ private synchronized void openFileSliceIfNeeded() throws
IOException {
// Load the schema
Schema schema =
HoodieAvroUtils.addMetadataFields(HoodieMetadataRecord.getClassSchema());
- logRecordScanner = new
HoodieMetadataMergedLogRecordScanner(metaClient.getFs(), metadataBasePath,
- logFilePaths, schema, latestMetaInstantTimestamp,
MAX_MEMORY_SIZE_IN_BYTES, BUFFER_SIZE,
+ HoodieMetadataMergedLogRecordScanner logRecordScanner = new
HoodieMetadataMergedLogRecordScanner(metaClient.getFs(),
+ metadataBasePath, logFilePaths, schema,
latestMetaInstantTimestamp, MAX_MEMORY_SIZE_IN_BYTES, BUFFER_SIZE,
spillableMapDirectory, null);
LOG.info("Opened metadata log files from " + logFilePaths + " at instant "
+ latestInstantTime
+ "(dataset instant=" + latestInstantTime + ", metadata instant=" +
latestMetaInstantTimestamp + ")");
metrics.ifPresent(metrics ->
metrics.updateMetrics(HoodieMetadataMetrics.SCAN_STR, timer.endTimer()));
+
+ if (metadataConfig.enableReuse()) {
Review comment:
We are sort of creating two code paths again for reuse and non-reuse.
Can we please go back to just always initializing the member variable here
always?
then `closeIfNeeded()` can continue to work with members alone.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]