nsivabalan commented on a change in pull request #2494:
URL: https://github.com/apache/hudi/pull/2494#discussion_r567237570
##########
File path:
hudi-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadata.java
##########
@@ -188,41 +196,51 @@ private synchronized void openFileSliceIfNeeded() throws IOException {
// Load the schema
Schema schema = HoodieAvroUtils.addMetadataFields(HoodieMetadataRecord.getClassSchema());
- logRecordScanner = new HoodieMetadataMergedLogRecordScanner(metaClient.getFs(), metadataBasePath,
- logFilePaths, schema, latestMetaInstantTimestamp, MAX_MEMORY_SIZE_IN_BYTES, BUFFER_SIZE,
+ HoodieMetadataMergedLogRecordScanner logRecordScanner = new HoodieMetadataMergedLogRecordScanner(metaClient.getFs(),
+ metadataBasePath, logFilePaths, schema, latestMetaInstantTimestamp, MAX_MEMORY_SIZE_IN_BYTES, BUFFER_SIZE,
spillableMapDirectory, null);
LOG.info("Opened metadata log files from " + logFilePaths + " at instant " + latestInstantTime
+ "(dataset instant=" + latestInstantTime + ", metadata instant=" + latestMetaInstantTimestamp + ")");
metrics.ifPresent(metrics -> metrics.updateMetrics(HoodieMetadataMetrics.SCAN_STR, timer.endTimer()));
+
+ if (metadataConfig.enableReuse()) {
+ // cache for later reuse
+ cachedBaseFileReader = baseFileReader;
+ cachedLogRecordScanner = logRecordScanner;
+ }
+
+ return Pair.of(baseFileReader, logRecordScanner);
}
- private void closeIfNeeded() {
+ private void closeIfNeeded(Pair<HoodieFileReader, HoodieMetadataMergedLogRecordScanner> readers) {
try {
if (!metadataConfig.enableReuse()) {
- close();
+ readers.getKey().close();
Review comment:
I'm sure you have tried it, but I'm not sure why we can't achieve this
without two different sets of variables for the two code paths. Can't we have
just one set of reader variables: one that gets closed and reopened every time
(if reuse is not enabled), or the same one being reused (if the config is
enabled)?
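
To make the suggestion concrete, here is a minimal sketch of the "one set of
reader variables" idea. It is not the Hudi implementation: `FakeReader`,
`MetadataLookup`, and the `opens`/`closes` counters are hypothetical stand-ins,
and the sketch assumes all access goes through synchronized methods, so it says
nothing about whether concurrent readers could share the field safely.

```java
/** Hypothetical sketch: a single reader field serves both the reuse and no-reuse paths. */
class SingleReaderSketch {

  /** Stand-in for a base file reader or log record scanner (not a Hudi class). */
  static class FakeReader implements AutoCloseable {
    String lookup(String key) {
      return "value-for-" + key;
    }

    @Override
    public void close() {
      // A real reader would release file handles and spillable maps here.
    }
  }

  static class MetadataLookup {
    private final boolean reuse; // plays the role of metadataConfig.enableReuse()
    private FakeReader reader;   // the only reader variable, used by both code paths
    int opens = 0;               // counters exist only to make the behavior observable
    int closes = 0;

    MetadataLookup(boolean reuse) {
      this.reuse = reuse;
    }

    /** Opens the reader only if no open reader is currently held. */
    private void openIfNeeded() {
      if (reader == null) {
        reader = new FakeReader();
        opens++;
      }
    }

    /** Closes after each lookup unless reuse is enabled. */
    private void closeIfNeeded() {
      if (!reuse) {
        close();
      }
    }

    synchronized String get(String key) {
      openIfNeeded();
      try {
        return reader.lookup(key);
      } finally {
        closeIfNeeded();
      }
    }

    /** Explicit close; needed by the caller when reuse is enabled. */
    synchronized void close() {
      if (reader != null) {
        reader.close();
        reader = null;
        closes++;
      }
    }
  }
}
```

With `reuse == false`, every `get` opens and closes the reader; with
`reuse == true`, the reader is opened once and stays open until `close()` is
called explicitly. The cost of this shape is that every lookup is serialized on
the one shared field.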