alexeykudinkin commented on code in PR #7642:
URL: https://github.com/apache/hudi/pull/7642#discussion_r1087254735
##########
hudi-common/src/main/java/org/apache/hudi/metadata/BaseTableMetadata.java:
##########
@@ -227,7 +235,11 @@ public Map<Pair<String, String>, BloomFilter>
getBloomFilters(final List<Pair<St
if (bloomFilterMetadata.isPresent()) {
if (!bloomFilterMetadata.get().getIsDeleted()) {
ValidationUtils.checkState(fileToKeyMap.containsKey(entry.getLeft()));
- final ByteBuffer bloomFilterByteBuffer =
bloomFilterMetadata.get().getBloomFilter();
+ // NOTE: We have to duplicate the [[ByteBuffer]] object here since:
+ // - Reading out [[ByteBuffer]] mutates its state
+ // - [[BloomFilterMetadata]] could be re-used, and hence
have to stay immutable
+ final ByteBuffer bloomFilterByteBuffer =
+ bloomFilterMetadata.get().getBloomFilter().duplicate();
Review Comment:
Each file has unique Bloom Filter but we're caching the records now
##########
hudi-common/src/main/java/org/apache/hudi/metadata/BaseTableMetadata.java:
##########
@@ -123,8 +126,10 @@ public List<String> getAllPartitionPaths() throws
IOException {
throw new HoodieMetadataException("Failed to retrieve list of
partition from metadata", e);
}
}
- return new FileSystemBackedTableMetadata(getEngineContext(), hadoopConf,
dataBasePath.toString(),
- metadataConfig.shouldAssumeDatePartitioning()).getAllPartitionPaths();
+
+ FileSystemBackedTableMetadata fileSystemBackedTableMetadata =
+ createFileSystemBackedTableMetadata();
+ return fileSystemBackedTableMetadata.getAllPartitionPaths();
Review Comment:
This changes are to avoid duplication (needed to update the way we fetch
`hadoopConf`)
##########
hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieAvroFileReaderBase.java:
##########
@@ -37,7 +37,7 @@ abstract class HoodieAvroFileReaderBase implements
HoodieAvroFileReader {
@Override
public ClosableIterator<HoodieRecord<IndexedRecord>>
getRecordIterator(Schema readerSchema, Schema requestedSchema) throws
IOException {
ClosableIterator<IndexedRecord> iterator =
getIndexedRecordIterator(readerSchema, requestedSchema);
- return new MappingIterator<>(iterator, data -> unsafeCast(new
HoodieAvroIndexedRecord(data)));
+ return new CloseableMappingIterator<>(iterator, data -> unsafeCast(new
HoodieAvroIndexedRecord(data)));
Review Comment:
I think i know how this crippled in. Let me clean it out
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]