danny0405 commented on code in PR #9037:
URL: https://github.com/apache/hudi/pull/9037#discussion_r1238242029
##########
hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieAvroHFileReader.java:
##########
@@ -120,20 +120,14 @@ public HoodieAvroHFileReader(Path path, HFile.Reader reader, Option<Schema> sche
.orElseGet(() -> Lazy.lazily(() -> fetchSchema(reader)));
}
- @Override
- public Option<HoodieRecord<IndexedRecord>> getRecordByKey(String key, Schema readerSchema) throws IOException {
- synchronized (sharedScannerLock) {
- return fetchRecordByKeyInternal(sharedScanner, key, getSchema(), readerSchema)
- .map(data -> unsafeCast(new HoodieAvroIndexedRecord(data)));
- }
- }
@Override
public ClosableIterator<HoodieRecord<IndexedRecord>> getRecordsByKeysIterator(List<String> keys, Schema schema) throws IOException {
// We're caching blocks for this scanner to minimize amount of traffic
// to the underlying storage as we fetched (potentially) sparsely distributed
// keys
HFileScanner scanner = getHFileScanner(reader, true);
+ scanner.seekTo(); // places the cursor at the beginning of the first data block.
ClosableIterator<IndexedRecord> iterator = new RecordByKeyIterator(scanner, keys, getSchema(), schema);
Review Comment:
Should we always invoke `scanner.seekTo()` inside `getHFileScanner`,
right after we get the scanner from the reader?
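
To illustrate the idea, here is a minimal, self-contained sketch. `ListScanner` and `openScanner` are hypothetical stand-ins written only for this example; they are not the HBase `HFileScanner` interface or Hudi's actual `getHFileScanner`, whose signatures may differ. The point is only that the factory positions the cursor, so callers cannot forget to do it.

```java
import java.util.Arrays;
import java.util.List;

public class SeekOnCreateSketch {

  /** Hypothetical stand-in for an HFile-style scanner (NOT the HBase interface). */
  static final class ListScanner {
    private final List<String> keys;
    private int position = -1; // -1 means the cursor has not been positioned yet

    ListScanner(List<String> keys) {
      this.keys = keys;
    }

    /** Places the cursor on the first entry; returns false when there is no data. */
    boolean seekTo() {
      if (keys.isEmpty()) {
        return false;
      }
      position = 0;
      return true;
    }

    boolean isSeeked() {
      return position >= 0;
    }

    String currentKey() {
      if (!isSeeked()) {
        throw new IllegalStateException("scanner is not positioned");
      }
      return keys.get(position);
    }
  }

  /**
   * Hypothetical factory playing the role of getHFileScanner: it creates the
   * scanner and seeks once, so every caller receives a positioned scanner.
   */
  static ListScanner openScanner(List<String> keys) {
    ListScanner scanner = new ListScanner(keys);
    scanner.seekTo(); // done here instead of at each call site
    return scanner;
  }

  public static void main(String[] args) {
    ListScanner scanner = openScanner(Arrays.asList("key1", "key2"));
    System.out.println(scanner.currentKey()); // prints "key1"
  }
}
```

With this shape, call sites such as `getRecordsByKeysIterator` would not need their own `seekTo()` call. Whether an unconditional seek is acceptable for every caller of the real method is something the PR would still need to confirm.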