danny0405 commented on code in PR #9037:
URL: https://github.com/apache/hudi/pull/9037#discussion_r1239266143
##########
hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieAvroHFileReader.java:
##########
@@ -120,20 +120,14 @@ public HoodieAvroHFileReader(Path path, HFile.Reader reader, Option<Schema> sche
         .orElseGet(() -> Lazy.lazily(() -> fetchSchema(reader)));
   }
-  @Override
-  public Option<HoodieRecord<IndexedRecord>> getRecordByKey(String key, Schema readerSchema) throws IOException {
-    synchronized (sharedScannerLock) {
-      return fetchRecordByKeyInternal(sharedScanner, key, getSchema(), readerSchema)
-          .map(data -> unsafeCast(new HoodieAvroIndexedRecord(data)));
-    }
-  }
   @Override
   public ClosableIterator<HoodieRecord<IndexedRecord>> getRecordsByKeysIterator(List<String> keys, Schema schema) throws IOException {
     // We're caching blocks for this scanner to minimize amount of traffic
     // to the underlying storage as we fetched (potentially) sparsely distributed
     // keys
     HFileScanner scanner = getHFileScanner(reader, true);
+    scanner.seekTo(); // places the cursor at the beginning of the first data block.
     ClosableIterator<IndexedRecord> iterator = new RecordByKeyIterator(scanner, keys, getSchema(), schema);
Review Comment:
For a full scan, should we always seek to the start again? I just think it is
hard to maintain the `seekTo` call in every caller.
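
To illustrate the concern, here is a minimal sketch in which the iterator rewinds the scanner itself, so no caller has to remember to do it. `ToyScanner`, `seekToStart`, and `FullScanIterator` are hypothetical stand-ins invented for this example; they are not the HBase `HFileScanner` API or the PR's `RecordByKeyIterator`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class SeekOnOpenSketch {

    /** Hypothetical cursor-based scanner; NOT the HBase HFileScanner API. */
    static final class ToyScanner {
        private final List<String> rows;
        private int pos = -1; // -1 means the cursor is not positioned yet

        ToyScanner(List<String> rows) {
            this.rows = rows;
        }

        /** Moves the cursor to the first row; returns false if there are no rows. */
        boolean seekToStart() {
            pos = rows.isEmpty() ? -1 : 0;
            return pos == 0;
        }

        boolean isPositioned() {
            return pos >= 0 && pos < rows.size();
        }

        String current() {
            if (!isPositioned()) {
                throw new IllegalStateException("scanner is not positioned on a row");
            }
            return rows.get(pos);
        }

        void advance() {
            pos++;
        }
    }

    /** Hypothetical full-scan iterator that owns the rewind step. */
    static final class FullScanIterator implements Iterator<String> {
        private final ToyScanner scanner;
        private boolean hasRow;

        FullScanIterator(ToyScanner scanner) {
            this.scanner = scanner;
            // The seek lives here, once, instead of in every call site.
            this.hasRow = scanner.seekToStart();
        }

        @Override
        public boolean hasNext() {
            return hasRow;
        }

        @Override
        public String next() {
            if (!hasRow) {
                throw new NoSuchElementException();
            }
            String row = scanner.current();
            scanner.advance();
            hasRow = scanner.isPositioned();
            return row;
        }
    }

    /** Collects every remaining element of the iterator into a list. */
    static List<String> drain(Iterator<String> it) {
        List<String> out = new ArrayList<>();
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    public static void main(String[] args) {
        ToyScanner scanner = new ToyScanner(Arrays.asList("k1", "k2", "k3"));
        // Reusing the same scanner still yields a full scan each time.
        System.out.println(drain(new FullScanIterator(scanner)));
        System.out.println(drain(new FullScanIterator(scanner)));
    }
}
```

Whether the real iterator can own the seek depends on how the scanner is shared between callers, which this sketch does not model.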