nsivabalan commented on code in PR #9037:
URL: https://github.com/apache/hudi/pull/9037#discussion_r1239107332
##########
hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieAvroHFileReader.java:
##########
@@ -120,20 +120,14 @@ public HoodieAvroHFileReader(Path path, HFile.Reader reader, Option<Schema> sche
.orElseGet(() -> Lazy.lazily(() -> fetchSchema(reader)));
}
- @Override
- public Option<HoodieRecord<IndexedRecord>> getRecordByKey(String key, Schema readerSchema) throws IOException {
- synchronized (sharedScannerLock) {
- return fetchRecordByKeyInternal(sharedScanner, key, getSchema(), readerSchema)
- .map(data -> unsafeCast(new HoodieAvroIndexedRecord(data)));
- }
- }
@Override
public ClosableIterator<HoodieRecord<IndexedRecord>> getRecordsByKeysIterator(List<String> keys, Schema schema) throws IOException {
// We're caching blocks for this scanner to minimize amount of traffic
// to the underlying storage as we fetched (potentially) sparsely distributed
// keys
HFileScanner scanner = getHFileScanner(reader, true);
+ scanner.seekTo(); // places the cursor at the beginning of the first data block.
ClosableIterator<IndexedRecord> iterator = new RecordByKeyIterator(scanner, keys, getSchema(), schema);
Review Comment:
Some callers do a full scan, while others do on-demand or prefix-based lookups.
I don't want to touch the full-scan path; I'm only optimizing the on-demand
flows, where we will reseek. So I'd prefer to keep the `seekTo()` call at the caller.
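
A minimal sketch of that split, under stated assumptions: `ToyScanner`, `seekToFirst`, `reseekTo`, and `lookupKeys` are hypothetical names invented for this illustration and are not the HBase `HFileScanner` API or Hudi code. It assumes the lookup keys arrive sorted, and shows the on-demand caller positioning the cursor once and then only moving forward, while any full-scan code would be left as it is.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CallerSeekSketch {

    // Hypothetical forward-only cursor over sorted keys (a stand-in, not HFileScanner).
    static final class ToyScanner {
        private final List<String> sortedKeys;
        private int pos = -1; // -1 means the cursor has not been positioned yet

        ToyScanner(List<String> sortedKeys) {
            this.sortedKeys = sortedKeys;
        }

        // Places the cursor on the first entry; returns false if there are no entries.
        boolean seekToFirst() {
            if (sortedKeys.isEmpty()) {
                return false;
            }
            pos = 0;
            return true;
        }

        // Moves forward to the first entry >= key; never moves backwards.
        // Returns true only on an exact match.
        boolean reseekTo(String key) {
            if (pos < 0) {
                throw new IllegalStateException("scanner not positioned");
            }
            while (pos < sortedKeys.size() && sortedKeys.get(pos).compareTo(key) < 0) {
                pos++;
            }
            return pos < sortedKeys.size() && sortedKeys.get(pos).equals(key);
        }
    }

    // On-demand path: the caller positions the scanner once, then each lookup reseeks forward.
    static List<String> lookupKeys(ToyScanner scanner, List<String> sortedLookupKeys) {
        List<String> found = new ArrayList<>();
        if (!scanner.seekToFirst()) {
            return found; // empty input, nothing to look up
        }
        for (String key : sortedLookupKeys) {
            if (scanner.reseekTo(key)) {
                found.add(key);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        ToyScanner scanner = new ToyScanner(Arrays.asList("a", "c", "e", "g"));
        System.out.println(lookupKeys(scanner, Arrays.asList("c", "d", "g"))); // prints [c, g]
    }
}
```

Because the initial positioning lives in `lookupKeys`, a separate full-scan routine could keep its own positioning logic without being affected by this change.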