prashantwason commented on a change in pull request #2494:
URL: https://github.com/apache/hudi/pull/2494#discussion_r565555570
##########
File path:
hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieHFileReader.java
##########
@@ -209,12 +212,23 @@ public R next() {
@Override
public Option getRecordByKey(String key, Schema readerSchema) throws
IOException {
- HFileScanner scanner = reader.getScanner(false, true);
+ byte[] value = null;
KeyValue kv = new KeyValue(key.getBytes(), null, null, null);
- if (scanner.seekTo(kv) == 0) {
- Cell c = scanner.getKeyValue();
- byte[] keyBytes = Arrays.copyOfRange(c.getRowArray(), c.getRowOffset(),
c.getRowOffset() + c.getRowLength());
- R record = getRecordFromCell(c, getSchema(), readerSchema);
+
+ synchronized (this) {
Review comment:
The concurrent calls happen when the TimelineService is in use.
TimelineService is built on Javalin/Jetty, which uses a synchronous
thread-per-request model: multiple threads call into the app's router in
parallel to handle HTTP requests.
Within org.apache.hudi.timeline.service.FileSystemViewHandler, we handle
the remote calls with code similar to:
viewManager.getFileSystemView(basePath).getLatestXXX(...)
When the Metadata Table is enabled, the file system view should be a
HoodieMetadataFileSystemView, which holds a HoodieTableMetadata as a class
variable, so a single instance is shared by all request threads.
Hence HoodieTableMetadata.getAllFilesInPartition() will be called
concurrently on multiple threads. That function internally calls
getRecordByKey().
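
To illustrate the call pattern described above, here is a minimal, self-contained sketch. It is not Hudi code: StatefulReader and runConcurrentLookups are hypothetical names, a TreeMap stands in for the HFile, and a fixed thread pool stands in for Jetty's request threads. It only shows why a seek-then-read pair on shared cursor state needs to be atomic when one reader instance is shared across threads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class SharedReaderSketch {

  /** Hypothetical stand-in for a reader whose lookup mutates a shared cursor. */
  static class StatefulReader {
    private final TreeMap<String, String> data = new TreeMap<>();
    private String cursor; // shared mutable position, analogous to scanner seek state

    StatefulReader(Map<String, String> records) {
      data.putAll(records);
    }

    /** Seek and read are two steps on shared state, so they run under one lock. */
    String getRecordByKey(String key) {
      synchronized (this) {
        cursor = data.ceilingKey(key);   // "seek" step: move the shared cursor
        if (key.equals(cursor)) {
          return data.get(cursor);       // "read" step: read at the cursor
        }
        return null;                     // key not present
      }
    }
  }

  /** Simulates thread-per-request handlers sharing one reader; returns the mismatch count. */
  static int runConcurrentLookups(int threads, int lookupsPerThread) {
    Map<String, String> records = new TreeMap<>();
    for (int i = 0; i < 100; i++) {
      records.put("key" + i, "value" + i);
    }
    StatefulReader reader = new StatefulReader(records); // one instance, shared by all threads
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<Integer>> results = new ArrayList<>();
      for (int t = 0; t < threads; t++) {
        final int offset = t;
        results.add(pool.submit(() -> {
          int mismatches = 0;
          for (int i = 0; i < lookupsPerThread; i++) {
            int n = (i + offset) % 100;
            String value = reader.getRecordByKey("key" + n);
            if (!("value" + n).equals(value)) {
              mismatches++;
            }
          }
          return mismatches;
        }));
      }
      int total = 0;
      for (Future<Integer> f : results) {
        total += f.get();
      }
      return total;
    } catch (InterruptedException | ExecutionException e) {
      throw new IllegalStateException(e);
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    System.out.println("mismatches: " + runConcurrentLookups(8, 10_000));
  }
}
```

In this sketch, removing the synchronized block lets one thread overwrite the cursor between another thread's seek and read, so lookups can intermittently return the wrong record or null.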