prashantwason commented on a change in pull request #2494:
URL: https://github.com/apache/hudi/pull/2494#discussion_r565555570



##########
File path: 
hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieHFileReader.java
##########
@@ -209,12 +212,23 @@ public R next() {
 
   @Override
   public Option getRecordByKey(String key, Schema readerSchema) throws IOException {
-    HFileScanner scanner = reader.getScanner(false, true);
+    byte[] value = null;
     KeyValue kv = new KeyValue(key.getBytes(), null, null, null);
-    if (scanner.seekTo(kv) == 0) {
-      Cell c = scanner.getKeyValue();
-      byte[] keyBytes = Arrays.copyOfRange(c.getRowArray(), c.getRowOffset(), c.getRowOffset() + c.getRowLength());
-      R record = getRecordFromCell(c, getSchema(), readerSchema);
+
+    synchronized (this) {

Review comment:
       The concurrent calls happen when TimelineService is in use. 
TimelineService is built on Javalin/Jetty, which has a synchronous 
thread-per-request model, so multiple threads can enter the app's router in 
parallel to handle HTTP requests.
   
   Within org.apache.hudi.timeline.service.FileSystemViewHandler, we handle 
the remote calls with code similar to: 
viewManager.getFileSystemView(basePath).getLatestXXX(...)
   
   When the Metadata Table is enabled, the file system view should be a 
HoodieMetadataFileSystemView, which holds a HoodieTableMetadata as a class 
variable. Hence HoodieTableMetadata.getAllFilesInPartition() will be called 
concurrently on multiple threads, and this function internally calls 
getRecordByKey().
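   
   To illustrate the access pattern described above, here is a minimal, 
self-contained sketch. It does not use Hudi or HBase classes: StatefulReader, 
countMismatches, and the key/value data are hypothetical stand-ins, and the 
thread pool only approximates a thread-per-request server. The assumption is 
that a lookup is a two-step "seek, then read at the cursor" operation on 
shared state, so the two steps are guarded with synchronized (this).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SharedReaderSketch {

  /** Hypothetical stand-in for a reader whose lookups mutate a shared cursor. */
  static class StatefulReader {
    private final TreeMap<String, String> data = new TreeMap<>();
    private String cursorKey; // shared mutable state, analogous to a scanner position

    StatefulReader(Map<String, String> entries) {
      data.putAll(entries);
    }

    /** Seek, then read at the cursor; the two steps must not interleave across threads. */
    String getRecordByKey(String key) {
      synchronized (this) {
        cursorKey = key;            // step 1: seek
        Thread.yield();             // widen the window in which a race could occur
        return data.get(cursorKey); // step 2: read at the cursor
      }
    }
  }

  /** Simulates thread-per-request handling: every request uses the same reader instance. */
  public static int countMismatches(int threads, int requestsPerThread) throws Exception {
    Map<String, String> entries = new TreeMap<>();
    for (int i = 0; i < 100; i++) {
      entries.put("key" + i, "value" + i);
    }
    StatefulReader shared = new StatefulReader(entries);

    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<Integer>> results = new ArrayList<>();
      for (int t = 0; t < threads; t++) {
        final int seed = t;
        Callable<Integer> task = () -> {
          int mismatches = 0;
          for (int r = 0; r < requestsPerThread; r++) {
            int k = (seed * 31 + r) % 100;
            String got = shared.getRecordByKey("key" + k);
            if (!("value" + k).equals(got)) {
              mismatches++; // another thread moved the cursor between seek and read
            }
          }
          return mismatches;
        };
        results.add(pool.submit(task));
      }
      int total = 0;
      for (Future<Integer> f : results) {
        total += f.get();
      }
      return total;
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) throws Exception {
    System.out.println("mismatches: " + countMismatches(8, 10_000));
  }
}
```

   With the synchronized block in place, every lookup returns the value for 
the key it asked for, so the mismatch count is 0. Removing the block lets one 
thread overwrite the cursor between another thread's seek and read, which is 
the kind of interleaving the lock in this sketch is meant to rule out.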




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

