danny0405 commented on code in PR #12982:
URL: https://github.com/apache/hudi/pull/12982#discussion_r2027975386


##########
hudi-common/src/main/java/org/apache/hudi/common/table/view/AbstractTableFileSystemView.java:
##########
@@ -177,31 +178,39 @@ public List<HoodieFileGroup> addFilesToView(List<StoragePathInfo> statuses) {
    * Adds the provided statuses into the file system view for a single partition, and also caches it inside this object.
    */
   public List<HoodieFileGroup> addFilesToView(String partitionPath, List<StoragePathInfo> statuses) {
-    HoodieTimer timer = HoodieTimer.start();
-    List<HoodieFileGroup> fileGroups = buildFileGroups(partitionPath, statuses, visibleCommitsAndCompactionTimeline, true);
-    long fgBuildTimeTakenMs = timer.endTimer();
-    timer.startTimer();
-    // Group by partition for efficient updates for both InMemory and DiskBased structures.
-    fileGroups.stream().collect(Collectors.groupingBy(HoodieFileGroup::getPartitionPath))
-        .forEach((partition, value) -> {
-          if (!isPartitionAvailableInStore(partition)) {
-            if (bootstrapIndex.useIndex()) {
-              try (BootstrapIndex.IndexReader reader = bootstrapIndex.createReader()) {
-                LOG.info("Bootstrap Index available for partition {}", partition);
-                List<BootstrapFileMapping> sourceFileMappings =
-                    reader.getSourceFileMappingForPartition(partition);
-                addBootstrapBaseFileMapping(sourceFileMappings.stream()
-                    .map(s -> new BootstrapBaseFileMapping(new HoodieFileGroupId(s.getPartitionPath(),
-                        s.getFileId()), s.getBootstrapFileStatus())));
+    try {
+      writeLock.lock();

Review Comment:
   @codope @the-other-tim-brown
   
   Effects of this patch:
   
   1. This patch adds a write lock to each of the read APIs, which blocks the
   #sync of the whole view. If we assume read operations are more frequent
   than writes, then #sync (which most of the query APIs trigger at the very
   start) and the query requests would block more frequently.
   2. The patch also takes the write lock and then the read lock in each read
   API. This does not really work, because another thread can take the write
   lock between the two:
   
   ```diff
     // The sync() from thread2 could clear the state updated by thread1.
     thread1: --- wl-start --- update state --- wl-end --- rl-start -------------------- query ------- rl-end -
     thread2: -------------------------------------------------- wl-start --- sync() ------- wl-end --------
   ```
   3. Ensuring the atomicity of the underlying map alone is not enough; we need
   to ensure the integrity of each read API as a whole, as described in point 2.
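
The interleaving in point 2 can be reproduced with a minimal sketch, assuming the view is guarded by a `java.util.concurrent.locks.ReentrantReadWriteLock`. `FileViewSketch`, `racyRead`, `downgradingRead`, and the `inGap` hook are hypothetical names for illustration only, not Hudi's actual classes or the code in this PR. The second method uses lock downgrading (acquiring the read lock before releasing the write lock), which is one possible way to close the gap. It does not address the contention concern in point 1, since every read still takes the write lock first.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for a cached file system view; not Hudi's class. */
final class FileViewSketch {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final Map<String, String> partitionToFileGroups = new HashMap<>();

  /** Plays the role of #sync: drops all cached state under the write lock. */
  void sync() {
    lock.writeLock().lock();
    try {
      partitionToFileGroups.clear();
    } finally {
      lock.writeLock().unlock();
    }
  }

  /** Write lock, release, then read lock: a sync() can slip into the gap. */
  String racyRead(String partition, Runnable inGap) {
    lock.writeLock().lock();
    try {
      partitionToFileGroups.putIfAbsent(partition, "fileGroups-of-" + partition);
    } finally {
      lock.writeLock().unlock();
    }
    inGap.run(); // simulates another thread running between the two locks
    lock.readLock().lock();
    try {
      return partitionToFileGroups.get(partition); // null if sync() ran in the gap
    } finally {
      lock.readLock().unlock();
    }
  }

  /** Lock downgrading: take the read lock before releasing the write lock. */
  String downgradingRead(String partition) {
    lock.writeLock().lock();
    try {
      partitionToFileGroups.putIfAbsent(partition, "fileGroups-of-" + partition);
      lock.readLock().lock(); // allowed while this thread holds the write lock
    } finally {
      lock.writeLock().unlock(); // read lock is still held, so there is no gap
    }
    try {
      return partitionToFileGroups.get(partition);
    } finally {
      lock.readLock().unlock();
    }
  }

  public static void main(String[] args) {
    FileViewSketch view = new FileViewSketch();
    // Run sync() in the gap to mimic thread2 from the timeline above.
    System.out.println("racy read:        " + view.racyRead("p1", view::sync));
    System.out.println("downgrading read: " + view.downgradingRead("p1"));
  }
}
```

In `racyRead`, the state loaded under the write lock is gone by the time the read lock is acquired, which is the integrity problem described in point 3: each map operation is atomic, but the read API as a whole is not.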



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
