the-other-tim-brown commented on code in PR #12982:
URL: https://github.com/apache/hudi/pull/12982#discussion_r2027280447


##########
hudi-common/src/main/java/org/apache/hudi/common/table/view/AbstractTableFileSystemView.java:
##########
@@ -177,31 +178,39 @@ public List<HoodieFileGroup> addFilesToView(List<StoragePathInfo> statuses) {
    * Adds the provided statuses into the file system view for a single partition, and also caches it inside this object.
    */
   public List<HoodieFileGroup> addFilesToView(String partitionPath, List<StoragePathInfo> statuses) {
-    HoodieTimer timer = HoodieTimer.start();
-    List<HoodieFileGroup> fileGroups = buildFileGroups(partitionPath, statuses, visibleCommitsAndCompactionTimeline, true);
-    long fgBuildTimeTakenMs = timer.endTimer();
-    timer.startTimer();
-    // Group by partition for efficient updates for both InMemory and DiskBased structures.
-    fileGroups.stream().collect(Collectors.groupingBy(HoodieFileGroup::getPartitionPath))
-        .forEach((partition, value) -> {
-          if (!isPartitionAvailableInStore(partition)) {
-            if (bootstrapIndex.useIndex()) {
-              try (BootstrapIndex.IndexReader reader = bootstrapIndex.createReader()) {
-                LOG.info("Bootstrap Index available for partition {}", partition);
-                List<BootstrapFileMapping> sourceFileMappings =
-                    reader.getSourceFileMappingForPartition(partition);
-                addBootstrapBaseFileMapping(sourceFileMappings.stream()
-                    .map(s -> new BootstrapBaseFileMapping(new HoodieFileGroupId(s.getPartitionPath(),
-                        s.getFileId()), s.getBootstrapFileStatus())));
+    try {
+      writeLock.lock();

Review Comment:
   Today we take the read lock, not the write lock, when writing, so this is 
new. For the in-memory-only version the backing map is a `ConcurrentHashMap`, 
so it ends up working safely, but the version backed by the spillable map 
will have issues.
   
   @codope the other option is to require that thread safety come from the map 
implementations, as you suggest, so that we can remove the read and write 
locks entirely. The `RocksDbBasedFileSystemView` is backed by a `RocksDB` 
instance, whose documentation notes: `It is safe for concurrent access from 
multiple threads without any external synchronization`.
   
   It really just depends on the approach the team wants. I will put up a 
separate branch so we can more easily compare the approaches.
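
To make the two options concrete, here is a minimal sketch of each. This is not Hudi code: `ViewLockingSketch`, `View`, `LockedView`, `MapSafeView`, and `stress` are hypothetical names, and plain strings stand in for file groups. It only illustrates where the thread safety lives in each approach, assuming every write is a per-partition append.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class ViewLockingSketch {

  /** Hypothetical stand-in for the file system view API. */
  public interface View {
    void addFiles(String partition, List<String> files);
    int fileCount(String partition);
  }

  /** Option 1: the view owns the locking, so the backing map need not be thread-safe. */
  public static class LockedView implements View {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final Map<String, List<String>> partitionToFiles = new HashMap<>();

    @Override
    public void addFiles(String partition, List<String> files) {
      // Writers take the write lock; acquire it before the try block.
      lock.writeLock().lock();
      try {
        partitionToFiles.computeIfAbsent(partition, p -> new ArrayList<>()).addAll(files);
      } finally {
        lock.writeLock().unlock();
      }
    }

    @Override
    public int fileCount(String partition) {
      // Readers share the read lock and never observe a half-applied write.
      lock.readLock().lock();
      try {
        List<String> files = partitionToFiles.get(partition);
        return files == null ? 0 : files.size();
      } finally {
        lock.readLock().unlock();
      }
    }
  }

  /** Option 2: no view-level lock; thread safety comes from the map implementation. */
  public static class MapSafeView implements View {
    private final Map<String, List<String>> partitionToFiles = new ConcurrentHashMap<>();

    @Override
    public void addFiles(String partition, List<String> files) {
      // ConcurrentHashMap.compute is atomic per key; the list is copied, never mutated in place.
      partitionToFiles.compute(partition, (p, existing) -> {
        List<String> merged = existing == null ? new ArrayList<>() : new ArrayList<>(existing);
        merged.addAll(files);
        return merged;
      });
    }

    @Override
    public int fileCount(String partition) {
      List<String> files = partitionToFiles.get(partition);
      return files == null ? 0 : files.size();
    }
  }

  /** Appends from several threads to one partition and returns the final file count. */
  public static int stress(View view, int threads, int perThread) {
    List<Thread> workers = new ArrayList<>();
    for (int t = 0; t < threads; t++) {
      final int id = t;
      Thread worker = new Thread(() -> {
        for (int i = 0; i < perThread; i++) {
          view.addFiles("p1", Collections.singletonList("file-" + id + "-" + i));
        }
      });
      workers.add(worker);
      worker.start();
    }
    try {
      for (Thread worker : workers) {
        worker.join();
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new RuntimeException(e);
    }
    return view.fileCount("p1");
  }

  public static void main(String[] args) {
    System.out.println("locked view:   " + stress(new LockedView(), 4, 250));
    System.out.println("map-safe view: " + stress(new MapSafeView(), 4, 250));
  }
}
```

In the first variant, correctness depends on every writer actually taking the write lock. In the second, it depends on every backing map (in-memory, spillable, RocksDB) providing its own guarantees.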



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

