nsivabalan commented on code in PR #8079:
URL: https://github.com/apache/hudi/pull/8079#discussion_r1122010419


##########
hudi-common/src/main/java/org/apache/hudi/common/table/view/AbstractTableFileSystemView.java:
##########
@@ -282,6 +286,20 @@ public void reset() {
     }
   }
 
+  /**
+   * Resets the view states, which can be overridden by subclasses.  This 
reset logic is guarded
+   * by the write lock.
+   * <p>
+   * NOTE: This method SHOULD BE OVERRIDDEN for any custom logic.  DO NOT 
OVERRIDE
+   * {@link AbstractTableFileSystemView#reset} directly, which may cause stale 
file system view
+   * to be served.
+   */
+  protected void runReset() {

Review Comment:
   it is guarded Danny. 
   
   As per master. 
   
   Non metadata : 
   ```
   sync() {
      lock
        runSync {
             clean and init timeline. 
          }
     release lock 
   }
   ```
   w/ Metadata file system view
   
   sync() in MFSV {
     super.sync {
          lock
            runSync {
                 clean and init timeline. 
              }
         release lock 
      }
     tableMetadata.reset();
   }
   
   
   With this patch, here is how it is changing:
   w/o metadata.
   
   ```
   sync() {
      lock
        runSync {
             clean and init timeline. 
          }
     release lock 
   }
   ```
   no changes
   
   
   w/ MFSV
   ```
   sync() in ATFSV (not overridden in MFSV) {
          lock
            runSync { // overridden in MFSV {
                       super.runSync {
                            clean and init timeline. 
                       }
                 tableMetadata.reset();
             }
         release lock 
      }
   }
   ```
   
   
   
   



##########
hudi-timeline-service/src/main/java/org/apache/hudi/timeline/service/RequestHandler.java:
##########
@@ -187,12 +200,20 @@ private boolean syncIfLocalViewBehind(Context ctx) {
       SyncableFileSystemView view = viewManager.getFileSystemView(basePath);
       synchronized (view) {
         if (isLocalViewBehind(ctx)) {
-          HoodieTimeline localTimeline = 
viewManager.getFileSystemView(basePath).getTimeline();
-          LOG.info("Syncing view as client passed last known instant " + 
lastKnownInstantFromClient
-              + " as last known instant but server has the following last 
instant on timeline :"
-              + localTimeline.lastInstant());
-          view.sync();
-          return true;
+          try {

Review Comment:
   nope. L202 is not within synchronized Danny. hence we are introducing a read 
write lock.
   what ethan is doing in this is patch is:
   when thread1 is updating the view (view.sync()), no other threads should be 
able to read isLocalViewBehind(). If there is no other writers (view.sync()), 
multiple readers can read concurrenlty (i.e isLocalViewBehind) 



##########
hudi-timeline-service/src/main/java/org/apache/hudi/timeline/service/RequestHandler.java:
##########
@@ -151,30 +156,38 @@ public void stop() {
    * Determines if local view of table's timeline is behind that of client's 
view.
    */
   private boolean isLocalViewBehind(Context ctx) {
-    String basePath = 
ctx.queryParam(RemoteHoodieTableFileSystemView.BASEPATH_PARAM);
-    String lastKnownInstantFromClient =
-        ctx.queryParamAsClass(RemoteHoodieTableFileSystemView.LAST_INSTANT_TS, 
String.class).getOrDefault(HoodieTimeline.INVALID_INSTANT_TS);
-    String timelineHashFromClient = 
ctx.queryParamAsClass(RemoteHoodieTableFileSystemView.TIMELINE_HASH, 
String.class).getOrDefault("");
-    HoodieTimeline localTimeline =
-        
viewManager.getFileSystemView(basePath).getTimeline().filterCompletedOrMajorOrMinorCompactionInstants();
-    if (LOG.isDebugEnabled()) {
-      LOG.debug("Client [ LastTs=" + lastKnownInstantFromClient + ", 
TimelineHash=" + timelineHashFromClient
-          + "], localTimeline=" + localTimeline.getInstants());
-    }
+    try {
+      // This read lock makes sure that if the local view of the table is 
being synced,
+      // no timeline server requests should be processed or handled until the 
sync process

Review Comment:
   we call this method twice. only one caller is within synchronized



##########
hudi-common/src/main/java/org/apache/hudi/common/table/view/AbstractTableFileSystemView.java:
##########
@@ -282,6 +286,20 @@ public void reset() {
     }
   }
 
+  /**
+   * Resets the view states, which can be overridden by subclasses.  This 
reset logic is guarded
+   * by the write lock.
+   * <p>
+   * NOTE: This method SHOULD BE OVERRIDDEN for any custom logic.  DO NOT 
OVERRIDE
+   * {@link AbstractTableFileSystemView#reset} directly, which may cause stale 
file system view
+   * to be served.
+   */
+  protected void runReset() {

Review Comment:
   but agree w/ naming. we can call runSyncThreadSafely or syncThreadSafely 
instead of runSync. 



##########
hudi-common/src/test/java/org/apache/hudi/metadata/HoodieBackedTestDelayedTableMetadata.java:
##########
@@ -0,0 +1,54 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.hudi.metadata;
+
+import org.apache.hudi.common.config.HoodieMetadataConfig;
+import org.apache.hudi.common.engine.HoodieEngineContext;
+
+import org.apache.log4j.LogManager;
+import org.apache.log4j.Logger;
+
+/**
+ * Table metadata provided by an internal DFS backed Hudi metadata table,
+ * with an intentional delay in `reset()` to test concurrent reads and writes.
+ */
+public class HoodieBackedTestDelayedTableMetadata extends 
HoodieBackedTableMetadata {

Review Comment:
   does this really have to be in a separate file of its own. can we not embed 
within another test class where its used.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to