Gargi-jais11 commented on code in PR #8932:
URL: https://github.com/apache/ozone/pull/8932#discussion_r2306123820


##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueContainer.java:
##########
@@ -897,6 +901,84 @@ public DataScanResult scanData(DataTransferThrottler 
throttler, Canceler cancele
     return checker.fullCheck(throttler, canceler);
   }
 
+  @Override
+  public void copyContainerDirectory(Path destination) throws IOException {
+    readLock();
+    try {
+      // Closed/ Quasi closed containers are considered for replication by
+      // replication manager if they are under-replicated.
+      ContainerProtos.ContainerDataProto.State state =
+          getContainerData().getState();
+      if (!(state == ContainerProtos.ContainerDataProto.State.CLOSED ||
+          state == ContainerProtos.ContainerDataProto.State.QUASI_CLOSED)) {
+        throw new IllegalStateException(
+            "Only (quasi)closed containers can be exported, but " +
+                "ContainerId=" + getContainerData().getContainerID() +
+                " is in state " + state);
+      }
+
+      if (!containerData.getSchemaVersion().equals(OzoneConsts.SCHEMA_V3)) {
+        compactDB();
+        // Close DB (and remove from cache) to avoid concurrent modification
+        // while copying it.
+        BlockUtils.removeDB(containerData, config);
+      }
+
+      if (containerData.getSchemaVersion().equals(OzoneConsts.SCHEMA_V3)) {
+        // Synchronize the dump and copy operation,
+        // so concurrent copy don't get dump files overwritten.
+        // We seldom got concurrent exports for a container,
+        // so it should not influence performance much.
+        synchronized (dumpLock) {
+          BlockUtils.dumpKVContainerDataToFiles(containerData, config);
+          copyContainerToDestination(destination);
+        }
+      } else {
+        copyContainerToDestination(destination);

Review Comment:
   In this method, the goal is to copy the container's files to destination 
volume. This is a **read-only** operation from the perspective of the 
container's data and state. The code needs to guarantee that another thread 
doesn't, for example, start deleting the container while it's being copied.
   Acquiring a **readLock()** perfectly achieves this. It prevents any other 
thread from getting a **writeLock()** to modify the container, while still 
allowing other threads to perform concurrent read operations if needed. 
   Using a **writeLock()** here would be too restrictive and would 
unnecessarily block other threads from reading the container.
   
   > if multiple thread does same thing.
   
   
https://github.com/apache/ozone/blob/00ca998a966d42487f816393a59db6d20a7ac387/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueContainer.java#L932-L938
   The **synchronized** block ensures that even if two threads enter the method 
concurrently, the process of dumping the RocksDB data to temporary files 
(`dumpKVContainerDataToFiles`) is serialized. This prevents the two threads 
from overwriting each other's temporary dump files, which would lead to 
corruption.
   
   Once the dump is complete and the files are copied, the lock is released, 
and the next thread can proceed safely.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@ozone.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@ozone.apache.org
For additional commands, e-mail: issues-h...@ozone.apache.org

Reply via email to