errose28 commented on code in PR #7083:
URL: https://github.com/apache/ozone/pull/7083#discussion_r1737627332


##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/checksum/ContainerChecksumTreeManager.java:
##########
@@ -133,42 +167,47 @@ private Lock getWriteLock(long containerID) {
     return fileLock.get(containerID).writeLock();
   }
 
-  private ContainerProtos.ContainerChecksumInfo read(ContainerData data) 
throws IOException {
+  private Optional<ContainerProtos.ContainerChecksumInfo.Builder> 
read(ContainerData data) throws IOException {
     long containerID = data.getContainerID();
+    File checksumFile = getContainerChecksumFile(data);
     Lock readLock = getReadLock(containerID);
     readLock.lock();
     try {
-      File checksumFile = getContainerChecksumFile(data);
-      // If the checksum file has not been created yet, return an empty 
instance.
-      // Since all writes happen as part of an atomic read-modify-write cycle 
that requires a write lock, two empty
-      // instances for the same container obtained only under the read lock 
will not conflict.
       if (!checksumFile.exists()) {
-        LOG.debug("No checksum file currently exists for container {} at the 
path {}. Returning an empty instance.",
-            containerID, checksumFile);
-        return ContainerProtos.ContainerChecksumInfo.newBuilder()
-            .setContainerID(containerID)
-            .build();
+        LOG.debug("No checksum file currently exists for container {} at the 
path {}", containerID, checksumFile);
+        return Optional.empty();
       }
       try (FileInputStream inStream = new FileInputStream(checksumFile)) {
         return captureLatencyNs(metrics.getReadContainerMerkleTreeLatencyNS(),
-            () -> ContainerProtos.ContainerChecksumInfo.parseFrom(inStream));
+            () -> 
Optional.of(ContainerProtos.ContainerChecksumInfo.parseFrom(inStream).toBuilder()));
       }
     } catch (IOException ex) {
       metrics.incrementMerkleTreeReadFailures();
       throw new IOException("Error occurred when reading container merkle tree 
for containerID "
-              + data.getContainerID(), ex);
+              + data.getContainerID() + " at path " + checksumFile, ex);
     } finally {
       readLock.unlock();
     }
   }
 
   private void write(ContainerData data, ContainerProtos.ContainerChecksumInfo 
checksumInfo) throws IOException {
+    // Make sure callers filled in required fields before writing.
+    Preconditions.assertTrue(checksumInfo.hasContainerID());
+
+    File checksumFile = getContainerChecksumFile(data);
+    File tmpChecksumFile = getTmpContainerChecksumFile(data);
+
     Lock writeLock = getWriteLock(data.getContainerID());
     writeLock.lock();
-    try (FileOutputStream outStream = new 
FileOutputStream(getContainerChecksumFile(data))) {
-      captureLatencyNs(metrics.getWriteContainerMerkleTreeLatencyNS(),
-          () -> checksumInfo.writeTo(outStream));
+    try (FileOutputStream tmpOutputStream = new 
FileOutputStream(tmpChecksumFile)) {
+      // Write to a tmp file and rename it into place.
+      captureLatencyNs(metrics.getWriteContainerMerkleTreeLatencyNS(), () -> {
+        checksumInfo.writeTo(tmpOutputStream);
+        Files.move(tmpChecksumFile.toPath(), checksumFile.toPath(), 
REPLACE_EXISTING);

Review Comment:
   Ah yes I see that in the Javadoc. However there is also this caveat:
   > If the target file exists then it is implementation specific if the 
existing file is replaced or this method fails by throwing an 
[IOException](https://docs.oracle.com/javase/8/docs/api/java/io/IOException.html).
   
   The tests seem to be working and I would imagine the overwrite would work on 
most systems, but I'm not sure if we need to add a fallback? The existing 
rename usages I found in Ozone like 
[here](https://github.com/apache/ozone/blob/80ffcc0db4848e522a31269c33eac1dee174bfa8/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/TarContainerPacker.java#L102)
 and 
[here](https://github.com/apache/ozone/blob/73e4e5b3bc9a9c45bafb63ff9aae6209c3120624/hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/security/RootCARotationHandlerImpl.java#L129)
 are actually handling potential conflicts with the target first.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to