ptlrs commented on code in PR #9947:
URL: https://github.com/apache/ozone/pull/9947#discussion_r2961969674


##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/volume/HddsVolume.java:
##########
@@ -326,17 +327,30 @@ public VolumeCheckResult checkDbHealth(File dbFile) 
throws InterruptedException
       return VolumeCheckResult.HEALTHY;
     }
 
+    // We attempt to open RocksDb twice to ignore any transient errors
+    // and to confirm that we actually cannot open RocksDb in readonly mode.
     final boolean isVolumeTestResultHealthy = true;
-    try (ManagedOptions managedOptions = new ManagedOptions();
-         ManagedRocksDB ignored = ManagedRocksDB.openReadOnly(managedOptions, 
dbFile.toString())) {
-      volumeTestResultQueue.add(isVolumeTestResultHealthy);
-    } catch (Exception e) {
-      if (Thread.currentThread().isInterrupted()) {
-        throw new InterruptedException("Check of database for volume " + this 
+ " interrupted.");
+    final int maxAttempts = 2;

Review Comment:
   My thought process for not making this configurable was to avoid problematic 
configurations where the total attempts could exceed the disk check timeout. 
   
   For example if maxAttempts is 10 and you sleep for 1 minute in each attempt, 
firstly this is no longer about transient errors and the total time of 10 
minutes will make the method flaky and fail the disk check timeout of 10 
minutes. 
   
   I can change it if we think we may still want that control in production.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to