fapifta commented on a change in pull request #2767:
URL: https://github.com/apache/ozone/pull/2767#discussion_r741971589
##########
File path: hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/ECKeyOutputStream.java
##########
@@ -239,7 +293,15 @@ private void handleParityWrites(int parityCellSize,
// TODO: we should alter the put block calls to share CRC to each stream.
ECBlockOutputStreamEntry streamEntry =
blockOutputStreamEntryPool.getCurrentStreamEntry();
+ // Since writes are async, let's check the failures once.
+ if (streamEntry.checkStreamFailures()) {
Review comment:
Yes, that's what I meant as well, but I realize I forgot to make my actual
point here :D I wanted to suggest noting this behaviour either in a comment
above the if, or in the checkStreamFailures API doc ;)
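Something along these lines on the method could capture it (just a sketch of
the wording, please adjust it to the actual semantics):
```java
/**
 * Chunk writes are issued asynchronously, so a failed write may only
 * become visible after the write() call itself has returned. This
 * check is where such deferred failures surface: call it once per
 * stripe, after the parity writes, before treating the stripe as
 * persisted.
 *
 * @return true if any of the underlying block output streams failed.
 */
```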
##########
File path: hadoop-ozone/client/src/test/java/org/apache/hadoop/ozone/client/TestOzoneECClient.java
##########
@@ -420,6 +420,80 @@ public void testPartialStripeWithPartialLastChunk()
}
}
+ @Test
+ public void testWriteShouldFailIfMoreThanParityNodesFail()
+ throws IOException {
+ testNodeFailuresWhileWriting(3, 3);
+ }
+
+ @Test
+ public void testWriteShouldSucceedIfLessThanParityNodesFail()
+ throws IOException {
+ testNodeFailuresWhileWriting(1, 2);
+ }
+
+ @Test
+ public void testWriteShouldSucceedIf4NodesFailed()
+ throws IOException {
+ testNodeFailuresWhileWriting(4, 1);
+ }
+
+ @Test
+ public void testWriteShouldSucceedIfAllNodesFailed()
+ throws IOException {
+ testNodeFailuresWhileWriting(4, 1);
+ }
+
+ public void testNodeFailuresWhileWriting(int numFailureToInject,
+ int numChunksToWriteAfterFailure) throws IOException {
+ store.createVolume(volumeName);
+ OzoneVolume volume = store.getVolume(volumeName);
+ volume.createBucket(bucketName);
+ OzoneBucket bucket = volume.getBucket(bucketName);
+
+ try (OzoneOutputStream out = bucket.createKey(keyName, 1024 * 3,
+ new ECReplicationConfig(3, 2, ECReplicationConfig.EcCodec.RS,
+ chunkSize), new HashMap<>())) {
+ for (int i = 0; i < dataBlocks; i++) {
+ out.write(inputChunks[i]);
+ }
+
+ List<DatanodeDetails> failedDNs = new ArrayList<>();
+ Map<DatanodeDetails, MockDatanodeStorage> storages =
+ ((MockXceiverClientFactory) factoryStub).getStorages();
+ DatanodeDetails[] dnDetails =
+ storages.keySet().toArray(new DatanodeDetails[storages.size()]);
+ for (int i = 0; i < numFailureToInject; i++) {
+ failedDNs.add(dnDetails[i]);
+ }
+
+ // First, let's mark these storages as failed
+ ((MockXceiverClientFactory) factoryStub).setFailedStorages(failedDNs);
+
+ for (int i = 0; i < numChunksToWriteAfterFailure; i++) {
+ out.write(inputChunks[i]);
+ }
+ }
+ final OzoneKeyDetails key = bucket.getKey(keyName);
+ Assert.assertEquals(2, key.getOzoneKeyLocations().size());
Review comment:
Understood. Won't this be a problem though, if we write more chunks via this
general-purpose method, so that the data written already spans multiple
block groups, while the injected failures add even more block groups on top
of that?
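If we want to keep the helper general-purpose, one option might be to let
each test state its own expectation instead of hardcoding 2 here (rough
sketch only, to illustrate the idea):
```java
// Sketch: the individual tests pass in the expected number of key
// locations, since larger writes and more injected failures can both
// result in additional block groups.
private void testNodeFailuresWhileWriting(int numFailureToInject,
    int numChunksToWriteAfterFailure, int expectedKeyLocations)
    throws IOException {
  // ... same setup, writes and failure injection as above ...
  final OzoneKeyDetails key = bucket.getKey(keyName);
  Assert.assertEquals(expectedKeyLocations,
      key.getOzoneKeyLocations().size());
}
```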