sadanand48 commented on code in PR #10479:
URL: https://github.com/apache/ozone/pull/10479#discussion_r3440776671


##########
hadoop-hdds/client/src/main/java/org/apache/hadoop/hdds/scm/storage/StreamBlockInputStream.java:
##########
@@ -327,10 +338,14 @@ synchronized void readBlockImpl(long length) throws IOException {
   }
 
   private void handleExceptions(IOException cause) throws IOException {
-    if (cause instanceof StorageContainerException || isConnectivityIssue(cause)) {
-      if (shouldRetryRead(cause, retryPolicy, retries++)) {
+    IOException root = unwrapCause(cause);
+    if (root instanceof StorageContainerException || isConnectivityIssue(root) ||
+         root instanceof TimeoutIOException) {
+      if (shouldRetryRead(root, retryPolicy, retries++)) {
+        recordFailedStreamingDatanode();

Review Comment:
   initStreamRead() binds one DN for a long-lived gRPC stream, and mid-read 
failures (e.g. TimeoutIOException when the active DN is stopped) are handled by 
closing the stream and re-initializing.
   
   Without an exclusion, initStreamRead() can select the same DN again on 
re-init. The excluded set records the DN that was actively serving the stream 
when the error happened, so the retry opens a new stream on a different replica 
and resumes from requestedLength = position.
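
   The exclusion idea can be sketched roughly as below. This is a minimal, standalone illustration and not the code in this PR: the class `ExcludingReplicaPicker`, its methods, and the string datanode IDs are all hypothetical, and the fallback of clearing the excluded set once every replica has failed is an assumption made only for this sketch.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of "exclude the failed DN on re-init"; not Ozone code. */
public class ExcludingReplicaPicker {
  // Replica datanode IDs in preference order (plain strings for illustration).
  private final List<String> replicas;
  // Datanodes that were serving a stream when a read failed.
  private final Set<String> excluded = new LinkedHashSet<>();
  // Datanode bound to the current stream, or null if no stream is open.
  private String active;

  public ExcludingReplicaPicker(List<String> replicas) {
    if (replicas.isEmpty()) {
      throw new IllegalArgumentException("at least one replica is required");
    }
    this.replicas = new ArrayList<>(replicas);
  }

  /** Binds the stream to the first replica that is not excluded. */
  public String initStream() {
    for (String dn : replicas) {
      if (!excluded.contains(dn)) {
        active = dn;
        return dn;
      }
    }
    // Assumption for this sketch: once every replica has failed,
    // forget the exclusions and start over from the first replica.
    excluded.clear();
    active = replicas.get(0);
    return active;
  }

  /** Records the datanode that was actively serving when the error happened. */
  public void recordFailedActive() {
    if (active != null) {
      excluded.add(active);
      active = null;
    }
  }

  public static void main(String[] args) {
    ExcludingReplicaPicker picker =
        new ExcludingReplicaPicker(List.of("dn1", "dn2", "dn3"));
    System.out.println("first bind: " + picker.initStream());
    picker.recordFailedActive();   // e.g. a timeout on the active DN
    System.out.println("after retry: " + picker.initStream());
  }
}
```

   In this sketch the failed DN is recorded before re-initializing, so the second `initStream()` call skips it. If nothing were recorded, the same loop would return the first replica again, which is the case the excluded set is meant to avoid.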



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

