[ 
https://issues.apache.org/jira/browse/HDDS-9551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17791745#comment-17791745
 ] 

Dave Teng edited comment on HDDS-9551 at 11/30/23 7:15 PM:
-----------------------------------------------------------

Hm, one thing I'd like to mention is that the error handling on the client side 
is more complicated than just "DN fails, then the client's retry count +1".

The retry count is increased by 1 only in one specific case: when the 
client-side write fails but the server-side exception is not processed by 
handleException 
[https://github.com/apache/ozone/blob/master/hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/KeyOutputStream.java#L306]
(otherwise, after each handleException, the retry count is reset to 0 
[https://github.com/apache/ozone/blob/master/hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/KeyOutputStream.java#L378]).

This specific case seems quite tricky, and I haven't fully understood why it 
was designed like this.
But as far as I know, as long as a server-side error is processed by 
handleException, the retry count does not increase. So far I haven't managed 
to reproduce a scenario where a server-side (OM/SCM) exception is not handled 
by handleException on the client side. Basically, the client-side retry count 
doesn't increase when SCM simply throws an error, since handleException seems 
to catch most errors. The remaining errors, the ones that would change the 
retry count, I could only reproduce by throwing an exception at specific 
places in the client-side code.
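
To make the behaviour described above concrete, here is a minimal sketch of the 
counting rule as I understand it. It is not the actual KeyOutputStream code: the 
class RetryCountSketch, the WriteAttempt interface, the HandledServerException 
marker and the maxRetries limit are all hypothetical names made up for 
illustration, and the recovery work that a handleException-style method would do 
is left out. A successful write is not modelled either.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical sketch of the counting rule only; NOT the real KeyOutputStream.
class RetryCountSketch {

    /** Stand-in for one write attempt that may fail. */
    interface WriteAttempt {
        void run() throws IOException;
    }

    /** Hypothetical marker for failures that a handleException-like path processes. */
    static class HandledServerException extends IOException {
        HandledServerException(String message) {
            super(message);
        }
    }

    private final int maxRetries; // assumed retry budget
    private int retryCount = 0;

    RetryCountSketch(int maxRetries) {
        this.maxRetries = maxRetries;
    }

    int getRetryCount() {
        return retryCount;
    }

    void write(WriteAttempt attempt) {
        try {
            attempt.run();
        } catch (HandledServerException e) {
            // "Handled" path: recovery is omitted here; the counter goes back to 0.
            retryCount = 0;
        } catch (IOException e) {
            // "Unhandled" path: the failure counts against the retry budget.
            if (retryCount >= maxRetries) {
                throw new UncheckedIOException("retry budget exhausted", e);
            }
            retryCount++;
        }
    }

    public static void main(String[] args) {
        RetryCountSketch sketch = new RetryCountSketch(3);
        sketch.write(() -> { throw new IOException("client-side failure"); });
        System.out.println("after unhandled failure: " + sketch.getRetryCount());
        sketch.write(() -> { throw new HandledServerException("server-side failure"); });
        System.out.println("after handled failure: " + sketch.getRetryCount());
    }
}
```

Under these assumptions, only failures that bypass the handled path ever move the 
counter, which would match why a plain SCM error does not change it.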


> Allow the client write to fall back to nodes in the exclude list if that is 
> all that is available
> -------------------------------------------------------------------------------------------------
>
>                 Key: HDDS-9551
>                 URL: https://issues.apache.org/jira/browse/HDDS-9551
>             Project: Apache Ozone
>          Issue Type: Task
>          Components: Ozone Client
>            Reporter: Dave Teng
>            Assignee: Dave Teng
>            Priority: Major
>              Labels: pull-request-available
>
> Allow the client write to fall back to nodes in the exclude list if that is 
> all that is available



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
