bharatviswa504 edited a comment on pull request #1659: URL: https://github.com/apache/ozone/pull/1659#issuecomment-740095543
> @bharatviswa504 Thanks for working on this! The changes look good to me.
>
> What is the total retry duration for an OM client request? I mean, what retry policy is used right now? The cache duration should be >= the time for which a request can be retried.

We currently have 15 retries. If the same OM is contacted again after a failure, the wait before the next retry increases with each attempt on that OM. That increase does not apply to the first round of failovers, while the client is still looking for the leader. For any other error, there is no sleep between retries.

So basically, the first 3 retries, which are used to find the leader OM, involve no wait. After the leader has been detected, if the client fails over to a new OM, it does not wait either. So in the best case the total is less than a minute (for any network error we fail over immediately).

If we continuously get LeaderNotReady, we contact the same OM again after the first round, and the total wait is (0+2+4+6+8+ ... +22) = 132 seconds. This is the worst case: the OM takes that long to become ready after failover and only becomes ready in the last iteration.

So, in the worst case, we wait for 132 seconds plus a few extra milliseconds or seconds for the first 3 retries.

Thank you @hanishakoneru for correcting my calculation.
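
The worst-case figure above can be checked with a short calculation. This is only a sketch of the arithmetic in this comment, not Ozone's retry code: the function name and parameters are hypothetical, and the 2-second step is inferred from the series 0+2+4+...+22 rather than read from any configuration.

```python
def worst_case_wait_seconds(max_retries=15, free_retries=3, step_s=2):
    """Total sleep time when every retry after the leader-discovery
    round hits the same OM and gets LeaderNotReady.

    Assumptions (taken from the comment, not from Ozone's source):
    - the first `free_retries` attempts involve no wait;
    - the k-th remaining attempt (k = 0, 1, ...) waits k * step_s seconds.
    """
    waited_retries = max_retries - free_retries
    waits = [k * step_s for k in range(waited_retries)]
    return sum(waits)


# 12 waited retries: 0 + 2 + 4 + ... + 22
print(worst_case_wait_seconds())  # prints 132
```

Under these assumptions, a cache duration of at least 132 seconds plus a small margin for the first 3 retries would cover the longest retry window described here.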
