Hi Camille,

Thanks for the reply. Could you please go through the following cases:

Case-1:-
Say I have 5 servers (zk1, zk2, zk3, zk4, zk5) and have configured sessionTimeOut=60secs.
readTimeOut = 60 * 2 / 3 = 40secs
connectionTimeOut = 60 / servers.length = 60 / 5 = 12secs

step1: The client has established a connection with zk1.
step2: Shut down zk1 and zk2. Since readTimeOut is 40s, it takes 40s before the 
first retry to the next server.
step3: The client retries to zk2, which takes at most 12s (connectionTimeOut). 
Now 52secs of the session have elapsed, and only 8secs are left before session 
expiration.

Retry intervals are as follows >> 40s, 12s, 8s
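
To make the Case-1 arithmetic easy to check, here is a rough sketch of the worst case. It is my own simplified model, not the actual ClientCnxn code: the class and method names (RetryBudget, intervals) are hypothetical, and it assumes the first wait is one full readTimeOut and every later attempt burns a full connectionTimeOut until the session budget runs out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of ZooKeeper: models the worst-case
// retry intervals (in seconds) within one session timeout.
class RetryBudget {

    // readDivisor is the '3' in readTimeOut = sessionTimeOut * 2 / 3.
    static List<Integer> intervals(int sessionTimeout, int servers, int readDivisor) {
        int readTimeout = sessionTimeout * 2 / readDivisor;
        int connectTimeout = sessionTimeout / servers;

        List<Integer> out = new ArrayList<>();
        int remaining = sessionTimeout;

        // First interval: waiting for the dead connected server to time out.
        int first = Math.min(readTimeout, remaining);
        out.add(first);
        remaining -= first;

        // Later intervals: one connection attempt per server, each capped
        // by whatever is left of the session.
        while (remaining > 0 && connectTimeout > 0) {
            int step = Math.min(connectTimeout, remaining);
            out.add(step);
            remaining -= step;
        }
        return out;
    }

    public static void main(String[] args) {
        // Case-1: 5 servers, sessionTimeOut = 60secs, divisor 3
        System.out.println(intervals(60, 5, 3)); // prints [40, 12, 8]
    }
}
```

Under these assumptions the last attempt gets only 8s, which is less than one full connectionTimeOut.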

Case-2:-
Also consider the 'R-O server' feature: 5 servers are started with R-O mode 
enabled and sessionTimeOut=60secs is configured.
step1: The client has established a connection with zk1.
step2: Shut down zk1, zk2 and zk3.
step3: The client spends 40s on readTimeOut, then 12s, then 8s, and the client 
session expires before the next retry, even though zk4 and zk5 are running in 
R-O mode and would be able to retain the client session.

If we take 'servers.length' into account, can we improve the retries, or shall 
we think of a better formula?
(Note:- Even after considering servers.length, I feel there is still a small 
gap: the fifth server is not retried.)

readTimeOut = 60 * 2 / 5 (servers.length) = 24secs
Retry intervals are as follows >> 24s, 12s, 12s, 12s

IMO, 'readTimeOut' is presently not calculated based on the quorum strength, 
but it would be good to have a shorter timeout for fairer retries.
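
As a rough comparison of the two formulas, the sketch below counts how many servers get a full connectionTimeOut attempt before the session expires. Again this is only my simplified model and not ZooKeeper code: RetryCoverage and serversReached are hypothetical names, the initially connected server is counted as the first one reached, and every attempt against a dead server is assumed to take the full connectionTimeOut.

```java
// Hypothetical helper, not part of ZooKeeper: counts how many servers the
// client can fully try within one session timeout in the worst case.
class RetryCoverage {

    // readDivisor is 3 today; the proposal is to use servers.length instead.
    static int serversReached(int sessionTimeout, int servers, int readDivisor) {
        int readTimeout = sessionTimeout * 2 / readDivisor;
        int connectTimeout = sessionTimeout / servers;
        if (connectTimeout <= 0) {
            return 1; // only the initially connected server
        }
        // Time left after the first read timeout on the connected server.
        int remaining = Math.max(0, sessionTimeout - readTimeout);
        // Each further server needs one full connectionTimeOut.
        int fullAttempts = remaining / connectTimeout;
        return Math.min(servers, 1 + fullAttempts);
    }

    public static void main(String[] args) {
        System.out.println(serversReached(60, 5, 3));  // prints 2
        System.out.println(serversReached(60, 5, 5));  // prints 4
        System.out.println(serversReached(120, 5, 3)); // prints 2
        System.out.println(serversReached(120, 5, 5)); // prints 4
    }
}
```

In this model the divisor of 3 reaches 2 of the 5 servers with a full attempt, while servers.length reaches 4, which is the small gap I mentioned for the fifth server.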

Thanks,
Rakesh


________________________________________
From: Camille Fournier [[email protected]]
Sent: Tuesday, January 03, 2012 5:51 AM
To: [email protected]
Subject: Re: zookeeper client retry logic..

It's an interesting idea... can you explain more why you think it
would be good to have a shorter timeout in the case of a longer list
of servers?

Thanks,
C

On Mon, Jan 2, 2012 at 2:08 AM, Rakesh R <[email protected]> wrote:
> Hi everyone,
>
> In ClientCnxn, 'readTimeOut' is calculated as follows:
>
>    readTimeOut = sessionTimeOut * 2 / 3; // here it is not considering the 
> server list. If the server list grows beyond 3, it will not give a fair 
> chance to retry all the servers (in the worst case).
>
> Can we think of changing the 'readTimeOut' logic to use 
> serverslist.length instead of the constant/magic number '3'?
>
> For example:-
>
> I have 5 servers and client sessionTimeOut=120secs
>
> readTimeOut = 120 * 2 / 3 = 80secs
>
> In this case, it takes 80secs for the first timeout if the connected 
> server is not responding. This is a long time; if we consider the server 
> list, it can retry to the next server sooner, in <50secs.
>
> Thanks & Regards,
>
> Rakesh
