forgot to copy list.

Honza,

lgtm.

Regards
-steve

On Thu, Jan 15, 2015 at 5:20 AM, Steven Dake <[email protected]> wrote:

> Honza,
>
> lgtm.
>
> regards
> -steve
>
> On Wed, Jan 14, 2015 at 10:19 AM, Jan Friesse <[email protected]> wrote:
>
>> Jason,
>> the patch looks good. This touches a very delicate part of the
>> protocol, so I would really like to see another reviewer's comments
>> as well. Chrissie, Steve?
>>
>> Regards,
>> Honza
>>
>>
>> jason wrote:
>> > In active rrp mode, commit tokens are treated as mcast data messages,
>> > so rrp delivers them directly to the srp layer via active_mcast_recv().
>> > As a result, srp receives duplicated commit tokens from the different
>> > heartbeat links. If the node is in recovery state and has already
>> > sent out the initial orf token, those duplicated commit tokens
>> > cause message_handler_memb_commit_token() to send the initial orf
>> > token again! This is wrong because it resets the orf token content in
>> > instance->orf_token_retransmit, which breaks the token retransmission
>> > state.
>> >
>> > Furthermore, sending those initial orf tokens again and again may
>> > cause active_token_recv() to drop some subsequent orf tokens. That is
>> > OK for rrp because srp will do token retransmission, but as said
>> > above, the srp retransmission state has already been broken, so we
>> > finally hit a "token lost in recovery state" condition caused by
>> > software. If the token timeout value is large, it will take a long
>> > time to create a new ring.
>> >
>> > This can be reproduced by having two nodes set to active rrp mode,
>> > with two heartbeat links. Then, with one node always on, let the other
>> > one stop/start again and again. It has a low probability of
>> > reproducing. In theory, I think, the more heartbeat links are used,
>> > the more easily it can be reproduced.
>> >
>> > This problem can be resolved by letting
>> > message_handler_memb_commit_token() ignore duplicated commit tokens
>> > in recovery state if the node (the ring representative) has already
>> > sent out the initial orf token.
>> >
>> > Unlike the previous take, this version does not depend on stored token
>> > data but uses originated_orf_token in totemsrp_instance to remember
>> > whether the initial orf token has already been originated for the
>> > current membership.
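For anyone following the thread without the patch at hand, the guard jason describes can be sketched roughly as below. This is a minimal illustration only, not the patch itself: the sketch_* types and functions are hypothetical stand-ins for totemsrp_instance and message_handler_memb_commit_token(), and clearing the flag on entry to recovery is an assumption about where "current membership" begins. Only the originated_orf_token name comes from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; NOT the real corosync types. */
enum sketch_memb_state {
    SKETCH_STATE_OPERATIONAL,
    SKETCH_STATE_RECOVERY
};

struct sketch_instance {
    enum sketch_memb_state memb_state;
    bool is_ring_rep;              /* this node is the ring representative */
    bool originated_orf_token;     /* flag named in the patch description */
    unsigned int initial_orf_sent; /* counter kept for illustration only */
};

/* Assumption: the flag is cleared each time a new membership is formed,
 * modelled here as the moment the node enters recovery state. */
static void sketch_enter_recovery(struct sketch_instance *inst, bool is_ring_rep)
{
    assert(inst != NULL);
    inst->memb_state = SKETCH_STATE_RECOVERY;
    inst->is_ring_rep = is_ring_rep;
    inst->originated_orf_token = false;
}

/* Stand-in for sending the initial orf token. In the real code this is
 * the step that overwrites the stored retransmit copy of the token. */
static void sketch_send_initial_orf_token(struct sketch_instance *inst)
{
    inst->initial_orf_sent++;
}

/* Handle one commit token. Returns true only if this call originated
 * the initial orf token; duplicates from other links return false. */
static bool sketch_on_commit_token(struct sketch_instance *inst)
{
    assert(inst != NULL);

    if (inst->memb_state != SKETCH_STATE_RECOVERY || !inst->is_ring_rep) {
        return false;
    }
    if (inst->originated_orf_token) {
        /* Duplicate commit token (e.g. from a second heartbeat link):
         * ignore it so the retransmission state is left untouched. */
        return false;
    }
    inst->originated_orf_token = true;
    sketch_send_initial_orf_token(inst);
    return true;
}
```

With two links, the ring representative would see sketch_on_commit_token() called twice per membership; only the first call originates a token, and the next sketch_enter_recovery() re-arms the guard.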
_______________________________________________
discuss mailing list
[email protected]
http://lists.corosync.org/mailman/listinfo/discuss
