Hello Jan,

Thanks for the explanation, but I saw this in my log:
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
corosync [TOTEM ] Process pause detected for 577 ms, flushing membership messages.
corosync [TOTEM ] Process pause detected for 538 ms, flushing membership messages.
corosync [TOTEM ] A processor failed, forming new configuration.
corosync [CLM ] CLM CONFIGURATION CHANGE
corosync [CLM ] New Configuration:
corosync [CLM ]         r(0) ip(10.xxx.xxx.xxx)
corosync [CLM ] Members Left:
corosync [CLM ]         r(0) ip(10.xxx.xxx.xxx)
corosync [CLM ] Members Joined:
corosync [pcmk ] notice: pcmk_peer_update: Transitional membership event on ring 6904: memb=1, new=0, lost=1
corosync [pcmk ] info: pcmk_peer_update: memb: node01 891257354
corosync [pcmk ] info: pcmk_peer_update: lost: node02 874480
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

When this happens, does corosync need to retransmit the token? From what I understood, the token needs to be retransmitted, but in my case a new configuration was formed.
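If I understand it correctly, the token is retransmitted a number of times first, and only when it is still not seen within the token timeout does corosync give up and form a new configuration. That behaviour should be governed by the totem timing settings in corosync.conf, roughly like this (the values below are only illustrative, not my actual config):

totem {
        version: 2

        # Time (in ms) to wait for the token before it is declared lost
        # and a new configuration is formed (I believe the default is 1000 ms).
        token: 5000

        # Number of token retransmissions attempted before the token is
        # considered lost (default 4).
        token_retransmits_before_loss_const: 10

        # Timeout (in ms) for consensus before starting a new membership
        # round; should be larger than token (default 1.2 * token, I think).
        consensus: 6000
}

So as far as I can tell the retransmissions do happen, but once they all fail within the token timeout the "A processor failed, forming new configuration" message is logged and a new membership is formed. Is my understanding correct here?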
This is my corosync version: corosync-1.3.3-0.3.1

Thanks

2014-04-30 9:42 GMT+02:00 Jan Friesse <jfrie...@redhat.com>:

> Emmanuel,
> there is no need to trigger fencing on "Process pause detected...".
>
> Also, fencing is not triggered if the membership didn't change. So let's say
> the token was lost but during the gather state all nodes replied; then there
> is no change of membership and no need to fence.
>
> I believe your situation was:
> - one node is a little overloaded
> - token lost
> - overload over
> - gather state
> - every node is alive
> -> no fencing
>
> Regards,
> Honza
>
> emmanuel segura wrote:
> > Hello Jan,
> >
> > Forget the last mail:
> >
> > Hello Jan,
> >
> > I found this problem on two HP blade systems, and the strange thing is that
> > the fencing was not triggered :(, but it's enabled
> >
> > 2014-04-25 18:36 GMT+02:00 emmanuel segura <emi2f...@gmail.com>:
> >
> >> Hello Jan,
> >>
> >> I found this problem on two HP blade systems, and the strange thing is that
> >> the fencing was triggered :(
> >>
> >> 2014-04-25 9:27 GMT+02:00 Jan Friesse <jfrie...@redhat.com>:
> >>
> >>> Emanuel,
> >>>
> >>> emmanuel segura wrote:
> >>>
> >>>> Hello List,
> >>>>
> >>>> I have these two lines in my cluster logs; can somebody help me understand
> >>>> what they mean?
> >>>>
> >>>> corosync [TOTEM ] Process pause detected for 577 ms, flushing membership
> >>>> messages.
> >>>> corosync [TOTEM ] Process pause detected for 538 ms, flushing membership
> >>>> messages.
> >>>
> >>> Corosync internally checks the gap between member join messages. If such a
> >>> gap is > token/2, it means that corosync was not scheduled to run by the
> >>> kernel for too long, and it should discard membership messages.
> >>>
> >>> The original intent was to detect a paused process. If a pause is detected,
> >>> it's better to discard old membership messages and initiate a new query
> >>> than to send an outdated view.
> >>>
> >>> So there are various reasons why this is triggered, but today it's usually
> >>> a VM with an overloaded host machine.
> >>>
> >>>> corosync [TOTEM ] A processor failed, forming new configuration.
> >>>>
> >>>> I know the "corosync [TOTEM ] A processor failed, forming new
> >>>> configuration" message is logged when the token packet is definitely lost.
> >>>>
> >>>> Thanks
> >>>
> >>> Regards,
> >>> Honza

--
esta es mi vida e me la vivo hasta que dios quiera
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org