19.02.2014, 09:49, "Andrew Beekhof" <and...@beekhof.net>:
> On 19 Feb 2014, at 4:18 pm, Andrey Groshev <gre...@yandex.ru> wrote:
>
>> 19.02.2014, 09:08, "Andrew Beekhof" <and...@beekhof.net>:
>>> On 19 Feb 2014, at 4:00 pm, Andrey Groshev <gre...@yandex.ru> wrote:
>>>> 19.02.2014, 06:48, "Andrew Beekhof" <and...@beekhof.net>:
>>>>> On 18 Feb 2014, at 11:05 pm, Andrey Groshev <gre...@yandex.ru> wrote:
>>>>>> Hi, ALL and Andrew!
>>>>>>
>>>>>> Today was a good day: I killed a lot, and got shot at a lot.
>>>>>> In general, I am happy (almost like an elephant) :)
>>>>>> Apart from the resources, eight processes on the node matter to me:
>>>>>> corosync, pacemakerd, cib, stonithd, lrmd, attrd, pengine, crmd.
>>>>>> I killed them with different signals (4, 6, 11 and even 9).
>>>>>> The behavior does not depend on the signal number, which is good.
>>>>>> If STONITH sends a reboot to the node, it reboots and rejoins the
>>>>>> cluster, which is also good.
>>>>>> But the behavior differs depending on which daemon is killed.
>>>>>>
>>>>>> They fall into four groups:
>>>>>> 1. corosync, cib - STONITH works 100%.
>>>>>> Killing with any signal triggers STONITH and a reboot.
>>>>>>
>>>>>> 2. lrmd, crmd - strange STONITH behavior.
>>>>>> Sometimes STONITH is called, with the corresponding reaction.
>>>>>> Sometimes the daemon restarts and the MS:pgsql resource restarts
>>>>>> after a long delay.
>>>>>> Once, after crmd restarted, pgsql did not restart.
>>>>>>
>>>>>> 3. stonithd, attrd, pengine - STONITH not needed.
>>>>>> These daemons simply restart, and the resources stay running.
>>>>>>
>>>>>> 4. pacemakerd - nothing happens.
>>>>>> After that I can kill any process from the third group, and it does
>>>>>> not restart.
>>>>>> Generally, don't touch corosync, cib and maybe lrmd, crmd.
>>>>>>
>>>>>> What do you think about this?
>>>>>> The main question of this thread is settled.
>>>>>> But this varied behavior is another big problem.
>>>>>>
>>>>>> Forgot the logs: http://send2me.ru/pcmk-Tue-18-Feb-2014.tar.bz2
>>>>> Which of the various conditions above do the logs cover?
>>>> All of them, over the whole day.
>>> Are you trying to torture me?
>>> Can you give me a rough idea what happened when?
>> No, it is 8 processes times 4 signals, plus repeated experiments with
>> unknown outcomes :)
>> It is easier to run new experiments and collect separate logs for each.
>> Which variant is more interesting?
>
> The long delay in restarting pgsql.
> Everything else seems correct.
>
It didn't even try to start pgsql.
The logs contain three tests, each a kill -s4 on the lrmd pid:
1. STONITH
2. STONITH
3. hangs
http://send2me.ru/pcmk-Wed-19-Feb-2014.tar.bz2

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org