On 06/22/2010 11:22 PM, Alain.Moulle wrote:
> Hi,
> With whatever release (i.e. currently with corosync-1.2.1-2.el6.x86_64),
> I always have trouble with the stop of corosync. And each
> time it failed when there were some failed actions reported
> by crm_mon.
> Regards
> Alain

Please give 1.2.5 a try. I am not familiar with crm warnings triggering
shutdown failures, but I can't make corosync+pacemaker lock up during
startup/shutdown for 2k iterations on single-CPU or multi-CPU systems.

Regards
-steve

>> On 06/22/2010 03:56 AM, Vadym Chepkov wrote:
>>> > Hi,
>>> >
>>> > I decided to check if I can start using corosync again on several of
>>> > my clusters (have to use heartbeat there at the moment).
>>> > I don't even have any services defined in corosync.conf, commented
>>> > pacemaker out, just plain corosync and it never goes down:
>>> >
>>> > # ps axf|grep corosync
>>> > 26294 pts/0 S+ 0:00 | \_ /bin/sh /sbin/service corosync restart
>>> > 26299 pts/0 S+ 0:01 | \_ /bin/bash /etc/init.d/corosync restart
>>> > 29249 pts/1 S+ 0:00 \_ grep corosync
>>> > 25959 ? Ssl 0:00 corosync
>>> >
>>> >
>>> > I attached to the process and this is where it hangs:
>>> >
>>> > (gdb) where
>>> > #0 0x0fe14134 in poll () from /lib/libc.so.6
>>> > #1 0x0ffbc530 in poll_run (handle=150346236434579456) at coropoll.c:413
>>> > #2 0x10006e50 in main (argc=<value optimized out>, argv=<value optimized out>) at main.c:1576
>>> >
>>> > How can I help to debug this problem?
>>> > It is 100% reproducible.
>>> >
>>> > Thank you,
>>> > Vadym
>>> > ________
>>
>> Vadym,
>>
>> Thanks for the feedback. I do test this scenario and it works for me:
>>
>> [r...@cast flatiron]# service corosync start
>> Starting Corosync Cluster Engine (corosync): [ OK ]
>> [r...@cast flatiron]# service corosync restart
>> Signaling Corosync Cluster Engine (corosync) to terminate: [ OK ]
>> Waiting for corosync services to unload:. [ OK ]
>> Starting Corosync Cluster Engine (corosync): [ OK ]
>> [r...@cast flatiron]# service corosync stop
>> Signaling Corosync Cluster Engine (corosync) to terminate: [ OK ]
>> Waiting for corosync services to unload:. [ OK ]
>> [r...@cast flatiron]# service corosync start
>> Starting Corosync Cluster Engine (corosync): [ OK ]
>> [r...@cast flatiron]# /etc/init.d/corosync restart
>> Signaling Corosync Cluster Engine (corosync) to terminate: [ OK ]
>> Waiting for corosync services to unload:. [ OK ]
>> Starting Corosync Cluster Engine (corosync): [ OK ]
>>
>>
>> One thing that would stop corosync from shutting down is if it couldn't
>> enter operational state. This often happens because of a firewall
>> enabled on the ports corosync uses to communicate.
>>
>> The system logs would be helpful (with debug: on).
>>
>> Regards
>> -steve
>
> _______________________________________________
> Openais mailing list
> [email protected]
> https://lists.linux-foundation.org/mailman/listinfo/openais
