On Mon, Jun 28, 2010 at 2:20 PM, Keisuke MORI <[email protected]> wrote:
> I've upgraded to pacemaker-1.0.9.1 / corosync-1.2.5 from clusterlabs on
> CentOS 5.5 using yum, but it still sometimes hangs on startup.
>
> The symptom is exactly the same as this:
> https://lists.linux-foundation.org/pipermail/openais/2010-June/014854.html
Arrgghhh!!!
Can you try the following patch?
Index: exec/main.c
===================================================================
--- exec/main.c.orig	2010-06-21 18:59:32.000000000 +0200
+++ exec/main.c	2010-06-29 08:29:48.834736539 +0200
@@ -425,20 +425,9 @@ static void corosync_tty_detach (void)
 	/*
 	 * Map stdin/out/err to /dev/null.
 	 */
-	fd = open("/dev/null", O_RDWR);
-	if (fd >= 0) {
-		/* dup2 to 0 / 1 / 2 (stdin / stdout / stderr) */
-		close (STDIN_FILENO);
-		close (STDOUT_FILENO);
-		close (STDERR_FILENO);
-		dup2(fd, STDIN_FILENO);  /* 0 */
-		dup2(fd, STDOUT_FILENO); /* 1 */
-		dup2(fd, STDERR_FILENO); /* 2 */
-
-		/* Should be 0, but just in case it isn't... */
-		if (fd > 2)
-			close(fd);
-	}
+	freopen("/dev/null", "r", stdin);
+	freopen("/dev/null", "a", stderr);
+	freopen("/dev/null", "a", stdout);
 }
 
 static void corosync_mlockall (void)
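
For anyone who wants to try the freopen() approach outside of corosync,
here is a minimal standalone sketch of the same idea. It is not the
corosync code: the function name detach_std_streams() is made up for
illustration, and unlike the patch it checks the return value of each
freopen() call, which is my own addition.

```c
#include <stdio.h>

/*
 * Point stdin, stdout and stderr at /dev/null with freopen(), in the
 * same order and with the same modes as the patch above.
 *
 * Hypothetical helper, not part of corosync. Returns 0 if all three
 * streams were reopened, -1 if any freopen() call failed.
 */
int detach_std_streams(void)
{
	int rc = 0;

	if (freopen("/dev/null", "r", stdin) == NULL)
		rc = -1;
	if (freopen("/dev/null", "a", stderr) == NULL)
		rc = -1;
	if (freopen("/dev/null", "a", stdout) == NULL)
		rc = -1;

	return rc;
}
```

A caller would invoke detach_std_streams() once after detaching from the
terminal. After a successful call, reads from stdin should hit end-of-file
right away and anything written to stdout or stderr is discarded.
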
>
> Any hints on what I should look into further?
> I have the core with me (taken by gcore) so if you want to look into
> it then I can show you.
>
> Here is the backtrace.
>
> ----8<--------8<--------8<--------8<--------8<--------8<--------8<----
> [r...@pm01 ~]# gdb /usr/sbin/corosync core.2596
> (...)
> (gdb) where
> #0 0x000000377a607b35 in pthread_join () from /lib64/libpthread.so.0
> #1 0x00002b12ea5528d9 in logsys_atexit () at logsys.c:1642
> #2 0x0000000000405a85 in sigsegv_handler (num=<value optimized out>)
> at main.c:222
> #3 <signal handler called>
> #4 0x0000003779a9a2fa in fork () from /lib64/libc.so.6
> #5 0x00002aaaaaba84de in spawn_child () from
> /usr/libexec/lcrso/pacemaker.lcrso
> #6 0x00002aaaaabacb9b in pcmk_startup () from
> /usr/libexec/lcrso/pacemaker.lcrso
> #7 0x00000000004082c9 in corosync_service_link_and_init
> (corosync_api=0x613900, service_name=0x1f76c850 "pacemaker",
> service_ver=0) at service.c:201
> #8 0x0000000000408673 in corosync_service_defaults_link_and_init
> (corosync_api=0x613900) at service.c:534
> #9 0x0000000000405086 in main_service_ready () at main.c:1224
> #10 0x00002b12ea332425 in main_iface_change_fn
> (context=0x2aaaaaaae010, iface_addr=<value optimized out>,
> iface_no=<value optimized out>) at totemsrp.c:4363
> #11 0x00002b12ea3291a7 in timer_function_netif_check_timeout
> (data=0x1f793520) at totemudp.c:1380
> #12 0x00002b12ea326459 in timerlist_expire (handle=150346236434579456)
> at tlist.h:309
> #13 poll_run (handle=150346236434579456) at coropoll.c:448
> #14 0x0000000000406693 in main (argc=<value optimized out>,
> argv=<value optimized out>) at main.c:1576
> (gdb) q
> [r...@pm01 ~]# rpm -qa | grep corosync
> corosync-1.2.5-1.3.el5
> corosynclib-1.2.5-1.3.el5
> corosync-debuginfo-1.2.5-1.3.el5
> [r...@pm01 ~]# rpm -qa | grep pacemaker
> drbd-pacemaker-8.3.8-1
> pacemaker-libs-1.0.9.1-1.el5
> pacemaker-1.0.9.1-1.el5
> [r...@pm01 ~]# pgrep -lf corosync
> 2589 corosync
> 2596 corosync
> [r...@pm01 ~]# pgrep -lf heartbeat
> 2595 /usr/lib64/heartbeat/stonithd
> 2597 /usr/lib64/heartbeat/lrmd
> 2599 /usr/lib64/heartbeat/pengine
> 3473 /usr/lib64/heartbeat/crmd
>
> ----8<--------8<--------8<--------8<--------8<--------8<--------8<----
>
> Regards,
> --
> Keisuke MORI
> _______________________________________________
> Openais mailing list
> [email protected]
> https://lists.linux-foundation.org/mailman/listinfo/openais
>