Hi all,

I tried running corosync again on NetBSD 6 BETA. The build errors are
the same as before, and once started it does nothing except use 100%
CPU. Luckily, gdb now works in live mode with threads, and shows this:

(gdb) info th
  Id   Target Id         Frame
  5    LWP 1             0x00007f7ff68071e0 in ?? () from /usr/lib/libpthread.so.1
  4    LWP 2             0x00007f7ff6476e5a in ___lwp_park50 () from /usr/lib/libc.so.12
  3    LWP 3             0x00007f7ff643907a in poll () from /usr/lib/libc.so.12
  2    LWP 4             0x00007f7ff6476e5a in ___lwp_park50 () from /usr/lib/libc.so.12
* 1    LWP 0             0x00007f7ff6476e5a in ___lwp_park50 () from /usr/lib/libc.so.12
(gdb) thr 5
[Switching to thread 5 (LWP 1)]
#0  0x00007f7ff68071e0 in ?? () from /usr/lib/libpthread.so.1
(gdb) bt
#0  0x00007f7ff68071e0 in ?? () from /usr/lib/libpthread.so.1
#1  0x00007f7ff68075e8 in ?? () from /usr/lib/libpthread.so.1
#2  0x0000000000409947 in corosync_timer_add_duration (nanosec_duration=1500000000, data=0x0, timer_fn=0x4049b0 <corosync_totem_stats_updater>, handle=0x615518) at timer.c:221
#3  0x000000000040575c in corosync_totem_stats_init () at main.c:820
#4  main_service_ready () at main.c:1410
#5  0x00007f7ff781788b in main_iface_change_fn (context=0x7f7ff7b3c000, iface_addr=<optimized out>, iface_no=0) at totemsrp.c:4454
#6  0x00007f7ff7809473 in timer_function_netif_check_timeout (data=0x7f7ff7384000) at totemudp.c:1388
#7  0x00007f7ff7807780 in timerlist_expire (timerlist=0x7f7ff7b1b0d8) at tlist.h:309
#8  poll_run (handle=150346236434579456) at coropoll.c:526
#9  0x000000000040775a in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at main.c:1846

As you can see, corosync_timer_add_duration() calls into libpthread
and never returns, which looks like an infinite loop there. As a hack,
I commented out the entire function body:

int corosync_timer_add_duration (
        unsigned long long nanosec_duration,
        void *data,
        void (*timer_fn) (void *data),
        timer_handle *handle)
{
/*
        int res;
        int unlock;

        if (pthread_equal (pthread_self(), expiry_thread) != 0) {
                unlock = 0;
        } else {
                unlock = 1;
                pthread_mutex_lock (&timer_mutex);
        }

        res = timerlist_add_duration (
                &timers_timerlist,
                timer_fn,
                data,
                nanosec_duration,
                handle);

        if (unlock) {
                pthread_mutex_unlock (&timer_mutex);
        }

        pthread_kill (expiry_thread, SIGUSR1);

        return (res);
*/

 return 0;
}

With this change, the corosync service starts successfully and
interacts with its control tools.

Does anybody have an idea what could be wrong with the code above?

Stephan
