Hi Ryan/Steven,

As of corosync-1.2.8/openais-1.1.4, there seems to be a race in the
lck.c cleanup code. I am simply running openais/test/testlck against a
1-node cluster, and when testlck exits, corosync segfaults as shown
below. It appears that by the time this code is reached,
req_exec_lck_resourceclose->source.conn has already been
deallocated/released and contains garbage.

This did not happen with corosync-1.1.2/openais-1.1.0. There was a patch
in this area last year
(http://marc.info/?l=openais&m=124707755231826&w=2); I am not sure
whether it triggered this behavior.

Commenting out the cleanup code in
message_handler_req_exec_lck_resourceclose avoids the crash, but of
course causes a resource leak.

Could you please give me some pointers as to how to debug this further?

Also, I've noticed that a patch recommended for FreeBSD
(http://marc.info/?l=openais&m=128922243926782&w=2) should definitely be
applied on Linux as well, since the client trips on this assert from
time to time (albeit considerably less frequently than the above issue,
which happens about 90% of the time).

Thanks in advance,
KM

-------------------------------------------------8<------------------------------------------------
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f5962627700 (LWP 3046)]
lck_resource_cleanup_find (message=0x7fff74145aa0, nodeid=<value optimized out>) at lck.c:1603
1603                    if (mar_name_match (resource_name, &cleanup->resource_name)) {
(gdb) bt
#0  lck_resource_cleanup_find (message=0x7fff74145aa0, nodeid=<value optimized out>) at lck.c:1603
#1  message_handler_req_exec_lck_resourceclose (message=0x7fff74145aa0, nodeid=<value optimized out>) at lck.c:2309
#2  0x00000000004073a0 in deliver_fn (nodeid=1, msg=0x7fff74145aa0, msg_len=<value optimized out>, endian_conversion_required=0)
    at main.c:771
#3  0x00007f5962a555ef in app_deliver_fn (nodeid=1, msg=<value optimized out>, msg_len=<value optimized out>,
    endian_conversion_required=0) at totempg.c:506
#4  0x00007f5962a55b73 in totempg_deliver_fn (nodeid=1, msg=0x1f43a12, msg_len=0, endian_conversion_required=0) at totempg.c:618
#5  0x00007f5962a4d94f in messages_deliver_to_app (instance=0x7f59607a4010, skip=0, end_point=<value optimized out>)
    at totemsrp.c:3701
#6  0x00007f5962a53954 in message_handler_orf_token (instance=<value optimized out>, msg=<value optimized out>,
    msg_len=<value optimized out>, endian_conversion_needed=<value optimized out>) at totemsrp.c:3575
#7  0x00007f5962a49b83 in rrp_deliver_fn (context=0x1efe070, msg=0x1f2347c, msg_len=71) at totemrrp.c:1393
#8  0x00007f5962a48a76 in net_deliver_fn (handle=<value optimized out>, fd=<value optimized out>, revents=<value optimized out>,
    data=0x1f22db0) at totemudp.c:1244
#9  0x00007f5962a447f2 in poll_run (handle=6344401509261770752) at coropoll.c:510
#10 0x0000000000406add in main (argc=<value optimized out>, argv=<value optimized out>, envp=<value optimized out>) at main.c:1680
(gdb) print resource_name
$1 = (const mar_name_t *) 0x7fff74145ac0
(gdb) print *resource_name
$2 = {length = 19, value = "test_resource_async", '\000' <repeats 236 times>}
(gdb) list
1598                 cleanup_list != &lck_pd->resource_cleanup_list;
1599                 cleanup_list = cleanup_list->next)
1600            {
1601                    cleanup = list_entry (cleanup_list, struct resource_cleanup, cleanup_list);
1602
1603                    if (mar_name_match (resource_name, &cleanup->resource_name)) {
1604                            return (cleanup);
1605                    }
1606            }
1607            return (0);
(gdb) print cleanup_list
$4 = (struct list_head *) 0x6f74206465646461
(gdb) print lck_pd
$5 = (struct lck_pd *) 0x1f3dfb0
(gdb) print *lck_pd
$6 = {resource_list = {next = 0x206465747361636d, prev = 0x206567617373656d}, resource_cleanup_list = {next = 0x6f74206465646461,
    prev = 0x676e69646e657020}}
_______________________________________________
Openais mailing list
Openais@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/openais
