Hello,
I'm still stress-testing Corosync/Openais (trunk) on FreeBSD, and I've found
a small bug:
On peer A, my test program calls:
- saClmInitialize()
- saClmClusterTrack(SA_TRACK_CURRENT | SA_TRACK_CHANGES)
- saClmFinalize()
- (does various tests with CPG ..)
Next, when I shut down or kill Corosync on peer B, Corosync on peer A segfaults.
When my test program calls saClmClusterTrackStop() before saClmFinalize(),
Corosync on peer A doesn't crash. From that and the stack trace (included below),
I guess it tries to signal the cluster change to a program that is no longer
connected (-> missing disconnection notification to CLM?). I also
guess this means that Corosync will segfault if the client itself crashes.
By the way, it's *not* due to a BSD-ism ;) (I've also tested on a small Debian
cluster).
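To make the guess above concrete, here is a minimal, self-contained sketch of the pattern I suspect. It is not the actual clm.c code: struct conn_pd, track_list, conn_open(), track_start(), track_stop(), conn_close() and notify_all() are hypothetical names that only model a service keeping a list of tracking clients. The point is that the connection teardown path unlinks the client from the tracking list before freeing its private data, so a later membership change never walks over a stale pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a service's per-connection private data. */
struct conn_pd {
    int tracking;          /* non-zero while the client wants change notifications */
    int notifications;     /* number of notifications delivered to this client */
    struct conn_pd *next;  /* link in the global tracking list */
};

/* Clients that asked to be told about membership changes. */
static struct conn_pd *track_list = NULL;

/* A client connects: allocate zeroed private data (NULL on failure). */
struct conn_pd *conn_open(void)
{
    return calloc(1, sizeof(struct conn_pd));
}

/* The client starts tracking: link it into the list once. */
void track_start(struct conn_pd *pd)
{
    assert(pd != NULL);
    if (!pd->tracking) {
        pd->tracking = 1;
        pd->next = track_list;
        track_list = pd;
    }
}

/* The client stops tracking: unlink it if it is on the list. */
void track_stop(struct conn_pd *pd)
{
    struct conn_pd **pp;

    assert(pd != NULL);
    for (pp = &track_list; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == pd) {
            *pp = pd->next;
            break;
        }
    }
    pd->tracking = 0;
    pd->next = NULL;
}

/*
 * The client finalizes or dies: unlink BEFORE freeing. Skipping the
 * track_stop() call here would leave a dangling pointer on track_list,
 * which is the situation suspected in this report.
 */
void conn_close(struct conn_pd *pd)
{
    assert(pd != NULL);
    track_stop(pd);
    free(pd);
}

/* A membership change: notify every tracker, return how many were notified. */
int notify_all(void)
{
    struct conn_pd *pd;
    int count = 0;

    for (pd = track_list; pd != NULL; pd = pd->next) {
        pd->notifications++;
        count++;
    }
    return count;
}
```

In this model, a client that calls conn_close() without an explicit track_stop() is still removed from the list, so a later notify_all() only touches live connections. Whether the real fix belongs in the CLM connection exit handler is an assumption on my side.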
The core of the crashed Corosync gives me the following stacktrace:
-----
(gdb) bt
#0 0x28501120 in library_notification_send
(cluster_notification_entries=0x3fbd36a0, notify_count=2) at clm.c:429
#1 0x285014ca in lib_notification_leave (nodes=0x3fbf76f0, nodes_entries=2) at
clm.c:524
#2 0x2850189c in clm_confchg_fn
(configuration_type=TOTEM_CONFIGURATION_TRANSITIONAL, member_list=0x3fbf8a14,
member_list_entries=1, left_list=0x3fbf7e14, left_list_entries=2,
joined_list=0x0, joined_list_entries=0, ring_id=0x2833765c) at clm.c:584
#3 0x0804ba7b in confchg_fn
(configuration_type=TOTEM_CONFIGURATION_TRANSITIONAL, member_list=0x3fbf8a14,
member_list_entries=1, left_list=0x3fbf7e14, left_list_entries=2,
joined_list=0x0,
joined_list_entries=0, ring_id=0x2833765c) at main.c:324
#4 0x280a2a2f in app_confchg_fn
(configuration_type=TOTEM_CONFIGURATION_TRANSITIONAL, member_list=0x3fbf8a14,
member_list_entries=1, left_list=0x3fbf7e14, left_list_entries=2,
joined_list=0x0, joined_list_entries=0, ring_id=0x2833765c) at totempg.c:350
#5 0x280a2932 in totempg_confchg_fn
(configuration_type=TOTEM_CONFIGURATION_TRANSITIONAL, member_list=0x3fbf8a14,
member_list_entries=1, left_list=0x3fbf7e14, left_list_entries=2,
joined_list=0x0, joined_list_entries=0, ring_id=0x2833765c) at totempg.c:524
#6 0x280a2343 in totemmrp_confchg_fn
(configuration_type=TOTEM_CONFIGURATION_TRANSITIONAL, member_list=0x3fbf8a14,
member_list_entries=1, left_list=0x3fbf7e14, left_list_entries=2,
joined_list=0x0, joined_list_entries=0, ring_id=0x2833765c) at
totemmrp.c:109
#7 0x2809ad3f in memb_state_operational_enter (instance=0x28316000) at
totemsrp.c:1678
#8 0x2809f890 in message_handler_orf_token (instance=0x28316000,
msg=0x28381638, msg_len=70, endian_conversion_needed=0) at totemsrp.c:3484
#9 0x280a20bb in main_deliver_fn (context=0x28316000, msg=0x28381638,
msg_len=70) at totemsrp.c:4212
#10 0x28095bb2 in none_token_recv (rrp_instance=0x282fe400, iface_no=0,
context=0x28316000, msg=0x28381638, msg_len=70, token_seq=3) at totemrrp.c:536
#11 0x28097849 in rrp_deliver_fn (context=0x28206190, msg=0x28381638,
msg_len=70) at totemrrp.c:1393
#12 0x28093cf5 in net_deliver_fn (handle=7749363892505018368, fd=7, revents=1,
data=0x28381000) at totemudp.c:1223
#13 0x28091d44 in poll_run (handle=7749363892505018368) at coropoll.c:394
#14 0x0804d432 in main (argc=2, argv=0x3fbfece4) at main.c:1069
(gdb) print *cluster_notification_entries
$1 = {cluster_node = {node_id = 2, node_address = {length = 11, family =
MAR_CLM_AF_INET, value = "172.16.10.2", '\0' <repeats 52 times>}, node_name =
{length = 11,
value = "172.16.10.2", '\0' <repeats 244 times>}, member = 0,
boot_timestamp = 1254723307000000000, initial_view_number = 617},
cluster_change = MAR_NODE_LEFT}
(gdb) print clm_pd
$2 = (struct clm_pd *) 0xc3fbbe18
(gdb) print *clm_pd
Cannot access memory at address 0xc3fbbe18
(gdb) info threads
* 3 Thread 0x28201040 (LWP 100188) 0x28501120 in library_notification_send
(cluster_notification_entries=0x3fbd36a0, notify_count=2) at clm.c:429
2 Thread 0x28201150 (LWP 100284) 0x2815fe3f in poll () at poll.S:2
1 Thread 0x282019d0 (LWP 100286) 0x2811f0fb in semop () at semop.S:2
-----
Hope it helps.