Hi,
We are using DLM for clustering (two nodes, single network interface between
these nodes). Occasionally in intermittent network failure we are getting
following kernel crash.
We are running linux 3.0.101 (x86_64) and DLM version is 1.7.
<3>[ 2410.184636] dlm: closing connection to node 33663168
<3>[ 2421.307880] dlm: reject connect from unknown addr
<4>[ 2421.307884] 02 00 00 00 c0 a8 01 02 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00
<0>[ 2421.307951] ------------[ cut here ]------------
<2>[ 2421.307958] kernel BUG at
/usr/src/packages/BUILD/dlm-1.6.fio/obj/default/lowcomms.c:715!
<0>[ 2421.307961] invalid opcode: 0000 [#1] SMP
<4>[ 2421.307963] CPU 14
<4>[ 2421.307964] Modules linked in: ipmi_si scst_vdisk(PFN) raid0 ip6t_REJECT
ip6t_LOG nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_raw xt_NOTRACK ipt_REJECT
xt_tcpudp xt_pkttype ipt_LOG xt_limit xt_state iptable_raw ip6table_mangle
nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_ipv4 nf_conntrack
nf_defrag_ipv4 ip6table_filter ip6_tables drbd(FN) mlx4_en qla2x00tgt(FN)
qla2xxx_scst(FN) scsi_transport_fc scsi_tgt scst(PFN) scst_mods(FN) dlm(FN)
configfs sctp(F) mlx4_core mpt3sas mpt2sas scsi_transport_sas raid_class mptctl
mptbase iptable_filter ip_tables x_tables ipmi_devintf ipmi_msghandler
dell_rbu(X) af_packet mperf binfmt_misc iomemory_vsl4(PFN) dm_mod sr_mod cdrom
tg3 shpchp ipv6 ptp usb_storage ahci sg pps_core joydev pci_hotplug libahci
ipv6_lib acpi_power_meter libata rtc_cmos dcdbas(X) container button wmi usbhid
hid ttm drm_kms_helper drm i2c_algo_bit sysimgblt sysfillrect i2c_core
syscopyarea btrfs zlib_deflate crc32c libcrc32c sd_mod crc_t10dif processor
thermal_sys ehci_hcd hwmon megaraid_sas scsi_dh_alua scsi_dh_emc scsi_dh_rdac
scsi_dh_hp_sw scsi_dh scsi_mod asix usbnet usbcore usb_common mii [last
unloaded: ipmi_si]
<4>[ 2421.308022] Supported: No, Proprietary and Unsupported modules are loaded
<4>[ 2421.308024]
<4>[ 2421.308025] Pid: 2599, comm: kworker/u:8 Tainted: PF NX
3.0.101-0.15.1.6651.0.PTF-default #1 Dell Inc. PowerEdge R730/0599V5
<4>[ 2421.308029] RIP: 0010:[<ffffffffa0712346>] [<ffffffffa0712346>]
receive_from_sock+0x376/0x380 [dlm]
<4>[ 2421.308038] RSP: 0018:ffff882f54067d40 EFLAGS: 00010246
<4>[ 2421.308039] RAX: 0000000000000158 RBX: ffff882ef3cacd98 RCX:
0000000000000000
<4>[ 2421.308041] RDX: 0000000000000158 RSI: ffff882f3797c480 RDI:
ffffffffa06e089d
<4>[ 2421.308043] RBP: ffff882f54067d90 R08: ffff882f54067ae0 R09:
ffff882f54067d50
<4>[ 2421.308044] R10: ffffffffa06f7260 R11: ffffffff8120f180 R12:
0000000000000158
<4>[ 2421.308046] R13: ffff882f54067d50 R14: 0000000000001000 R15:
ffff882ef3cacda8
<4>[ 2421.308048] FS: 0000000000000000(0000) GS:ffff88307f4e0000(0000)
knlGS:0000000000000000
<4>[ 2421.308050] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
<4>[ 2421.308053] CR2: 00007f3a7bb473b0 CR3: 0000002df7c6e000 CR4:
00000000001407e0
<4>[ 2421.308055] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
<4>[ 2421.308057] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
0000000000000400
<4>[ 2421.308059] Process kworker/u:8 (pid: 2599, threadinfo ffff882f54066000,
task ffff882f55360140)
<0>[ 2421.308060] Stack:
<4>[ 2421.308061] ffff882f54066010 0000000000011800 0000000000000000
0000000000000000
<4>[ 2421.308069] ffff882f54067dc0 0000000000000002 ffff882f54067dc0
0000000000000000
<4>[ 2421.308072] 0000000000000080 ffff882f5dcb4180 0000000000000030
0000000100000084
<0>[ 2421.308075] Call Trace:
<4>[ 2421.308108] [<ffffffffa071052e>] process_recv_sockets+0x1e/0x30 [dlm]
<4>[ 2421.308125] [<ffffffff8107b9cc>] process_one_work+0x16c/0x350
<4>[ 2421.308133] [<ffffffff8107e5fa>] worker_thread+0x17a/0x410
<4>[ 2421.308138] [<ffffffff81082966>] kthread+0x96/0xa0
<4>[ 2421.308143] [<ffffffff81469ee4>] kernel_thread_helper+0x4/0x10
<0>[ 2421.308146] Code: 00 88 ff ff 49 c1 e5 0c 49 8d 74 05 00 31 c0 e8 8b bf
d4 e0 4c 89 ff e8 29 d5 d4 e0 31 f6 48 89 df e8 af e9 ff ff e9 04 ff ff ff <0f>
0b eb fe 66 0f 1f 44 00 00 48 81 ec e8 00 00 00 31 ff be 50
<1>[ 2421.308170] RIP [<ffffffffa0712346>] receive_from_sock+0x376/0x380 [dlm]
<4>[ 2421.308175] RSP <ffff882f54067d40>
The line number lowcomms.c:715 points to the receive_from_sock() function,
where nodeid is being checked (and the received message is not a SCTP event).
644 /* Data received from remote end */
645 static int receive_from_sock(struct connection *con)
646 {
....
....
704
705 /* Process SCTP notifications */
706 if (msg.msg_flags & MSG_NOTIFICATION) {
707 msg.msg_control = incmsg;
708 msg.msg_controllen = sizeof(incmsg);
709
710 process_sctp_notification(con, &msg,
711 page_address(con->rx_page) + con->cb.base);
712 mutex_unlock(&con->sock_mutex);
713 return 0;
714 }
715 BUG_ON(con->nodeid == 0);
I am fairly new when it comes to understanding DLM code. We are using SCTP
protocol. If I understood correctly, nodeid = 0 points to the base connection
(associated with the listener socket). The function receive_from_sock() has an
assumption that if MSG_NOTIFICATION flag is not set, it got to be a peeled
socket (which has associated nodeid > 0). And vice versa - if MSG_NOTIFICATION
flag is set, it is listener socket with nodeid = 0.
But when process_sctp_notification() rejects a SCTP event message due addr to
nodeid mismatch (ie. dlm_addr-to_nodeid function returns non-zero), the
function returns without peeling off a new socket.
The code is shown below, where the function is returning from line number 579.
And the socket is peeled off at line number 588.
As the socket peeling off is not done, it is possible for listener socket
receiving ordinary data (which was meant for peeled socket) from the connection
where client already send some data (I am assuming client already sent this
data before the socket is shutdown at server end). And if listener socket
receives ordinary data, DLM is going to hit the "BUG_ON()" at lowcomms.c:715.
Please let me know if my analysis is correct.
517 static void process_sctp_notification(struct connection *con,
518 struct msghdr *msg, char *buf)
....
....
570 make_sockaddr(&prim.ssp_addr, 0, &addr_len);
571 if (dlm_addr_to_nodeid(&prim.ssp_addr, &nodeid)) {
572 int i;
573 unsigned char *b=(unsigned char
*)&prim.ssp_addr;
574 log_print("reject connect from unknown
addr");
575 for (i=0; i<sizeof(struct
sockaddr_storage);i++)
576 printk("%02x ", b[i]);
577 printk("\n");
578 sctp_send_shutdown(prim.ssp_assoc_id);
579 return;
580 }
581
582 new_con = nodeid2con(nodeid, GFP_NOFS);
583 if (!new_con)
584 return;
585
586 /* Peel off a new sock */
587 sctp_lock_sock(con->sock->sk);
588 ret = sctp_do_peeloff(con->sock->sk,
589 sn->sn_assoc_change.sac_assoc_id,
590 &new_con->sock);
591 sctp_release_sock(con->sock->sk);
________________________________
PLEASE NOTE: The information contained in this electronic mail message is
intended only for the use of the designated recipient(s) named above. If the
reader of this message is not the intended recipient, you are hereby notified
that you have received this message in error and that any review,
dissemination, distribution, or copying of this message is strictly prohibited.
If you have received this communication in error, please notify the sender by
telephone or e-mail (as shown above) immediately and destroy any and all copies
of this message in your possession (whether hard copies or electronically
stored copies).