Hi all, I have a three-node cluster that runs KVM guests as services. The system ran fine for some months, but then it suddenly started to have soft lockups, as you can see below, and the nodes get fenced. The guests use clvm with raw LVs as the back end, and the config files are on shared gfs2 file systems. Any idea what could be the cause? I have also attached my cluster.conf.
Any idea is welcome.

Regards,
Chris

Pid: 136556, comm: corosync Not tainted 2.6.32-279.el6.x86_64 #1 HP ProLiant DL980 G7
RIP: 0010:[<ffffffff8104d08e>] [<ffffffff8104d08e>] wait_for_rqlock+0x2e/0x40
RSP: 0018:ffff881c12231ee8 EFLAGS: 00000206
RAX: 00000000e52ae4c7 RBX: ffff881c12231ee8 RCX: ffff882070e16680
RDX: 00000000e52ae4c7 RSI: ffff882070e11960 RDI: 0000000000000000
RBP: ffffffff8100bc0e R08: 0000000000000000 R09: dead000000200200
R10: ffff881c125830c0 R11: 00000000000000d2 R12: 0000000000000282
R13: ffffffff81aa5700 R14: ffff882070e11960 R15: ffff881c12583438
FS: 0000000000000000(0000) GS:ffff882070e00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000035a489a490 CR3: 0000000001a85000 CR4: 00000000000026e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process corosync (pid: 136556, threadinfo ffff881c12230000, task ffff881c12582aa0)
Stack:
 ffff881c12231f68 ffffffff8107091b ffff881c12231f78 ffff881c12231f28
<d> ffff881faf1d5660 ffff881c12582f68 ffff881c12582f68 0000000000000000
<d> ffff881c12231f28 ffff881c12231f28 ffff881c12231f78 00007f9ce339d440
Call Trace:
 [<ffffffff8107091b>] ? do_exit+0x5ab/0x870
 [<ffffffff81070ce7>] ? sys_exit+0x17/0x20
 [<ffffffff8100b0f2>] ? system_call_fastpath+0x16/0x1b
Code: e5 0f 1f 44 00 00 48 c7 c0 80 66 01 00 65 48 8b 0c 25 b0 e0 00 00 0f ae f0 48 01 c1 eb 09 0f 1f 80 00 00 00 00 f3 90 8b 01 89 c2 <c1> fa 10 66 39 c2 75 f2 c9 c3 0f 1f 84 00 00 00 00 00 55 48 89
Call Trace:
 [<ffffffff8107091b>] ? do_exit+0x5ab/0x870
 [<ffffffff81070ce7>] ? sys_exit+0x17/0x20
 [<ffffffff8100b0f2>] ? system_call_fastpath+0x16/0x1b
BUG: soft lockup - CPU#90 stuck for 67s! [multipathd:141345]
Modules linked in: iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 ip6table_filter ip6_tables iptable_filter ip_tables ebtable_nat ebtables gfs2 dlm configfs autofs4 sunrpc bridge bonding 8021q garp stp llc ipv6 ext2 vhost_net macvtap macvlan tun kvm_intel kvm microcode serio_raw power_meter be2net bnx2 netxen_nic iTCO_wdt iTCO_vendor_support hpilo hpwdt sg i7core_edac edac_core shpchp ext4 mbcache jbd2 dm_round_robin sr_mod cdrom sd_mod crc_t10dif lpfc scsi_transport_fc scsi_tgt pata_acpi ata_generic ata_piix hpsa radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core dm_multipath dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
cluster.conf
Description: Binary data
-- Linux-cluster mailing list Linux-cluster@redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster