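$ cat /proc/sys/kernel/panic_on_oops

What does this return? If it is 0, that is the cause of the problem; it should be 1. With 0, a node that oopses keeps running instead of panicking, so the other nodes are left waiting for its death notification.
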
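The check above could be wrapped in a small shell function and run on each node. This is only a sketch: the function name check_panic_on_oops and its optional file argument are made up for illustration and are not part of OCFS2 or its tools. The sysctl key kernel.panic_on_oops and /etc/sysctl.conf are the standard places to change the value, but verify them against your distribution.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with OCFS2): reports whether the kernel
# is set to panic on an oops. The optional argument overrides the file to
# read, which is only useful for trying the function out.
check_panic_on_oops() {
    file="${1:-/proc/sys/kernel/panic_on_oops}"
    value=$(cat "$file") || return 2
    if [ "$value" = "1" ]; then
        echo "panic_on_oops=1: an oops will panic this node"
        return 0
    fi
    echo "panic_on_oops=$value: this node keeps running after an oops"
    return 1
}

# To change the setting on a running system (as root):
#   sysctl -w kernel.panic_on_oops=1
# To keep it across reboots, add this line to /etc/sysctl.conf:
#   kernel.panic_on_oops = 1
```

Calling check_panic_on_oops with no argument reads the live value; a non-zero exit status means the node would not panic on an oops.
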
David Murphy wrote:
> My logs on Node Id 3:
>
>
> Dec 16 06:44:03 web3 syslogd 1.5.0#1ubuntu1: restart.
> Dec 16 08:43:31 web3 kernel: [10727560.835261] Modules linked in: vmmemctl
> ocfs2 ocfs2_dlmfs ocfs2_dlm ocfs2_nodemanager configfs vmhgfs ext2
> dm_round_robin crc32c libcrc32c iscsi_tcp libiscsi scsi_transport_iscsi lp
> loop ipv6 parport_pc parport psmouse evdev serio_raw pcspkr i2c_piix4
> i2c_core container ac button intel_agp agpgart dm_multipath dm_mod ext3 jbd
> mbcache sr_mod cdrom sg sd_mod ata_piix pata_acpi floppy pcnet32 ata_generic
> mii mptspi mptscsih mptbase scsi_transport_spi libata scsi_mod thermal
> processor fan vmxnet vesafb fbcon tileblit font bitblit softcursor
> Dec 16 08:43:31 web3 kernel: [10727560.843108]
> Dec 16 08:43:31 web3 kernel: [10727560.843900] Pid: 4856, comm: o2net Not
> tainted (2.6.24-19-virtual #1)
> Dec 16 08:43:31 web3 kernel: [10727560.844724] EIP: 0062:[<f8e682bb>]
> EFLAGS: 00010202 CPU: 0
> Dec 16 08:43:31 web3 kernel: [10727560.845566] EIP is at
> __dlm_print_one_lock_resource+0x9db/0x9f0 [ocfs2_dlm]
> Dec 16 08:43:31 web3 kernel: [10727560.846385] EAX: 00000001 EBX: 0000001f
> ECX: 00000000 EDX: 00000000
> Dec 16 08:43:31 web3 kernel: [10727560.849779] ESI: f75e8c00 EDI: 00000000
> EBP: ec774700 ESP: df877d34
> Dec 16 08:43:31 web3 kernel: [10727560.851900] DS: 007b ES: 007b FS: 00d8
> GS: 0000 SS: 006a
> Dec 16 08:43:31 web3 kernel: [10727560.906502] ---[ end trace
> 989a5ffd1351fea4 ]---
> Dec 16 08:44:01 web3 kernel: [10727590.622434] o2net: connection to node
> deploy (num 5) at 192.168.102.12:7777 has been idle for 30.0 seconds,
> shutting it down.
> Dec 16 08:44:01 web3 kernel: [10727590.627319] (4,0):o2net_idle_timer:1414
> here are some times that might help debug the situation: (tmr
> 1229438611.731225 now 1229438641.727360 dr 1229438613.731191 adv
> 1229438611.731227:1229438611.731228 func (a9b6ebe7:504)
> 1229438600.868142:1229438600.868149)
> Dec 16 08:44:01 web3 kernel: [10727590.629281] o2net: connection to node
> app1 (num 6) at 192.168.102.10:7777 has been idle for 30.0 seconds, shutting
> it down.
> Dec 16 08:44:01 web3 kernel: [10727590.630630] (4,0):o2net_idle_timer:1414
> here are some times that might help debug the situation: (tmr
> 1229438611.731486 now 1229438641.734226 dr 1229438634.811356 adv
> 1229438611.731488:1229438611.731489 func (a9b6ebe7:502)
> 1229438610.482837:1229438610.482839)
> Dec 16 08:44:01 web3 kernel: [10727590.632818] o2net: connection to node
> rgapp1 (num 4) at 192.168.102.11:7777 has been idle for 30.0 seconds,
> shutting it down.
> Dec 16 08:44:01 web3 kernel: [10727590.634937] (4,0):o2net_idle_timer:1414
> here are some times that might help debug the situation: (tmr
> 1229438611.736146 now 1229438641.737771 dr 1229438613.756472 adv
> 1229438611.736149:1229438611.736149 func (a9b6ebe7:503)
> 1229438611.735983:1229438611.735988)
> Dec 16 08:44:01 web3 kernel: [10727590.640618] o2net: connection to node
> web1 (num 1) at 192.168.102.40:7777 has been idle for 30.0 seconds, shutting
> it down.
> Dec 16 08:44:01 web3 kernel: [10727590.642402] (4,0):o2net_idle_timer:1414
> here are some times that might help debug the situation: (tmr
> 1229438611.742904 now 1229438641.745604 dr 1229438617.734942 adv
> 1229438611.742907:1229438611.742907 func (a9b6ebe7:504)
> 1229438611.675070:1229438611.675075)
> Dec 16 08:44:01 web3 kernel: [10727590.651745] o2net: connection to node
> web2 (num 2) at 192.168.102.41:7777 has been idle for 30.0 seconds, shutting
> it down.
> Dec 16 08:44:01 web3 kernel: [10727590.657208] (0,0):o2net_idle_timer:1414
> here are some times that might help debug the situation: (tmr
> 1229438611.756791 now 1229438641.756770 dr 1229438641.756769 adv
> 1229438611.756768:1229438611.756697 func (a9b6ebe7:507)
> 1229438611.756792:1229438611.746230)
>
>
>
> The other nodes ended up locking up, waiting for the death notification
> of node 3.
> Can anyone tell me what the kernel message above means and what I can do to
> keep this from occurring again?
>
>
> Thanks
> David
>
>
> _______________________________________________
> Ocfs2-users mailing list
> Ocfs2-users@oss.oracle.com
> http://oss.oracle.com/mailman/listinfo/ocfs2-users