When the panic happened, 3 nodes crashed, and the load-balancing director could no longer ping those 3 nodes.
The network is Gigabit Ethernet (GE). We have not seen any dropped packets, and we have used the system for several months. As your analysis suggests, there may be a bug here. Can you fix it? Thanks for your help.

The /var/log/messages from another node is posted below:

Dec 24 01:23:56 redhat4 kernel: Unable to handle kernel paging request at virtual address 4092ed17
Dec 24 01:23:56 redhat4 kernel: printing eip:
Dec 24 01:23:56 redhat4 kernel: f89eb32d
Dec 24 01:23:56 redhat4 kernel: *pde = 00000000
Dec 24 01:23:56 redhat4 kernel: Oops: 0002 [#1]
Dec 24 01:23:56 redhat4 kernel: SMP
Dec 24 01:23:56 redhat4 kernel: Modules linked in: pvfs2(U) md5 ipv6 parport_pc lp parport autofs4 i2c_dev i2c_core sunrpc dm_mirror dm_multipath dm_mod button battery ac uhci_hcd e1000 floppy ext3 jbd qla2300 qla2xxx scsi_transport_fc mptscsih mptbase sd_mod scsi_mod
Dec 24 01:23:56 redhat4 kernel: CPU: 0
Dec 24 01:23:56 redhat4 kernel: EIP: 0060:[<f89eb32d>] Tainted: GF VLI
Dec 24 01:23:56 redhat4 kernel: EFLAGS: 00010246 (2.6.9-22.ELsmp)
Dec 24 01:23:56 redhat4 kernel: EIP is at pvfs2_devreq_read+0x1d6/0x363 [pvfs2]
Dec 24 01:23:56 redhat4 kernel: eax: 4092ed17 ebx: f744bcc0 ecx: f011431c edx: f6824280
Dec 24 01:23:56 redhat4 kernel: esi: f0116874 edi: f744bcd0 ebp: fffffe50 esp: f5b34f20
Dec 24 01:23:56 redhat4 kernel: ds: 007b es: 007b ss: 0068
Dec 24 01:23:56 redhat4 kernel: Process pvfs2-client-co (pid: 4147, threadinfo=f5b34000 task=f7220830)
Dec 24 01:23:56 redhat4 kernel: Stack: f0114320 f0116850 f011431c 00000000 00000294 116768e0 f5b34fa0 00000246
Dec 24 01:23:56 redhat4 kernel: f5b34fa0 f89ebe64 f5e4fe80 00000145 c016a866 f5b34f70 f6729200 0000000b
Dec 24 01:23:56 redhat4 kernel: f5b34fa0 00000292 f1ee600c 00000246 f1ee6008 00000000 f89fa4c0 f5e4fe80
Dec 24 01:23:56 redhat4 kernel: Call Trace:
Dec 24 01:23:56 redhat4 kernel: [<f89ebe64>] pvfs2_devreq_poll+0x47/0x4c [pvfs2]
Dec 24 01:23:56 redhat4 kernel: [<c016a866>] do_pollfd+0x54/0x77
Dec 24 01:23:56 redhat4 kernel: [<c0159c61>] vfs_read+0xb6/0xe2
Dec 24 01:23:56 redhat4 kernel: [<c0159e74>] sys_read+0x3c/0x62
Dec 24 01:23:56 redhat4 kernel: [<c02d10cf>] syscall_call+0x7/0xb
Dec 24 01:23:56 redhat4 kernel: [<c02d007b>] __lock_text_end+0x11a/0x100f
Dec 24 01:23:56 redhat4 kernel: Code: 0c 8d 7b 10 81 c6 58 25 00 00 89 c5 89 f8 e8 30 49 8e c7 8b 03 8d 14 e8 8b 42 04 89 72 04 8b 4c 24 08 89 46 04 89 91 58 25 00 00 <89> 30 89 f8 e8 80 49 8e c7 8b 44 24 04 e8 77 49 8e c7 c7 44 24
Dec 24 01:23:56 redhat4 kernel: <0>Fatal exception: panic in 5 seconds

----- Original Message -----
From: "Rob Ross" <[EMAIL PROTECTED]>
To: "jiang yi" <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Sent: Tuesday, January 09, 2007 1:51 PM
Subject: Re: [Pvfs2-users] Help! PVFS2 cause kernel panic

> Hi,
>
> Were the servers still up and running when this happened?
>
> What's your network? Have you seen dropped packets etc?
>
> We're not used to seeing applications that are so persistent in
> continuing to perform I/O in the presence of failures, so there may be a
> bug here that is being uncovered by this continuous reading in an error
> case...
>
> Thanks for the report!
>
> Rob
>
> jiang yi wrote:
> > HI Murali,
> > I use the RTSP over PVFS2 filesystem(pvfs2-1.5.1). Build a Media server
> > system with 4 computer nodes and 2 io nodes.
> > But these days the PVFS2 cause 3 computer nodes kernel panic. help me!
> >
> > This is any of the log in /var/log/messages.
> > Help me analyse this and tell me how to fix the problem. Or I will die ^_^
> > Thank you so much!!
> >
> >
> > Aug 29 14:32:46 redhat2 kernel: pvfs2: pvfs2_file_read -- wait timed out
> > and retries exhausted. aborting attempt.
> > Aug 29 14:32:46 redhat2 kernel: pvfs2_file_read: error writing to handle
> > 1046588, -- returning -110
> > Aug 29 17:17:53 redhat2 kernel: pvfs2: pvfs2_file_read -- wait timed out
> > and retries exhausted. aborting attempt.
> > Aug 29 17:17:53 redhat2 kernel: pvfs2_file_read: error writing to handle
> > 1047656, -- returning -110
> > Aug 29 18:05:36 redhat2 kernel: pvfs2: pvfs2_file_read -- wait timed out
> > and retries exhausted. aborting attempt.
> > Aug 29 18:05:36 redhat2 kernel: pvfs2_file_read: error writing to handle
> > 1046827, -- returning -110
> > Aug 29 18:14:03 redhat2 kernel: pvfs2: pvfs2_file_read -- wait timed out
> > and retries exhausted. aborting attempt.
> > Aug 29 18:14:03 redhat2 kernel: pvfs2_file_read: error writing to handle
> > 1046590, -- returning -110
> > Aug 29 19:24:03 redhat2 kernel: pvfs2: pvfs2_file_read -- wait timed out
> > and retries exhausted. aborting attempt.
> > Aug 29 19:24:03 redhat2 kernel: pvfs2_file_read: error writing to handle
> > 1048431, -- returning -110
> > Aug 29 20:04:40 redhat2 kernel: pvfs2: pvfs2_file_read -- wait timed out
> > and retries exhausted. aborting attempt.
> > Aug 29 20:04:40 redhat2 kernel: pvfs2_file_read: error writing to handle
> > 1046679, -- returning -110
> > Sep 14 16:14:55 redhat2 kernel: pvfs2: pvfs2_file_read -- wait timed out
> > and retries exhausted. aborting attempt.
> > Sep 14 16:14:55 redhat2 kernel: pvfs2_file_read: error writing to handle
> > 1039037, -- returning -110
> > Sep 14 16:14:56 redhat2 kernel: pvfs2: pvfs2_file_read -- wait timed out
> > and retries exhausted. aborting attempt.
> > Sep 14 16:14:56 redhat2 kernel: pvfs2_file_read: error writing to handle
> > 1039037, -- returning -110
> > Sep 14 16:15:37 redhat2 kernel: pvfs2: pvfs2_file_read -- wait timed out
> > and retries exhausted. aborting attempt.
> > Sep 14 16:15:37 redhat2 kernel: pvfs2_file_read: error writing to handle
> > 1039037, -- returning -110
> > Sep 14 16:21:53 redhat2 kernel: pvfs2: pvfs2_file_read -- wait timed out
> > and retries exhausted. aborting attempt.
> > Sep 14 16:21:53 redhat2 kernel: pvfs2_file_read: error writing to handle
> > 1039037, -- returning -110
> > Sep 14 16:23:04 redhat2 kernel: pvfs2: pvfs2_file_read -- wait timed out
> > and retries exhausted. aborting attempt.
> > Sep 14 16:23:04 redhat2 kernel: pvfs2_file_read: error writing to handle
> > 1042058, -- returning -110
> > Sep 14 16:27:53 redhat2 kernel: pvfs2: pvfs2_file_read -- wait timed out
> > and retries exhausted. aborting attempt.
> > Sep 14 16:27:53 redhat2 kernel: pvfs2_file_read: error writing to handle
> > 1039037, -- returning -110
> > Sep 14 17:15:39 redhat2 kernel: pvfs2: pvfs2_file_read -- wait timed out
> > and retries exhausted. aborting attempt.
> > Sep 14 17:15:39 redhat2 kernel: pvfs2_file_read: error writing to handle
> > 1039037, -- returning -110
> > Sep 14 17:15:52 redhat2 kernel: pvfs2: pvfs2_file_read -- wait timed out
> > and retries exhausted. aborting attempt.
> > Sep 14 17:15:52 redhat2 kernel: pvfs2_file_read: error writing to handle
> > 1039037, -- returning -110
> > Sep 14 17:16:09 redhat2 kernel: pvfs2: pvfs2_file_read -- wait timed out
> > and retries exhausted. aborting attempt.
> > Sep 14 17:16:09 redhat2 kernel: pvfs2_file_read: error writing to handle
> > 1039037, -- returning -110
> > Sep 14 17:16:10 redhat2 kernel: pvfs2: pvfs2_file_read -- wait timed out
> > and retries exhausted. aborting attempt.
> > Sep 14 17:16:10 redhat2 kernel: pvfs2_file_read: error writing to handle
> > 1039037, -- returning -110
> > Sep 14 17:21:52 redhat2 kernel: pvfs2: pvfs2_file_read -- wait timed out
> > and retries exhausted. aborting attempt.
> > Sep 14 17:21:52 redhat2 kernel: pvfs2_file_read: error writing to handle
> > 1039037, -- returning -110
> > Sep 14 17:27:59 redhat2 kernel: pvfs2: pvfs2_file_read -- wait timed out
> > and retries exhausted. aborting attempt.
> > Sep 14 17:27:59 redhat2 kernel: pvfs2_file_read: error writing to handle
> > 1039037, -- returning -110
> > Sep 14 17:28:12 redhat2 kernel: pvfs2: pvfs2_file_read -- wait timed out
> > and retries exhausted. aborting attempt.
> > Sep 14 17:28:12 redhat2 kernel: pvfs2_file_read: error writing to handle
> > 1039037, -- returning -110
> > Sep 14 17:39:26 redhat2 kernel: pvfs2: pvfs2_file_read -- wait timed out
> > and retries exhausted. aborting attempt.
> > Sep 14 17:39:26 redhat2 kernel: pvfs2_file_read: error writing to handle
> > 1039033, -- returning -110
> > Sep 14 17:43:06 redhat2 kernel: pvfs2: pvfs2_file_read -- wait timed out
> > and retries exhausted. aborting attempt.
> > Sep 14 17:43:06 redhat2 kernel: pvfs2_file_read: error writing to handle
> > 1043039, -- returning -110
> > Sep 14 18:08:54 redhat2 kernel: pvfs2: pvfs2_file_read -- wait timed out
> > and retries exhausted. aborting attempt.
> > Sep 14 18:08:54 redhat2 kernel: pvfs2_file_read: error writing to handle
> > 1040923, -- returning -110
> > Sep 14 18:18:53 redhat2 kernel: pvfs2: pvfs2_file_read -- wait timed out
> > and retries exhausted. aborting attempt.
> > Sep 14 18:18:53 redhat2 kernel: pvfs2_file_read: error writing to handle
> > 1040923, -- returning -110
> > Sep 14 18:27:07 redhat2 kernel: pvfs2: pvfs2_file_read -- wait timed out
> > and retries exhausted. aborting attempt.
> > Sep 14 18:27:07 redhat2 kernel: pvfs2_file_read: error writing to handle
> > 1042057, -- returning -110
> > Sep 14 18:33:52 redhat2 kernel: pvfs2: pvfs2_file_read -- wait timed out
> > and retries exhausted. aborting attempt.
> > Sep 14 18:33:52 redhat2 kernel: pvfs2_file_read: error writing to handle
> > 1039041, -- returning -110
> > Sep 14 18:33:53 redhat2 kernel: pvfs2: pvfs2_file_read -- wait timed out
> > and retries exhausted. aborting attempt.
> > Sep 14 18:33:53 redhat2 kernel: pvfs2_file_read: error writing to handle
> > 1039037, -- returning -110
> > Sep 14 18:34:01 redhat2 kernel: pvfs2: pvfs2_file_read -- wait timed out
> > and retries exhausted. aborting attempt.
> > Sep 14 18:34:01 redhat2 kernel: pvfs2_file_read: error writing to handle
> > 1038496, -- returning -110
> > Sep 14 18:50:06 redhat2 kernel: pvfs2: pvfs2_file_read -- wait timed out
> > and retries exhausted. aborting attempt.
> > Sep 14 18:50:06 redhat2 kernel: pvfs2_file_read: error writing to handle
> > 1039037, -- returning -110
> > Sep 14 19:03:36 redhat2 kernel: pvfs2: pvfs2_file_read -- wait timed out
> > and retries exhausted. aborting attempt.
> > Sep 14 19:03:36 redhat2 kernel: pvfs2_file_read: error writing to handle
> > 1038489, -- returning -110
> > Sep 14 20:34:49 redhat2 kernel: pvfs2: pvfs2_file_read -- wait timed out
> > and retries exhausted. aborting attempt.
> > Sep 14 20:34:49 redhat2 kernel: pvfs2_file_read: error writing to handle
> > 1038485, -- returning -110
> > Sep 14 20:34:51 redhat2 kernel: pvfs2: pvfs2_file_read -- wait timed out
> > and retries exhausted. aborting attempt.
> > Sep 14 20:34:51 redhat2 kernel: pvfs2_file_read: error writing to handle
> > 1038485, -- returning -110
> > Sep 14 20:34:56 redhat2 kernel: pvfs2: pvfs2_file_read -- wait timed out
> > and retries exhausted. aborting attempt.
> > Sep 14 20:34:56 redhat2 kernel: pvfs2_file_read: error writing to handle
> > 1038485, -- returning -110
> > Sep 14 21:16:55 redhat2 kernel: pvfs2: pvfs2_file_read -- wait timed out
> > and retries exhausted. aborting attempt.
> > Sep 14 21:16:55 redhat2 kernel: pvfs2_file_read: error writing to handle
> > 1038485, -- returning -110
> > Sep 14 21:25:52 redhat2 kernel: pvfs2: pvfs2_file_read -- wait timed out
> > and retries exhausted. aborting attempt.
> > Sep 14 21:25:52 redhat2 kernel: pvfs2_file_read: error writing to handle
> > 1039037, -- returning -110
> > Sep 14 21:33:09 redhat2 kernel: pvfs2: pvfs2_file_read -- wait timed out
> > and retries exhausted. aborting attempt.
> > Sep 14 21:33:09 redhat2 kernel: pvfs2_file_read: error writing to handle
> > 1038485, -- returning -110
> > Sep 14 21:52:30 redhat2 kernel: pvfs2: pvfs2_file_read -- wait timed out
> > and retries exhausted. aborting attempt.
> > Sep 14 21:52:30 redhat2 kernel: pvfs2_file_read: error writing to handle
> > 1039031, -- returning -110
> > Sep 14 22:17:05 redhat2 kernel: pvfs2: pvfs2_file_read -- wait timed out
> > and retries exhausted. aborting attempt.
> > Sep 14 22:17:05 redhat2 kernel: pvfs2_file_read: error writing to handle
> > 1038485, -- returning -110
> > Sep 14 22:17:45 redhat2 kernel: pvfs2: pvfs2_file_read -- wait timed out
> > and retries exhausted. aborting attempt.
> > Sep 14 22:17:45 redhat2 kernel: pvfs2_file_read: error writing to handle
> > 1038485, -- returning -110
> > Sep 30 14:29:14 redhat2 kernel: pvfs2: pvfs2_lookup -- wait timed out
> > and retries exhausted. aborting attempt.
> > Sep 30 14:30:18 redhat2 last message repeated 2 times
> > Sep 30 14:35:14 redhat2 kernel: pvfs2: pvfs2_lookup -- wait timed out
> > and retries exhausted. aborting attempt.
> > Sep 30 14:36:18 redhat2 last message repeated 2 times
> > Nov 2 22:16:08 redhat2 kernel: pvfs2: pvfs2_file_read -- wait timed out
> > and retries exhausted. aborting attempt.
> > Nov 2 22:16:08 redhat2 kernel: pvfs2_file_read: error writing to handle
> > 1038504, -- returning -110
> > Nov 12 14:54:57 redhat2 kernel: pvfs2_file_read: error writing to handle
> > 1036658, -- returning -2
> > Nov 12 14:55:00 redhat2 kernel: pvfs2_file_read: error writing to handle
> > 1036651, -- returning -2
> > Nov 12 14:57:00 redhat2 kernel: pvfs2_file_read: error writing to handle
> > 1036651, -- returning -2
> > Nov 12 14:57:04 redhat2 kernel: pvfs2_file_read: error writing to handle
> > 1036658, -- returning -2
> > Nov 12 14:57:42 redhat2 last message repeated 2 times
> > Nov 12 15:00:08 redhat2 kernel: pvfs2_file_read: error writing to handle
> > 1036651, -- returning -2
> > Nov 12 15:00:08 redhat2 last message repeated 2 times
> > Dec 5 12:51:01 redhat2 kernel: printk: 2 messages suppressed.
> > Dec 5 12:51:05 redhat2 kernel: printk: 1 messages suppressed.
> > Dec 5 12:51:20 redhat2 kernel: printk: 3 messages suppressed.
> > Dec 5 12:51:35 redhat2 kernel: printk: 2 messages suppressed.
> > Dec 5 12:52:26 redhat2 kernel: printk: 1 messages suppressed.
> > Dec 5 12:53:29 redhat2 kernel: printk: 2 messages suppressed.
> > Dec 5 12:53:36 redhat2 kernel: printk: 3 messages suppressed.
> > Dec 5 12:53:43 redhat2 kernel: printk: 2 messages suppressed.
> > Dec 5 12:53:52 redhat2 kernel: printk: 4 messages suppressed.
> > Dec 5 12:54:05 redhat2 kernel: printk: 3 messages suppressed.
> > Dec 5 12:54:08 redhat2 kernel: printk: 2 messages suppressed.
> > Dec 5 12:54:13 redhat2 kernel: printk: 2 messages suppressed.
> > Dec 5 12:54:19 redhat2 kernel: printk: 5 messages suppressed.
> > Dec 5 12:54:36 redhat2 kernel: printk: 4 messages suppressed.
> > Dec 5 12:54:44 redhat2 kernel: printk: 5 messages suppressed.
> > Dec 5 12:54:46 redhat2 kernel: printk: 1 messages suppressed.
> > Dec 5 12:55:00 redhat2 kernel: printk: 1 messages suppressed.
> > Dec 5 12:55:02 redhat2 kernel: printk: 3 messages suppressed.
> > Dec 5 12:55:07 redhat2 kernel: printk: 1 messages suppressed.
> > Dec 5 12:55:38 redhat2 kernel: printk: 3 messages suppressed.
> > Dec 24 16:54:54 redhat2 kernel: pvfs2: pvfs2_file_read -- wait timed out
> > and retries exhausted. aborting attempt.
> > Dec 24 16:54:54 redhat2 kernel: pvfs2_file_read: error writing to handle
> > 1042026, -- returning -110
> > Dec 24 17:04:46 redhat2 kernel: pvfs2: pvfs2_file_read -- wait timed out
> > and retries exhausted. aborting attempt.
> > Dec 24 17:04:46 redhat2 kernel: pvfs2_file_read: error writing to handle
> > 1036052, -- returning -110
> > Dec 26 10:49:33 redhat2 sshd(pam_unix)[15164]: authentication failure;
> > logname= uid=0 euid=0 tty=ssh ruser= rhost=10.130.222.133 user=root
> > Dec 26 10:50:51 redhat2 sshd(pam_unix)[15164]: 1 more authentication
> > failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=10.130.222.133
> > user=root
> > Dec 28 14:13:48 redhat2 kernel: printk: 1 messages suppressed.
> > Dec 28 14:14:26 redhat2 last message repeated 4 times
> > Dec 28 14:14:37 redhat2 kernel: printk: 2 messages suppressed.
> > Dec 28 14:14:46 redhat2 kernel: printk: 1 messages suppressed.
> > Dec 28 14:14:57 redhat2 kernel: printk: 2 messages suppressed.
> > Dec 28 14:42:22 redhat2 kernel: printk: 2 messages suppressed.
> > Dec 31 20:36:57 redhat2 kernel: Unable to handle kernel paging request
> > at virtual address 67c38e12
> > Dec 31 20:36:57 redhat2 kernel: printing eip:
> > Dec 31 20:36:57 redhat2 kernel: f89e363e
> > Dec 31 20:36:57 redhat2 kernel: *pde = 00000000
> > Dec 31 20:36:57 redhat2 kernel: Oops: 0000 [#1]
> > Dec 31 20:36:57 redhat2 kernel: SMP
> > Dec 31 20:36:57 redhat2 kernel: Modules linked in: ip_vs pvfs2(U) md5
> > ipv6 parport_pc lp parport autofs4 i2c_dev i2c_core sunrpc dm_mirror
> > dm_multipath dm_mod button battery ac uhci_hcd e1000 floppy ext3 jbd
> > qla2300 qla2xxx scsi_transport_fc mptscsih mptbase sd_mod scsi_mod
> > Dec 31 20:36:57 redhat2 kernel: CPU: 2
> > Dec 31 20:36:57 redhat2 kernel: EIP: 0060:[<f89e363e>] Not tainted VLI
> > Dec 31 20:36:57 redhat2 kernel: EFLAGS: 00010246 (2.6.9-22.ELsmp)
> > Dec 31 20:36:57 redhat2 kernel: EIP is at
> > pvfs2_devreq_writev+0x184/0x59b [pvfs2]
> > Dec 31 20:36:57 redhat2 kernel: eax: f5d66710 ebx: 67c38e12 ecx:
> > 000001fd edx: f6a00000
> > Dec 31 20:36:57 redhat2 kernel: esi: f5d66700 edi: fffffe50 ebp:
> > f56cced4 esp: f56cceb8
> > Dec 31 20:36:57 redhat2 kernel: ds: 007b es: 007b ss: 0068
> > Dec 31 20:36:57 redhat2 kernel: Process pvfs2-client-co (pid: 4151,
> > threadinfo=f56cc000 task=f6ff00b0)
> > Dec 31 20:36:57 redhat2 kernel: Stack: f56cceb4 000022a8 00000004
> > dd4fc42c dd4fc41c 00000004 f56ccf40 80000000
> > Dec 31 20:36:57 redhat2 kernel: 00000000 f765fe80 00000000
> > c02759c6 00000040 00000000 00000000 f56ccf40
> > Dec 31 20:36:57 redhat2 kernel: 00000001 00000000 00000000
> > 00000040 f63ccf48 00000a98 c0275a16 00000246
> > Dec 31 20:36:57 redhat2 kernel: Call Trace:
> > Dec 31 20:36:57 redhat2 kernel: [<c02759c6>] sock_readv_writev+0x5e/0x81
> > Dec 31 20:36:57 redhat2 kernel: [<c0275a16>] sock_readv+0x2d/0x34
> > Dec 31 20:36:57 redhat2 kernel: [<f89e34ba>]
> > pvfs2_devreq_writev+0x0/0x59b [pvfs2]
> > Dec 31 20:36:57 redhat2 kernel: [<c015a177>] do_readv_writev+0x19c/0x21d
> > Dec 31 20:36:57 redhat2 kernel: [<c015a276>] vfs_writev+0x3e/0x43
> > Dec 31 20:36:57 redhat2 kernel: [<c015a319>] sys_writev+0x3c/0x62
> > Dec 31 20:36:57 redhat2 kernel: [<c02d10cf>] syscall_call+0x7/0xb
> > Dec 31 20:36:57 redhat2 kernel: Code: b9 ff ff ff 5e e9 2e 04 00 00 8b
> > 35 c4 5a 9f f8 8d 6c 24 1c 89 e8 8b 56 04 ff 56 0c 89 c7 8d 46 10 e8 0c
> > c6 8e c7 8b 16 8b 1c fa <8b> 03 0f 18 00 90 8d 04 fa 39 c3 74 1c 89 da
> > 89 e8 ff 56 08 8b
> > Dec 31 20:36:57 redhat2 kernel: <0>Fatal exception: panic in 5 seconds
> >
> >
> >
> > ------------------------------------------------------------------------
> >
> > _______________________________________________
> > Pvfs2-users mailing list
> > [EMAIL PROTECTED]
> > http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
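
For anyone triaging logs like the ones quoted in this thread, it may help to count how often each file handle fails and with which error. The sketch below is only an illustration, not part of PVFS2: the function names (unquote, tally_errors) are hypothetical, the sample lines are excerpted from the quoted log, and the errno names assume standard Linux numbering (2 = ENOENT, 110 = ETIMEDOUT).

```python
import re
from collections import Counter

# Assumption: standard Linux errno numbering for the two codes seen in the log.
ERRNO_NAMES = {2: "ENOENT", 110: "ETIMEDOUT"}

# Matches the pvfs2_file_read error message once quote markers are removed
# and wrapped lines are joined back together.
ERROR_RE = re.compile(
    r"pvfs2_file_read: error writing to handle\s+(\d+), -- returning -(\d+)"
)


def unquote(text):
    """Strip leading '>' mail-quote markers and join wrapped lines with spaces."""
    lines = [re.sub(r"^(?:>\s?)+", "", line).strip() for line in text.splitlines()]
    return " ".join(line for line in lines if line)


def tally_errors(text):
    """Return a Counter keyed by (handle, errno name) for each logged error."""
    counts = Counter()
    for handle, code in ERROR_RE.findall(unquote(text)):
        name = ERRNO_NAMES.get(int(code), "errno " + code)
        counts[(int(handle), name)] += 1
    return counts


# Sample lines excerpted from the quoted /var/log/messages above.
sample = """\
> > Sep 14 16:14:55 redhat2 kernel: pvfs2_file_read: error writing to handle
> > 1039037, -- returning -110
> > Sep 14 16:14:56 redhat2 kernel: pvfs2_file_read: error writing to handle
> > 1039037, -- returning -110
> > Nov 12 14:54:57 redhat2 kernel: pvfs2_file_read: error writing to handle
> > 1036658, -- returning -2
"""

for (handle, name), n in tally_errors(sample).most_common():
    print(handle, name, n)
```

Run against the full log, a tally like this would show whether the timeouts cluster on a few handles (1039037 appears many times above) or are spread across many files.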
