Re: Crash on 2.6.21.7 Vanilla + DRBD 0.7
David Chinner wrote:
> Can you turn on slab debug and poisoning and see where
> the kernel fails with that? e.g. set:
>
> CONFIG_DEBUG_SLAB=y
> CONFIG_DEBUG_SLAB_LEAK=y

I was a little worried about leaving those servers in such a bad state, so I went the "easy" way: I upgraded from drbd 0.7.X to the latest svn 8.0.X.

Laurent

PS: Should this bug reappear, I'll change the kernel's config and let you know the result.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: Crash on 2.6.21.7 Vanilla + DRBD 0.7
On Thu, Oct 04, 2007 at 09:29:40AM +0200, Laurent Caron wrote:
>
> Hi,
>
> I did compile a fresh 2.6.21.7 kernel from kernel.org (no distro patch,
> ), and latest svn (3062) 0.7.X drbd.
>
> After just 2 days of uptime, I did experience another crash.
>
> I wonder if it is an XFS related bug, a DRBD one, or related to XFS on top of
> DRBD.
>
> This bug seems to occur with intensive IO operations.
>
> What do you think about it ?

This still looks like memory corruption of some sort. I'd suspect DRBD at
this point because nobody is reporting this against other block devices
in 2.6.21.

> Oct 3 18:55:23 kernel: Oops: 0002 [#1]
> Oct 3 18:55:23 kernel: SMP
> Oct 3 18:55:23 kernel: CPU: 7
> Oct 3 18:55:23 kernel: EIP: 0060:[c016540c] Not tainted VLI
> Oct 3 18:55:23 kernel: EFLAGS: 00010046 (2.6.21-dl380-g5-20071001 #1)
> Oct 3 18:55:23 kernel: EIP is at cache_alloc_refill+0x11c/0x4f0

Can you turn on slab debug and poisoning and see where
the kernel fails with that? e.g. set:

CONFIG_DEBUG_SLAB=y
CONFIG_DEBUG_SLAB_LEAK=y

Cheers,

Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group
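For anyone following along: after rebuilding, it can be worth confirming that the resulting .config really has both options enabled. The snippet below is only a minimal sketch in Python and not something from the thread. The function name `missing_options` and the sample config text are made up for illustration, and it assumes the usual .config layout where an enabled option appears as `NAME=y` and a disabled one as `# NAME is not set`.

```python
import re

# The two options requested in the message above.
REQUIRED = ("CONFIG_DEBUG_SLAB", "CONFIG_DEBUG_SLAB_LEAK")


def missing_options(config_text, required=REQUIRED):
    """Return the options from `required` that are not set to 'y'."""
    enabled = set()
    for line in config_text.splitlines():
        # Enabled built-in options look like "CONFIG_FOO=y".
        m = re.match(r"^(CONFIG_\w+)=y\s*$", line)
        if m:
            enabled.add(m.group(1))
    return [opt for opt in required if opt not in enabled]


# Hypothetical excerpt of a .config, for illustration only.
sample = """\
CONFIG_SLAB=y
CONFIG_DEBUG_SLAB=y
# CONFIG_DEBUG_SLAB_LEAK is not set
"""

print(missing_options(sample))
```

In practice the text would be read from the build tree's .config file instead of the inline sample; an empty list means both options are enabled.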
Crash on 2.6.21.7 Vanilla + DRBD 0.7
Hi,

I did compile a fresh 2.6.21.7 kernel from kernel.org (no distro patch, ), and latest svn (3062) 0.7.X drbd.

After just 2 days of uptime, I did experience another crash.

I wonder if it is an XFS related bug, a DRBD one, or related to XFS on top of DRBD.

This bug seems to occur with intensive IO operations.

What do you think about it ?

Thanks

Laurent

Oct 3 18:55:23 kernel: Oops: 0002 [#1]
Oct 3 18:55:23 kernel: SMP
Oct 3 18:55:23 kernel: CPU: 7
Oct 3 18:55:23 kernel: EIP: 0060:[c016540c] Not tainted VLI
Oct 3 18:55:23 kernel: EFLAGS: 00010046 (2.6.21-dl380-g5-20071001 #1)
Oct 3 18:55:23 kernel: EIP is at cache_alloc_refill+0x11c/0x4f0
Oct 3 18:55:23 kernel: eax: f79c2940 ebx: 0015 ecx: 0005 edx: 65b567b0
Oct 3 18:55:23 kernel: esi: 000a edi: d5d26000 ebp: f79d03c0 esp: d2531c98
Oct 3 18:55:23 kernel: ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068
Oct 3 18:55:23 kernel: Process rsync (pid: 22409, ti=d253 task=da1e8070 task.ti=d253)
Oct 3 18:55:23 kernel: Stack: 0010 02d0 ce9ca0b8 02d0 f79cfe00 f79d1c00 f79c2940
Oct 3 18:55:23 kernel: 0001 d2531cd4 ce9ca088 c022aade d5d2601c 0282 f79cfe00 02d0
Oct 3 18:55:23 kernel: f79cfe00 c01652e6 0001 c0265a4e 0011 d2531d60 d7acfb40
Oct 3 18:55:23 kernel: Call Trace:
Oct 3 18:55:23 kernel: [c022aade] xfs_da_brelse+0x6e/0xb0
Oct 3 18:55:23 kernel: [c01652e6] kmem_cache_alloc+0x46/0x50
Oct 3 18:55:23 kernel: [c0265a4e] kmem_zone_alloc+0x4e/0xc0
Oct 3 18:55:23 kernel: [c027015f] xfs_fs_alloc_inode+0xf/0x20
Oct 3 18:55:23 kernel: [c017bbd6] alloc_inode+0x16/0x170
Oct 3 18:55:23 kernel: [c017bd89] iget_locked+0x59/0x130
Oct 3 18:55:23 kernel: [c023fa38] xfs_iget+0x78/0x160
Oct 3 18:55:23 kernel: [c020a49c] xfs_acl_vget+0x6c/0x160
Oct 3 18:55:23 kernel: [c025b143] xfs_dir_lookup_int+0x93/0xf0
Oct 3 18:55:23 kernel: [c025ea55] xfs_lookup+0x75/0xa0
Oct 3 18:55:23 kernel: [c026d0c2] xfs_vn_lookup+0x52/0x90
Oct 3 18:55:23 kernel: [c016fd08] do_lookup+0x148/0x190
Oct 3 18:55:23 kernel: [c0171cb4] __link_path_walk+0x814/0xe40
Oct 3 18:55:23 kernel: [c0172325] link_path_walk+0x45/0xc0
Oct 3 18:55:23 kernel: [c0172581] do_path_lookup+0x81/0x1c0
Oct 3 18:55:23 kernel: [c01712c3] getname+0xb3/0xe0
Oct 3 18:55:23 kernel: [c0172f8b] __user_walk_fd+0x3b/0x60
Oct 3 18:55:23 kernel: [c016bcdf] vfs_lstat_fd+0x1f/0x50
Oct 3 18:55:23 kernel: [c016bd5f] sys_lstat64+0xf/0x30
Oct 3 18:55:23 kernel: [c01040b0] sysenter_past_esp+0x5d/0x81
Oct 3 18:55:23 kernel: ===
Oct 3 18:55:23 kernel: Code: 10 8b 77 14 01 c2 8b 44 24 30 8b 34 b0 89 77 14 89 54 8d 14 8d 51 01 89 55 00 8b 44 24 10 8b 77 10 3b 70 5c 72 c0 8b 17 8b 47 04 <89> 42 04 89 10 83 7f 14 ff c7 07 00 01 10 00 c7 47 04 00 02 20
Oct 3 18:55:23 kernel: EIP: [c016540c] cache_alloc_refill+0x11c/0x4f0 SS:ESP 0068:d2531c98
Oct 3 18:55:26 kernel: Oops: 0002 [#2]
Oct 3 18:55:26 kernel: SMP
Oct 3 18:55:26 kernel: CPU: 7
Oct 3 18:55:26 kernel: EIP: 0060:[c017bbe0] Not tainted VLI
Oct 3 18:55:26 kernel: EFLAGS: 00210282 (2.6.21-dl380-g5-20071001 #1)
Oct 3 18:55:26 kernel: EIP is at alloc_inode+0x20/0x170
Oct 3 18:55:26 kernel: eax: b4fd89ba ebx: b4fd89ba ecx: b4fd89ba edx: b4fd89ba
Oct 3 18:55:26 kernel: esi: f29bb000 edi: f29bb000 ebp: ca743575 esp: d6747c64
Oct 3 18:55:26 kernel: ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068
Oct 3 18:55:26 kernel: Process imapd (pid: 20054, ti=d6746000 task=e04a20b0 task.ti=d6746000)
Oct 3 18:55:26 kernel: Stack: c76fe0dc f29bb000 c017bd89 c04abda0 ca743575
Oct 3 18:55:26 kernel: ca743575 f53b5800 c023fa38 cb2b4524 1b2595f3 0020 f0dd7400 ded8b7a8
Oct 3 18:55:26 kernel: f53b5800 c04abda0 cb2b4524 cb2b4524 ca743575 0004
Oct 3 18:55:26 kernel: Call Trace:
Oct 3 18:55:26 kernel: [c017bd89] iget_locked+0x59/0x130
Oct 3 18:55:26 kernel: [c023fa38] xfs_iget+0x78/0x160
Oct 3 18:55:26 kernel: [c025a697] xfs_trans_iget+0x117/0x190
Oct 3 18:55:26 kernel: [c0243d87] xfs_ialloc+0xc7/0x570
Oct 3 18:55:26 kernel: [c024aabc] xlog_grant_push_ail+0x3c/0x150
Oct 3 18:55:26 kernel: [c025b261] xfs_dir_ialloc+0x81/0x2d0
Oct 3 18:55:26 kernel: [c025855b] xfs_trans_reserve+0xab/0x230
Oct 3 18:55:26 kernel: [c0261aa5] xfs_create+0x395/0x6a0
Oct 3 18:55:26 kernel: [c023eac5] xfs_iunlock+0x85/0xa0
Oct 3 18:55:26 kernel: [c026d6b5] xfs_vn_mknod+0x235/0x360
Oct 3 18:55:26 kernel: [c01705cd] vfs_create+0xdd/0x140
Oct 3 18:55:26 kernel: [c01738ae] open_namei+0x58e/0x5f0
Oct 3 18:55:26 kernel: [c016716e] do_filp_open+0x2e/0x60
Oct 3 18:55:26 kernel: [c0166e4f] get_unused_fd+0x4f/0xb0
Oct 3 18:55:26 kernel: [c01671ea] do_sys_open+0x4a/0xe0
Oct 3 18:55:26 kernel: [c01672bc] sys_open+0x1c/0x20
Oct 3 18:55:26 kernel: [c01040b0] sysenter_past_esp+0x5d/0x81
Oct 3 18:55:26 kernel: ===
Oct 3 18:55:26 kernel: Code: 90 90 90 90 90 90 90 90 90 90 90 57 56 89 c6 53 8b 40 20 8b 10 85 d2 0f 84 1e 01 00 00 89 f0 ff d2 89 c3 85 db 0f 84 ee 00 00 00 <89> b3 98 00 00 00 b9 02 00 00 00 0f b6 46 10 8d bb f8 00 00 00
Oct 3 18:55:26 kernel: EIP: [c017bbe0] alloc_inode+0x20/0x170 SS:ESP
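When several oops reports pile up in syslog like this, it can help to reduce each one to its number, faulting function and process before comparing crashes. The following is a rough sketch in Python, not part of the original report. The helper name `summarize` and the regular expressions are assumptions based only on the line layout seen in the log above, and the sample input is a handful of lines copied from it.

```python
import re

# Patterns modelled on the syslog lines quoted in this message.
OOPS_RE = re.compile(r"Oops: (\w+) \[#(\d+)\]")
EIP_RE = re.compile(r"EIP is at (\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")
PROC_RE = re.compile(r"Process (\S+) \(pid: (\d+)")


def summarize(log_lines):
    """Collect one dict per oops: number, error code, function, process."""
    reports = []
    for line in log_lines:
        m = OOPS_RE.search(line)
        if m:
            # A new "Oops:" line starts a new report.
            reports.append({"number": int(m.group(2)), "code": m.group(1)})
            continue
        if not reports:
            continue  # ignore anything before the first oops
        m = EIP_RE.search(line)
        if m:
            reports[-1]["function"] = m.group(1)
        m = PROC_RE.search(line)
        if m:
            reports[-1]["process"] = m.group(1)
            reports[-1]["pid"] = int(m.group(2))
    return reports


log = [
    "Oct 3 18:55:23 kernel: Oops: 0002 [#1]",
    "Oct 3 18:55:23 kernel: EIP is at cache_alloc_refill+0x11c/0x4f0",
    "Oct 3 18:55:23 kernel: Process rsync (pid: 22409, ti=d253 task=da1e8070 task.ti=d253)",
    "Oct 3 18:55:26 kernel: Oops: 0002 [#2]",
    "Oct 3 18:55:26 kernel: EIP is at alloc_inode+0x20/0x170",
    "Oct 3 18:55:26 kernel: Process imapd (pid: 20054, ti=d6746000 task=e04a20b0 task.ti=d6746000)",
]

for report in summarize(log):
    print(report)
```

For the two reports here this gives cache_alloc_refill under rsync and alloc_inode under imapd, both on the inode allocation path.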