Re: Panic with XFS on RHEL5 (2.6.18-8.1.8.el5)

2007-08-20 Thread Chris Boot

Chris Boot wrote:

I'll probably just try and recompile the kernel with 8k stacks and see
how it goes. Screw the support, we're unlikely to get it anyway. :-P



Please report how this works out.
  


I will. This will probably be on Monday now, since the machine isn't 
accepting SysRq requests over the serial console. :-(


OK, with the recompiled kernel this appears to work just fine now. I've 
been pounding the box all day with rsyncs, VMware VMs, plenty of web 
serving (inc. SVN) and so far it's holding up just fine. Cheers for the 
diagnosis.
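For the record, a rebuild along these lines would do it (a sketch only: the source path and exact steps are assumptions, not the commands actually used on this box). Unsetting CONFIG_4KSTACKS on an i386 2.6.18 tree makes the kernel use 8k stacks again:

```shell
# Sketch: disable 4k stacks and rebuild. Path is an assumption.
cd /usr/src/linux-2.6.18
sed -i 's/^CONFIG_4KSTACKS=y/# CONFIG_4KSTACKS is not set/' .config
make oldconfig
make && make modules_install && make install
```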


Many thanks,
Chris

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Panic with XFS on RHEL5 (2.6.18-8.1.8.el5)

2007-08-18 Thread Jan Engelhardt

On Aug 18 2007 17:28, Chris Boot wrote:
>
> I will. This will probably be on Monday now, since the machine isn't
> accepting SysRq requests over the serial console. :-(

Ah yeah, stupid null-modem cables!
You can also trigger sysrq from /proc/sysrq-trigger (well, as long
as the system lives)

Jan
-- 


Re: Panic with XFS on RHEL5 (2.6.18-8.1.8.el5)

2007-08-18 Thread Chris Boot

Måns Rullgård wrote:

Chris Boot <[EMAIL PROTECTED]> writes:

  

Måns Rullgård wrote:


Chris Boot <[EMAIL PROTECTED]> writes:


  

All,

I've got a box running RHEL5 and haven't been impressed by ext3
performance on it (running off a 1.5TB HP MSA20 using the cciss
driver). I compiled XFS as a module and tried it out since I'm used to
using it on Debian, which runs much more efficiently. However, every
so often the kernel panics as below. Apologies for the tainted kernel,
but we run VMware Server on the box as well.

Does anyone have any hints/tips for using XFS on Red Hat? What's
causing the panic below, and is there a way around this?

BUG: unable to handle kernel paging request at virtual address b8af9d60
printing eip:
c0415974
*pde = 
Oops:  [#1]
SMP last sysfs file: /block/loop7/dev


[...]
  

[<f936884e>] xfsbufd_wakeup+0x28/0x49 [xfs]
[<c04572f9>] shrink_slab+0x56/0x13c
[<c0457c0c>] try_to_free_pages+0x162/0x23e
[<c0454064>] __alloc_pages+0x18d/0x27e
[<c045214e>] find_or_create_page+0x53/0x8c
[<c046c7b1>] __getblk+0x162/0x270
[<c0475be0>] do_lookup+0x53/0x157
[<f889138f>] ext3_getblk+0x7c/0x233 [ext3]
[<f88913fe>] ext3_getblk+0xeb/0x233 [ext3]
[<c048215c>] mntput_no_expire+0x11/0x6a
[<f889226e>] ext3_bread+0x13/0x69 [ext3]
[<f8895606>] htree_dirblock_to_tree+0x22/0x113 [ext3]
[<f889574f>] ext3_htree_fill_tree+0x58/0x1a0 [ext3]
[<c047828b>] do_path_lookup+0x20e/0x25f
[<c046b987>] get_empty_filp+0x99/0x15e
[<f889d611>] ext3_permission+0x0/0xa [ext3]
[<f888eaa3>] ext3_readdir+0x1ce/0x59b [ext3]
[<c047a0dd>] filldir+0x0/0xb9
[<c0472973>] sys_fstat64+0x1e/0x23
[<c047a1f9>] vfs_readdir+0x63/0x8d
[<c047a0dd>] filldir+0x0/0xb9
[<c047a447>] sys_getdents+0x5f/0x9c
[<c0403eff>] syscall_call+0x7/0xb
===



Your Redhat kernel is probably built with 4k stacks and XFS+loop+ext3
seems to be enough to overflow it.

  

Thanks, that explains a lot. However, I don't have any XFS filesystems
mounted over loop devices on ext3. Earlier in the day I had iso9660 on
loop on xfs, could that have caused the issue? It was unmounted and
deleted when this panic occurred.



The mention of /block/loop7/dev and the presence of both XFS and ext3
functions in the call stack suggested to me that you might have an ext3
filesystem in a loop device on XFS.  I see no explanation for that
call stack other than a stack overflow, but then we're still back at
the same root cause.

Are you using device-mapper and/or md?  They too are known to blow 4k
stacks when used with XFS.
  


I am. The situation earlier on was iso9660 on loop on xfs on lvm on 
cciss. I guess that might have smashed the stack undetectably and 
induced the corruption encountered later on? When I experienced this 
panic the machine would probably have been performing a backup, which 
was simply a load of ext3/xfs filesystems on lvm on the HP cciss 
controller. None of the loop devices would have been mounted.


I have a few machines now with 4k stacks and using lvm + md + xfs and 
have no trouble at all, but none are Red Hat (all Debian) and none use 
cciss either. Maybe it's a deadly combination.



I'll probably just try and recompile the kernel with 8k stacks and see
how it goes. Screw the support, we're unlikely to get it anyway. :-P



Please report how this works out.
  


I will. This will probably be on Monday now, since the machine isn't 
accepting SysRq requests over the serial console. :-(


Many thanks,
Chris



Re: Panic with XFS on RHEL5 (2.6.18-8.1.8.el5)

2007-08-18 Thread Måns Rullgård
Chris Boot <[EMAIL PROTECTED]> writes:

> Måns Rullgård wrote:
>> Chris Boot <[EMAIL PROTECTED]> writes:
>>
>>
>>> All,
>>>
>>> I've got a box running RHEL5 and haven't been impressed by ext3
>>> performance on it (running off a 1.5TB HP MSA20 using the cciss
>>> driver). I compiled XFS as a module and tried it out since I'm used to
>>> using it on Debian, which runs much more efficiently. However, every
>>> so often the kernel panics as below. Apologies for the tainted kernel,
>>> but we run VMware Server on the box as well.
>>>
>>> Does anyone have any hints/tips for using XFS on Red Hat? What's
>>> causing the panic below, and is there a way around this?
>>>
>>> BUG: unable to handle kernel paging request at virtual address b8af9d60
>>> printing eip:
>>> c0415974
>>> *pde = 
>>> Oops:  [#1]
>>> SMP last sysfs file: /block/loop7/dev
[...]
>>> [<f936884e>] xfsbufd_wakeup+0x28/0x49 [xfs]
>>> [<c04572f9>] shrink_slab+0x56/0x13c
>>> [<c0457c0c>] try_to_free_pages+0x162/0x23e
>>> [<c0454064>] __alloc_pages+0x18d/0x27e
>>> [<c045214e>] find_or_create_page+0x53/0x8c
>>> [<c046c7b1>] __getblk+0x162/0x270
>>> [<c0475be0>] do_lookup+0x53/0x157
>>> [<f889138f>] ext3_getblk+0x7c/0x233 [ext3]
>>> [<f88913fe>] ext3_getblk+0xeb/0x233 [ext3]
>>> [<c048215c>] mntput_no_expire+0x11/0x6a
>>> [<f889226e>] ext3_bread+0x13/0x69 [ext3]
>>> [<f8895606>] htree_dirblock_to_tree+0x22/0x113 [ext3]
>>> [<f889574f>] ext3_htree_fill_tree+0x58/0x1a0 [ext3]
>>> [<c047828b>] do_path_lookup+0x20e/0x25f
>>> [<c046b987>] get_empty_filp+0x99/0x15e
>>> [<f889d611>] ext3_permission+0x0/0xa [ext3]
>>> [<f888eaa3>] ext3_readdir+0x1ce/0x59b [ext3]
>>> [<c047a0dd>] filldir+0x0/0xb9
>>> [<c0472973>] sys_fstat64+0x1e/0x23
>>> [<c047a1f9>] vfs_readdir+0x63/0x8d
>>> [<c047a0dd>] filldir+0x0/0xb9
>>> [<c047a447>] sys_getdents+0x5f/0x9c
>>> [<c0403eff>] syscall_call+0x7/0xb
>>> ===
>>>
>>
>> Your Redhat kernel is probably built with 4k stacks and XFS+loop+ext3
>> seems to be enough to overflow it.
>>
> Thanks, that explains a lot. However, I don't have any XFS filesystems
> mounted over loop devices on ext3. Earlier in the day I had iso9660 on
> loop on xfs, could that have caused the issue? It was unmounted and
> deleted when this panic occurred.

The mention of /block/loop7/dev and the presence of both XFS and ext3
functions in the call stack suggested to me that you might have an ext3
filesystem in a loop device on XFS.  I see no explanation for that
call stack other than a stack overflow, but then we're still back at
the same root cause.

Are you using device-mapper and/or md?  They too are known to blow 4k
stacks when used with XFS.

> I'll probably just try and recompile the kernel with 8k stacks and see
> how it goes. Screw the support, we're unlikely to get it anyway. :-P

Please report how this works out.

-- 
Måns Rullgård
[EMAIL PROTECTED]



Re: Panic with XFS on RHEL5 (2.6.18-8.1.8.el5)

2007-08-18 Thread Chris Boot

Måns Rullgård wrote:

Chris Boot <[EMAIL PROTECTED]> writes:

  

All,

I've got a box running RHEL5 and haven't been impressed by ext3
performance on it (running off a 1.5TB HP MSA20 using the cciss
driver). I compiled XFS as a module and tried it out since I'm used to
using it on Debian, which runs much more efficiently. However, every
so often the kernel panics as below. Apologies for the tainted kernel,
but we run VMware Server on the box as well.

Does anyone have any hints/tips for using XFS on Red Hat? What's
causing the panic below, and is there a way around this?

BUG: unable to handle kernel paging request at virtual address b8af9d60
printing eip:
c0415974
*pde = 
Oops:  [#1]
SMP last sysfs file: /block/loop7/dev
Modules linked in: loop nfsd exportfs lockd nfs_acl iscsi_trgt(U)
autofs4 hidp nls_utf8 cifs ppdev rfcomm l2cap bluetooth vmnet(U)
vmmon(U) sunrpc ipv6 xfs(U) video sbs i2c_ec button battery asus_acpi
ac lp st sg floppy serio_raw intel_rng pcspkr e100 mii e7xxx_edac
i2c_i801 edac_mc i2c_core e1000 r8169 ide_cd cdrom parport_pc parport
dm_snapshot dm_zero dm_mirror dm_mod cciss mptspi mptscsih
scsi_transport_spi sd_mod scsi_mod mptbase ext3 jbd ehci_hcd ohci_hcd
uhci_hcd
CPU:1
EIP:0060:[<c0415974>]Tainted: P  VLI
EFLAGS: 00010046   (2.6.18-8.1.8.el5 #1) EIP is at
smp_send_reschedule+0x3/0x53
eax: c213f000   ebx: c213f000   ecx: eef84000   edx: c213f000
esi: 1086   edi: f668c000   ebp: f4f2fce8   esp: f4f2fc8c
ds: 007b   es: 007b   ss: 0068
Process crond (pid: 3146, ti=f4f2f000 task=f51faaa0 task.ti=f4f2f000)
Stack: 66d66b89 c041dc23  a9afbb0e fea5 01904500 
000f  0001 0001 c200c6e0 0100 
0069 0180 018fc500 c200d240 0003 0292 f601efc0
f6027e00  0050
Call Trace:
[<c041dc23>] try_to_wake_up+0x351/0x37b
[<f936884e>] xfsbufd_wakeup+0x28/0x49 [xfs]
[<c04572f9>] shrink_slab+0x56/0x13c
[<c0457c0c>] try_to_free_pages+0x162/0x23e
[<c0454064>] __alloc_pages+0x18d/0x27e
[<c045214e>] find_or_create_page+0x53/0x8c
[<c046c7b1>] __getblk+0x162/0x270
[<c0475be0>] do_lookup+0x53/0x157
[<f889138f>] ext3_getblk+0x7c/0x233 [ext3]
[<f88913fe>] ext3_getblk+0xeb/0x233 [ext3]
[<c048215c>] mntput_no_expire+0x11/0x6a
[<f889226e>] ext3_bread+0x13/0x69 [ext3]
[<f8895606>] htree_dirblock_to_tree+0x22/0x113 [ext3]
[<f889574f>] ext3_htree_fill_tree+0x58/0x1a0 [ext3]
[<c047828b>] do_path_lookup+0x20e/0x25f
[<c046b987>] get_empty_filp+0x99/0x15e
[<f889d611>] ext3_permission+0x0/0xa [ext3]
[<f888eaa3>] ext3_readdir+0x1ce/0x59b [ext3]
[<c047a0dd>] filldir+0x0/0xb9
[<c0472973>] sys_fstat64+0x1e/0x23
[<c047a1f9>] vfs_readdir+0x63/0x8d
[<c047a0dd>] filldir+0x0/0xb9
[<c047a447>] sys_getdents+0x5f/0x9c
[<c0403eff>] syscall_call+0x7/0xb
===



Your Redhat kernel is probably built with 4k stacks and XFS+loop+ext3
seems to be enough to overflow it.
  
Thanks, that explains a lot. However, I don't have any XFS filesystems 
mounted over loop devices on ext3. Earlier in the day I had iso9660 on 
loop on xfs, could that have caused the issue? It was unmounted and 
deleted when this panic occurred.


I'll probably just try and recompile the kernel with 8k stacks and see 
how it goes. Screw the support, we're unlikely to get it anyway. :-P


Many thanks,
Chris



Re: Panic with XFS on RHEL5 (2.6.18-8.1.8.el5)

2007-08-18 Thread Jan Engelhardt

On Aug 18 2007 13:31, Måns Rullgård wrote:
>>
>> BUG: unable to handle kernel paging request at virtual address b8af9d60
>> printing eip:
>> c0415974
>> *pde = 
>> Oops:  [#1]
>> SMP last sysfs file: /block/loop7/dev
>> Modules linked in: loop nfsd exportfs lockd nfs_acl iscsi_trgt(U)
>> autofs4 hidp nls_utf8 cifs ppdev rfcomm l2cap bluetooth vmnet(U)
>> vmmon(U) sunrpc ipv6 xfs(U) video sbs i2c_ec button battery asus_acpi
>> ac lp st sg floppy serio_raw intel_rng pcspkr e100 mii e7xxx_edac
>> i2c_i801 edac_mc i2c_core e1000 r8169 ide_cd cdrom parport_pc parport
>> dm_snapshot dm_zero dm_mirror dm_mod cciss mptspi mptscsih
>> scsi_transport_spi sd_mod scsi_mod mptbase ext3 jbd ehci_hcd ohci_hcd
>> uhci_hcd
>> CPU:1
>> EIP:0060:[<c0415974>]Tainted: P  VLI
>> EFLAGS: 00010046   (2.6.18-8.1.8.el5 #1) EIP is at
>> smp_send_reschedule+0x3/0x53
>> eax: c213f000   ebx: c213f000   ecx: eef84000   edx: c213f000
>> esi: 1086   edi: f668c000   ebp: f4f2fce8   esp: f4f2fc8c
>> ds: 007b   es: 007b   ss: 0068
>> Process crond (pid: 3146, ti=f4f2f000 task=f51faaa0 task.ti=f4f2f000)
>> Stack: 66d66b89 c041dc23  a9afbb0e fea5 01904500 
>> 000f  0001 0001 c200c6e0 0100 
>> 0069 0180 018fc500 c200d240 0003 0292 f601efc0
>> f6027e00  0050 Call Trace:
>> [<c041dc23>] try_to_wake_up+0x351/0x37b
>> [<f936884e>] xfsbufd_wakeup+0x28/0x49 [xfs]
>> [<c04572f9>] shrink_slab+0x56/0x13c
[...]
>
>Your Redhat kernel is probably built with 4k stacks and XFS+loop+ext3
>seems to be enough to overflow it.

I think we should include the vermagic string in oopses too, 
so that the flags SMP, PREEMPT, RT, 4KSTACKS, mod_unload, etc. are shown 
and the situation is a bit more apparent.
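A module's vermagic string already carries exactly those flags; for an on-disk module it can be inspected with modinfo (the module name here, and the sample string in the comment, are just illustrative):

```shell
# Print only the vermagic field of a module ('xfs' is an example name).
# On a 4k-stacks build the string would include 4KSTACKS, e.g.
# "2.6.18-8.1.8.el5 SMP mod_unload 686 REGPARM 4KSTACKS gcc-4.1".
modinfo -F vermagic xfs
```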



Jan
-- 

Re: Panic with XFS on RHEL5 (2.6.18-8.1.8.el5)

2007-08-18 Thread Måns Rullgård
Chris Boot <[EMAIL PROTECTED]> writes:

> All,
>
> I've got a box running RHEL5 and haven't been impressed by ext3
> performance on it (running off a 1.5TB HP MSA20 using the cciss
> driver). I compiled XFS as a module and tried it out since I'm used to
> using it on Debian, which runs much more efficiently. However, every
> so often the kernel panics as below. Apologies for the tainted kernel,
> but we run VMware Server on the box as well.
>
> Does anyone have any hints/tips for using XFS on Red Hat? What's
> causing the panic below, and is there a way around this?
>
> BUG: unable to handle kernel paging request at virtual address b8af9d60
> printing eip:
> c0415974
> *pde = 
> Oops:  [#1]
> SMP last sysfs file: /block/loop7/dev
> Modules linked in: loop nfsd exportfs lockd nfs_acl iscsi_trgt(U)
> autofs4 hidp nls_utf8 cifs ppdev rfcomm l2cap bluetooth vmnet(U)
> vmmon(U) sunrpc ipv6 xfs(U) video sbs i2c_ec button battery asus_acpi
> ac lp st sg floppy serio_raw intel_rng pcspkr e100 mii e7xxx_edac
> i2c_i801 edac_mc i2c_core e1000 r8169 ide_cd cdrom parport_pc parport
> dm_snapshot dm_zero dm_mirror dm_mod cciss mptspi mptscsih
> scsi_transport_spi sd_mod scsi_mod mptbase ext3 jbd ehci_hcd ohci_hcd
> uhci_hcd
> CPU:1
> EIP:0060:[<c0415974>]Tainted: P  VLI
> EFLAGS: 00010046   (2.6.18-8.1.8.el5 #1) EIP is at
> smp_send_reschedule+0x3/0x53
> eax: c213f000   ebx: c213f000   ecx: eef84000   edx: c213f000
> esi: 1086   edi: f668c000   ebp: f4f2fce8   esp: f4f2fc8c
> ds: 007b   es: 007b   ss: 0068
> Process crond (pid: 3146, ti=f4f2f000 task=f51faaa0 task.ti=f4f2f000)
> Stack: 66d66b89 c041dc23  a9afbb0e fea5 01904500 
> 000f  0001 0001 c200c6e0 0100 
> 0069 0180 018fc500 c200d240 0003 0292 f601efc0
> f6027e00  0050 Call Trace:
> [<c041dc23>] try_to_wake_up+0x351/0x37b
> [<f936884e>] xfsbufd_wakeup+0x28/0x49 [xfs]
> [<c04572f9>] shrink_slab+0x56/0x13c
> [<c0457c0c>] try_to_free_pages+0x162/0x23e
> [<c0454064>] __alloc_pages+0x18d/0x27e
> [<c045214e>] find_or_create_page+0x53/0x8c
> [<c046c7b1>] __getblk+0x162/0x270
> [<c0475be0>] do_lookup+0x53/0x157
> [<f889138f>] ext3_getblk+0x7c/0x233 [ext3]
> [<f88913fe>] ext3_getblk+0xeb/0x233 [ext3]
> [<c048215c>] mntput_no_expire+0x11/0x6a
> [<f889226e>] ext3_bread+0x13/0x69 [ext3]
> [<f8895606>] htree_dirblock_to_tree+0x22/0x113 [ext3]
> [<f889574f>] ext3_htree_fill_tree+0x58/0x1a0 [ext3]
> [<c047828b>] do_path_lookup+0x20e/0x25f
> [<c046b987>] get_empty_filp+0x99/0x15e
> [<f889d611>] ext3_permission+0x0/0xa [ext3]
> [<f888eaa3>] ext3_readdir+0x1ce/0x59b [ext3]
> [<c047a0dd>] filldir+0x0/0xb9
> [<c0472973>] sys_fstat64+0x1e/0x23
> [<c047a1f9>] vfs_readdir+0x63/0x8d
> [<c047a0dd>] filldir+0x0/0xb9
> [<c047a447>] sys_getdents+0x5f/0x9c
> [<c0403eff>] syscall_call+0x7/0xb
> ===

Your Redhat kernel is probably built with 4k stacks and XFS+loop+ext3
seems to be enough to overflow it.
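Whether a distribution kernel was built that way can usually be checked against the config file it ships (the /boot path is the common Red Hat/Debian convention, an assumption here):

```shell
# If this prints CONFIG_4KSTACKS=y, the running kernel uses 4k stacks
# and is prone to overflow under deep XFS + loop + ext3 call chains.
grep 4KSTACKS "/boot/config-$(uname -r)"
```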

-- 
Måns Rullgård
[EMAIL PROTECTED]



Panic with XFS on RHEL5 (2.6.18-8.1.8.el5)

2007-08-18 Thread Chris Boot

All,

I've got a box running RHEL5 and haven't been impressed by ext3 
performance on it (running off a 1.5TB HP MSA20 using the cciss driver). 
I compiled XFS as a module and tried it out since I'm used to using it 
on Debian, which runs much more efficiently. However, every so often the 
kernel panics as below. Apologies for the tainted kernel, but we run 
VMware Server on the box as well.


Does anyone have any hints/tips for using XFS on Red Hat? What's causing 
the panic below, and is there a way around this?


Many thanks,
Chris Boot

BUG: unable to handle kernel paging request at virtual address b8af9d60
printing eip:
c0415974
*pde = 
Oops:  [#1]
SMP 
last sysfs file: /block/loop7/dev
Modules linked in: loop nfsd exportfs lockd nfs_acl iscsi_trgt(U) 
autofs4 hidp nls_utf8 cifs ppdev rfcomm l2cap bluetooth vmnet(U) 
vmmon(U) sunrpc ipv6 xfs(U) video sbs i2c_ec button battery asus_acpi ac 
lp st sg floppy serio_raw intel_rng pcspkr e100 mii e7xxx_edac i2c_i801 
edac_mc i2c_core e1000 r8169 ide_cd cdrom parport_pc parport dm_snapshot 
dm_zero dm_mirror dm_mod cciss mptspi mptscsih scsi_transport_spi sd_mod 
scsi_mod mptbase ext3 jbd ehci_hcd ohci_hcd uhci_hcd

CPU:1
EIP:0060:[<c0415974>]Tainted: P  VLI
EFLAGS: 00010046   (2.6.18-8.1.8.el5 #1) 
EIP is at smp_send_reschedule+0x3/0x53

eax: c213f000   ebx: c213f000   ecx: eef84000   edx: c213f000
esi: 1086   edi: f668c000   ebp: f4f2fce8   esp: f4f2fc8c
ds: 007b   es: 007b   ss: 0068
Process crond (pid: 3146, ti=f4f2f000 task=f51faaa0 task.ti=f4f2f000)
Stack: 66d66b89 c041dc23  a9afbb0e fea5 01904500  
000f 
   0001 0001 c200c6e0 0100  0069 
0180 
  018fc500 c200d240 0003 0292 f601efc0 f6027e00  
0050 
Call Trace:

[<c041dc23>] try_to_wake_up+0x351/0x37b
[<f936884e>] xfsbufd_wakeup+0x28/0x49 [xfs]
[<c04572f9>] shrink_slab+0x56/0x13c
[<c0457c0c>] try_to_free_pages+0x162/0x23e
[<c0454064>] __alloc_pages+0x18d/0x27e
[<c045214e>] find_or_create_page+0x53/0x8c
[<c046c7b1>] __getblk+0x162/0x270
[<c0475be0>] do_lookup+0x53/0x157
[<f889138f>] ext3_getblk+0x7c/0x233 [ext3]
[<f88913fe>] ext3_getblk+0xeb/0x233 [ext3]
[<c048215c>] mntput_no_expire+0x11/0x6a
[<f889226e>] ext3_bread+0x13/0x69 [ext3]
[<f8895606>] htree_dirblock_to_tree+0x22/0x113 [ext3]
[<f889574f>] ext3_htree_fill_tree+0x58/0x1a0 [ext3]
[<c047828b>] do_path_lookup+0x20e/0x25f
[<c046b987>] get_empty_filp+0x99/0x15e
[<f889d611>] ext3_permission+0x0/0xa [ext3]
[<f888eaa3>] ext3_readdir+0x1ce/0x59b [ext3]
[<c047a0dd>] filldir+0x0/0xb9
[<c0472973>] sys_fstat64+0x1e/0x23
[<c047a1f9>] vfs_readdir+0x63/0x8d
[<c047a0dd>] filldir+0x0/0xb9
[<c047a447>] sys_getdents+0x5f/0x9c
[<c0403eff>] syscall_call+0x7/0xb
===
Code: 5d c3 b9 01 00 00 00 31 d2 6a 00 b8 f0 5a 41 c0 e8 2a ff ff ff fa 
e8 52 16 00 00 fb 58 c3 b8 54 3a 66 c0 e9 8e 6b 1e 00 53 89 c3 <0f> a3 
05 60 1f 6d c0 19 c0 85 c0 75 27 e8 bf db 00 00 50 68 55 
EIP: [<c0415974>] smp_send_reschedule+0x3/0x53 SS:ESP 0068:f4f2fc8c

<0>Kernel panic - not syncing: Fatal exception



Panic with XFS on RHEL5 (2.6.18-8.1.8.el5)

2007-08-18 Thread Chris Boot

All,

I've got a box running RHEL5 and haven't been impressed by ext3 
performance on it (running of a 1.5TB HP MSA20 using the cciss driver). 
I compiled XFS as a module and tried it out since I'm used to using it 
on Debian, which runs much more efficiently. However, every so often the 
kernel panics as below. Apologies for the tainted kernel, but we run 
VMware Server on the box as well.


Does anyone have any hits/tips for using XFS on Red Hat? What's causing 
the panic below, and is there a way around this?


Many thanks,
Chris Boot

BUG: unable to handle kernel paging request at virtual address b8af9d60
printing eip:
c0415974
*pde = 
Oops:  [#1]
SMP 
last sysfs file: /block/loop7/dev
Modules linked in: loop nfsd exportfs lockd nfs_acl iscsi_trgt(U) 
autofs4 hidp nls_utf8 cifs ppdev rfcomm l2cap bluetooth vmnet(U) 
vmmon(U) sunrpc ipv6 xfs(U) video sbs i2c_ec button battery asus_acpi ac 
lp st sg floppy serio_raw intel_rng pcspkr e100 mii e7xxx_edac i2c_i801 
edac_mc i2c_core e1000 r8169 ide_cd cdrom parport_pc parport dm_snapshot 
dm_zero dm_mirror dm_mod cciss mptspi mptscsih scsi_transport_spi sd_mod 
scsi_mod mptbase ext3 jbd ehci_hcd ohci_hcd uhci_hcd

CPU:1
EIP:0060:[c0415974]Tainted: P  VLI
EFLAGS: 00010046   (2.6.18-8.1.8.el5 #1) 
EIP is at smp_send_reschedule+0x3/0x53

eax: c213f000   ebx: c213f000   ecx: eef84000   edx: c213f000
esi: 1086   edi: f668c000   ebp: f4f2fce8   esp: f4f2fc8c
ds: 007b   es: 007b   ss: 0068
Process crond (pid: 3146, ti=f4f2f000 task=f51faaa0 task.ti=f4f2f000)
Stack: 66d66b89 c041dc23  a9afbb0e fea5 01904500  
000f 
   0001 0001 c200c6e0 0100  0069 
0180 
  018fc500 c200d240 0003 0292 f601efc0 f6027e00  
0050 
Call Trace:

[c041dc23] try_to_wake_up+0x351/0x37b
[f936884e] xfsbufd_wakeup+0x28/0x49 [xfs]
[c04572f9] shrink_slab+0x56/0x13c
[c0457c0c] try_to_free_pages+0x162/0x23e
[c0454064] __alloc_pages+0x18d/0x27e
[c045214e] find_or_create_page+0x53/0x8c
[c046c7b1] __getblk+0x162/0x270
[c0475be0] do_lookup+0x53/0x157
[f889138f] ext3_getblk+0x7c/0x233 [ext3]
[f88913fe] ext3_getblk+0xeb/0x233 [ext3]
[c048215c] mntput_no_expire+0x11/0x6a
[f889226e] ext3_bread+0x13/0x69 [ext3]
[f8895606] htree_dirblock_to_tree+0x22/0x113 [ext3]
[f889574f] ext3_htree_fill_tree+0x58/0x1a0 [ext3]
[c047828b] do_path_lookup+0x20e/0x25f
[c046b987] get_empty_filp+0x99/0x15e
[f889d611] ext3_permission+0x0/0xa [ext3]
[f888eaa3] ext3_readdir+0x1ce/0x59b [ext3]
[c047a0dd] filldir+0x0/0xb9
[c0472973] sys_fstat64+0x1e/0x23
[c047a1f9] vfs_readdir+0x63/0x8d
[c047a0dd] filldir+0x0/0xb9
[c047a447] sys_getdents+0x5f/0x9c
[c0403eff] syscall_call+0x7/0xb
===
Code: 5d c3 b9 01 00 00 00 31 d2 6a 00 b8 f0 5a 41 c0 e8 2a ff ff ff fa 
e8 52 16 00 00 fb 58 c3 b8 54 3a 66 c0 e9 8e 6b 1e 00 53 89 c3 0f a3 
05 60 1f 6d c0 19 c0 85 c0 75 27 e8 bf db 00 00 50 68 55 
EIP: [c0415974] smp_send_reschedule+0x3/0x53 SS:ESP 0068:f4f2fc8c

0Kernel panic - not syncing: Fatal exception

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Panic with XFS on RHEL5 (2.6.18-8.1.8.el5)

2007-08-18 Thread Måns Rullgård
Chris Boot [EMAIL PROTECTED] writes:

 All,

 I've got a box running RHEL5 and haven't been impressed by ext3
 performance on it (running of a 1.5TB HP MSA20 using the cciss
 driver). I compiled XFS as a module and tried it out since I'm used to
 using it on Debian, which runs much more efficiently. However, every
 so often the kernel panics as below. Apologies for the tainted kernel,
 but we run VMware Server on the box as well.

 Does anyone have any hits/tips for using XFS on Red Hat? What's
 causing the panic below, and is there a way around this?

 BUG: unable to handle kernel paging request at virtual address b8af9d60
 printing eip:
 c0415974
 *pde = 
 Oops:  [#1]
 SMP last sysfs file: /block/loop7/dev
 Modules linked in: loop nfsd exportfs lockd nfs_acl iscsi_trgt(U)
 autofs4 hidp nls_utf8 cifs ppdev rfcomm l2cap bluetooth vmnet(U)
 vmmon(U) sunrpc ipv6 xfs(U) video sbs i2c_ec button battery asus_acpi
 ac lp st sg floppy serio_raw intel_rng pcspkr e100 mii e7xxx_edac
 i2c_i801 edac_mc i2c_core e1000 r8169 ide_cd cdrom parport_pc parport
 dm_snapshot dm_zero dm_mirror dm_mod cciss mptspi mptscsih
 scsi_transport_spi sd_mod scsi_mod mptbase ext3 jbd ehci_hcd ohci_hcd
 uhci_hcd
 CPU:1
 EIP:0060:[c0415974]Tainted: P  VLI
 EFLAGS: 00010046   (2.6.18-8.1.8.el5 #1) EIP is at
 smp_send_reschedule+0x3/0x53
 eax: c213f000   ebx: c213f000   ecx: eef84000   edx: c213f000
 esi: 1086   edi: f668c000   ebp: f4f2fce8   esp: f4f2fc8c
 ds: 007b   es: 007b   ss: 0068
 Process crond (pid: 3146, ti=f4f2f000 task=f51faaa0 task.ti=f4f2f000)
 Stack: 66d66b89 c041dc23  a9afbb0e fea5 01904500 
 000f  0001 0001 c200c6e0 0100 
 0069 0180 018fc500 c200d240 0003 0292 f601efc0
 f6027e00  0050 Call Trace:
 [c041dc23] try_to_wake_up+0x351/0x37b
 [f936884e] xfsbufd_wakeup+0x28/0x49 [xfs]
 [c04572f9] shrink_slab+0x56/0x13c
 [c0457c0c] try_to_free_pages+0x162/0x23e
 [c0454064] __alloc_pages+0x18d/0x27e
 [c045214e] find_or_create_page+0x53/0x8c
 [c046c7b1] __getblk+0x162/0x270
 [c0475be0] do_lookup+0x53/0x157
 [f889138f] ext3_getblk+0x7c/0x233 [ext3]
 [f88913fe] ext3_getblk+0xeb/0x233 [ext3]
 [c048215c] mntput_no_expire+0x11/0x6a
 [f889226e] ext3_bread+0x13/0x69 [ext3]
 [f8895606] htree_dirblock_to_tree+0x22/0x113 [ext3]
 [f889574f] ext3_htree_fill_tree+0x58/0x1a0 [ext3]
 [c047828b] do_path_lookup+0x20e/0x25f
 [c046b987] get_empty_filp+0x99/0x15e
 [f889d611] ext3_permission+0x0/0xa [ext3]
 [f888eaa3] ext3_readdir+0x1ce/0x59b [ext3]
 [c047a0dd] filldir+0x0/0xb9
 [c0472973] sys_fstat64+0x1e/0x23
 [c047a1f9] vfs_readdir+0x63/0x8d
 [c047a0dd] filldir+0x0/0xb9
 [c047a447] sys_getdents+0x5f/0x9c
 [c0403eff] syscall_call+0x7/0xb
 ===

Your Redhat kernel is probably built with 4k stacks and XFS+loop+ext3
seems to be enough to overflow it.

-- 
Måns Rullgård
[EMAIL PROTECTED]

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Panic with XFS on RHEL5 (2.6.18-8.1.8.el5)

2007-08-18 Thread Jan Engelhardt

On Aug 18 2007 13:31, Måns Rullgård wrote:

 BUG: unable to handle kernel paging request at virtual address b8af9d60
 printing eip:
 c0415974
 *pde = 
 Oops:  [#1]
 SMP last sysfs file: /block/loop7/dev
 Modules linked in: loop nfsd exportfs lockd nfs_acl iscsi_trgt(U)
 autofs4 hidp nls_utf8 cifs ppdev rfcomm l2cap bluetooth vmnet(U)
 vmmon(U) sunrpc ipv6 xfs(U) video sbs i2c_ec button battery asus_acpi
 ac lp st sg floppy serio_raw intel_rng pcspkr e100 mii e7xxx_edac
 i2c_i801 edac_mc i2c_core e1000 r8169 ide_cd cdrom parport_pc parport
 dm_snapshot dm_zero dm_mirror dm_mod cciss mptspi mptscsih
 scsi_transport_spi sd_mod scsi_mod mptbase ext3 jbd ehci_hcd ohci_hcd
 uhci_hcd
 CPU:1
 EIP:0060:[c0415974]Tainted: P  VLI
 EFLAGS: 00010046   (2.6.18-8.1.8.el5 #1) EIP is at
 smp_send_reschedule+0x3/0x53
 eax: c213f000   ebx: c213f000   ecx: eef84000   edx: c213f000
 esi: 1086   edi: f668c000   ebp: f4f2fce8   esp: f4f2fc8c
 ds: 007b   es: 007b   ss: 0068
 Process crond (pid: 3146, ti=f4f2f000 task=f51faaa0 task.ti=f4f2f000)
 Stack: 66d66b89 c041dc23  a9afbb0e fea5 01904500 
 000f  0001 0001 c200c6e0 0100 
 0069 0180 018fc500 c200d240 0003 0292 f601efc0
 f6027e00  0050 Call Trace:
 [c041dc23] try_to_wake_up+0x351/0x37b
 [f936884e] xfsbufd_wakeup+0x28/0x49 [xfs]
 [c04572f9] shrink_slab+0x56/0x13c
[...]

Your Redhat kernel is probably built with 4k stacks and XFS+loop+ext3
seems to be enough to overflow it.

I think we should include the vermagic string in oopses too, 
so that the flags SMP, PREEMPT, RT, 4KSTACKS, mod_unload, etc. are shown 
and the situation is a bit more apparent.



Jan
-- 

Re: Panic with XFS on RHEL5 (2.6.18-8.1.8.el5)

2007-08-18 Thread Chris Boot

Måns Rullgård wrote:

Chris Boot [EMAIL PROTECTED] writes:

  

All,

I've got a box running RHEL5 and haven't been impressed by ext3
performance on it (running of a 1.5TB HP MSA20 using the cciss
driver). I compiled XFS as a module and tried it out since I'm used to
using it on Debian, which runs much more efficiently. However, every
so often the kernel panics as below. Apologies for the tainted kernel,
but we run VMware Server on the box as well.

Does anyone have any hits/tips for using XFS on Red Hat? What's
causing the panic below, and is there a way around this?

BUG: unable to handle kernel paging request at virtual address b8af9d60
printing eip:
c0415974
*pde = 
Oops:  [#1]
SMP last sysfs file: /block/loop7/dev
Modules linked in: loop nfsd exportfs lockd nfs_acl iscsi_trgt(U)
autofs4 hidp nls_utf8 cifs ppdev rfcomm l2cap bluetooth vmnet(U)
vmmon(U) sunrpc ipv6 xfs(U) video sbs i2c_ec button battery asus_acpi
ac lp st sg floppy serio_raw intel_rng pcspkr e100 mii e7xxx_edac
i2c_i801 edac_mc i2c_core e1000 r8169 ide_cd cdrom parport_pc parport
dm_snapshot dm_zero dm_mirror dm_mod cciss mptspi mptscsih
scsi_transport_spi sd_mod scsi_mod mptbase ext3 jbd ehci_hcd ohci_hcd
uhci_hcd
CPU:1
EIP:0060:[c0415974]Tainted: P  VLI
EFLAGS: 00010046   (2.6.18-8.1.8.el5 #1) EIP is at
smp_send_reschedule+0x3/0x53
eax: c213f000   ebx: c213f000   ecx: eef84000   edx: c213f000
esi: 1086   edi: f668c000   ebp: f4f2fce8   esp: f4f2fc8c
ds: 007b   es: 007b   ss: 0068
Process crond (pid: 3146, ti=f4f2f000 task=f51faaa0 task.ti=f4f2f000)
Stack: 66d66b89 c041dc23  a9afbb0e fea5 01904500 
000f  0001 0001 c200c6e0 0100 
0069 0180 018fc500 c200d240 0003 0292 f601efc0
f6027e00  0050 Call Trace:
[c041dc23] try_to_wake_up+0x351/0x37b
[f936884e] xfsbufd_wakeup+0x28/0x49 [xfs]
[c04572f9] shrink_slab+0x56/0x13c
[c0457c0c] try_to_free_pages+0x162/0x23e
[c0454064] __alloc_pages+0x18d/0x27e
[c045214e] find_or_create_page+0x53/0x8c
[c046c7b1] __getblk+0x162/0x270
[c0475be0] do_lookup+0x53/0x157
[f889138f] ext3_getblk+0x7c/0x233 [ext3]
[f88913fe] ext3_getblk+0xeb/0x233 [ext3]
[c048215c] mntput_no_expire+0x11/0x6a
[f889226e] ext3_bread+0x13/0x69 [ext3]
[f8895606] htree_dirblock_to_tree+0x22/0x113 [ext3]
[f889574f] ext3_htree_fill_tree+0x58/0x1a0 [ext3]
[c047828b] do_path_lookup+0x20e/0x25f
[c046b987] get_empty_filp+0x99/0x15e
[f889d611] ext3_permission+0x0/0xa [ext3]
[f888eaa3] ext3_readdir+0x1ce/0x59b [ext3]
[c047a0dd] filldir+0x0/0xb9
[c0472973] sys_fstat64+0x1e/0x23
[c047a1f9] vfs_readdir+0x63/0x8d
[c047a0dd] filldir+0x0/0xb9
[c047a447] sys_getdents+0x5f/0x9c
[c0403eff] syscall_call+0x7/0xb
===



Your Redhat kernel is probably built with 4k stacks and XFS+loop+ext3
seems to be enough to overflow it.
  
Thanks, that explains a lot. However, I don't have any XFS filesystems 
mounted over loop devices on ext3. Earlier in the day I had iso9660 on 
loop on xfs, could that have caused the issue? It was unmounted and 
deleted when this panic occurred.


I'll probably just try and recompile the kernel with 8k stacks and see 
how it goes. Screw the support, we're unlikely to get it anyway. :-P


Many thanks,
Chris

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Panic with XFS on RHEL5 (2.6.18-8.1.8.el5)

2007-08-18 Thread Måns Rullgård
Chris Boot [EMAIL PROTECTED] writes:

 Måns Rullgård wrote:
 Chris Boot [EMAIL PROTECTED] writes:


 All,

 I've got a box running RHEL5 and haven't been impressed by ext3
 performance on it (running of a 1.5TB HP MSA20 using the cciss
 driver). I compiled XFS as a module and tried it out since I'm used to
 using it on Debian, which runs much more efficiently. However, every
 so often the kernel panics as below. Apologies for the tainted kernel,
 but we run VMware Server on the box as well.

 Does anyone have any hits/tips for using XFS on Red Hat? What's
 causing the panic below, and is there a way around this?

 BUG: unable to handle kernel paging request at virtual address b8af9d60
 printing eip:
 c0415974
 *pde = 
 Oops:  [#1]
 SMP last sysfs file: /block/loop7/dev
[...]
 [f936884e] xfsbufd_wakeup+0x28/0x49 [xfs]
 [c04572f9] shrink_slab+0x56/0x13c
 [c0457c0c] try_to_free_pages+0x162/0x23e
 [c0454064] __alloc_pages+0x18d/0x27e
 [c045214e] find_or_create_page+0x53/0x8c
 [c046c7b1] __getblk+0x162/0x270
 [c0475be0] do_lookup+0x53/0x157
 [f889138f] ext3_getblk+0x7c/0x233 [ext3]
 [f88913fe] ext3_getblk+0xeb/0x233 [ext3]
 [c048215c] mntput_no_expire+0x11/0x6a
 [f889226e] ext3_bread+0x13/0x69 [ext3]
 [f8895606] htree_dirblock_to_tree+0x22/0x113 [ext3]
 [f889574f] ext3_htree_fill_tree+0x58/0x1a0 [ext3]
 [c047828b] do_path_lookup+0x20e/0x25f
 [c046b987] get_empty_filp+0x99/0x15e
 [f889d611] ext3_permission+0x0/0xa [ext3]
 [f888eaa3] ext3_readdir+0x1ce/0x59b [ext3]
 [c047a0dd] filldir+0x0/0xb9
 [c0472973] sys_fstat64+0x1e/0x23
 [c047a1f9] vfs_readdir+0x63/0x8d
 [c047a0dd] filldir+0x0/0xb9
 [c047a447] sys_getdents+0x5f/0x9c
 [c0403eff] syscall_call+0x7/0xb
 ===


 Your Redhat kernel is probably built with 4k stacks and XFS+loop+ext3
 seems to be enough to overflow it.

 Thanks, that explains a lot. However, I don't have any XFS filesystems
 mounted over loop devices on ext3. Earlier in the day I had iso9660 on
 loop on xfs, could that have caused the issue? It was unmounted and
 deleted when this panic occurred.

The mention of /block/loop7/dev and the presence of both XFS and ext3
functions in the call stack suggested to me that you might have an ext3
filesystem in a loop device on XFS.  I see no explanation for that
call stack other than a stack overflow, but either way we're still back
at the same root cause.

Are you using device-mapper and/or md?  They too are known to blow 4k
stacks when used with XFS.
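
For what it's worth, the worst static stack consumers can be listed from a built source tree (a sketch; `make checkstack` exists in mainline 2.6):

```shell
# Disassembles vmlinux and ranks functions by static stack usage,
# largest first -- XFS entries tend to show up near the top.
make checkstack | head -20
```

Building with CONFIG_DEBUG_STACK_USAGE=y additionally makes sysrq-t report each task's stack low-water mark at runtime.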

 I'll probably just try and recompile the kernel with 8k stacks and see
 how it goes. Screw the support, we're unlikely to get it anyway. :-P

Please report how this works out.

-- 
Måns Rullgård
[EMAIL PROTECTED]



Re: Panic with XFS on RHEL5 (2.6.18-8.1.8.el5)

2007-08-18 Thread Chris Boot

Måns Rullgård wrote:

Chris Boot [EMAIL PROTECTED] writes:

  

Måns Rullgård wrote:


Chris Boot [EMAIL PROTECTED] writes:


  

All,

I've got a box running RHEL5 and haven't been impressed by ext3
performance on it (running off a 1.5TB HP MSA20 using the cciss
driver). I compiled XFS as a module and tried it out since I'm used to
using it on Debian, which runs much more efficiently. However, every
so often the kernel panics as below. Apologies for the tainted kernel,
but we run VMware Server on the box as well.

Does anyone have any hints/tips for using XFS on Red Hat? What's
causing the panic below, and is there a way around this?

BUG: unable to handle kernel paging request at virtual address b8af9d60
printing eip:
c0415974
*pde = 
Oops:  [#1]
SMP last sysfs file: /block/loop7/dev


[...]
  

[f936884e] xfsbufd_wakeup+0x28/0x49 [xfs]
[c04572f9] shrink_slab+0x56/0x13c
[c0457c0c] try_to_free_pages+0x162/0x23e
[c0454064] __alloc_pages+0x18d/0x27e
[c045214e] find_or_create_page+0x53/0x8c
[c046c7b1] __getblk+0x162/0x270
[c0475be0] do_lookup+0x53/0x157
[f889138f] ext3_getblk+0x7c/0x233 [ext3]
[f88913fe] ext3_getblk+0xeb/0x233 [ext3]
[c048215c] mntput_no_expire+0x11/0x6a
[f889226e] ext3_bread+0x13/0x69 [ext3]
[f8895606] htree_dirblock_to_tree+0x22/0x113 [ext3]
[f889574f] ext3_htree_fill_tree+0x58/0x1a0 [ext3]
[c047828b] do_path_lookup+0x20e/0x25f
[c046b987] get_empty_filp+0x99/0x15e
[f889d611] ext3_permission+0x0/0xa [ext3]
[f888eaa3] ext3_readdir+0x1ce/0x59b [ext3]
[c047a0dd] filldir+0x0/0xb9
[c0472973] sys_fstat64+0x1e/0x23
[c047a1f9] vfs_readdir+0x63/0x8d
[c047a0dd] filldir+0x0/0xb9
[c047a447] sys_getdents+0x5f/0x9c
[c0403eff] syscall_call+0x7/0xb
===



Your Redhat kernel is probably built with 4k stacks and XFS+loop+ext3
seems to be enough to overflow it.

  

Thanks, that explains a lot. However, I don't have any XFS filesystems
mounted over loop devices on ext3. Earlier in the day I had iso9660 on
loop on xfs, could that have caused the issue? It was unmounted and
deleted when this panic occurred.



The mention of /block/loop7/dev and the presence of both XFS and ext3
functions in the call stack suggested to me that you might have an ext3
filesystem in a loop device on XFS.  I see no explanation for that
call stack other than a stack overflow, but either way we're still back
at the same root cause.

Are you using device-mapper and/or md?  They too are known to blow 4k
stacks when used with XFS.
  


I am. The situation earlier on was iso9660 on loop on xfs on lvm on 
cciss. I guess that might have smashed the stack undetectably and 
induced the corruption encountered later on? When I experienced this panic 
the machine would have probably been performing a backup, which was 
simply a load of ext3/xfs filesystems on lvm on the HP cciss controller. 
None of the loop devices would have been mounted.


I have a few machines now with 4k stacks and using lvm + md + xfs and 
have no trouble at all, but none are Red Hat (all Debian) and none use 
cciss either. Maybe it's a deadly combination.



I'll probably just try and recompile the kernel with 8k stacks and see
how it goes. Screw the support, we're unlikely to get it anyway. :-P



Please report how this works out.
  


I will. This will probably be on Monday now, since the machine isn't 
accepting SysRq requests over the serial console. :-(


Many thanks,
Chris



Re: Panic with XFS on RHEL5 (2.6.18-8.1.8.el5)

2007-08-18 Thread Jan Engelhardt

On Aug 18 2007 17:28, Chris Boot wrote:

 I will. This will probably be on Monday now, since the machine isn't
 accepting SysRq requests over the serial console. :-(

Ah yeah, stupid null-modem cables!
You can also trigger SysRq from /proc/sysrq-trigger (well, as long
as the system is still alive).

Jan
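
For the archive, the procfs interface mentioned above looks like this (needs root; the letters match the usual SysRq keys):

```shell
cat /proc/sys/kernel/sysrq        # 0 = disabled, 1 = all functions allowed
echo 1 > /proc/sys/kernel/sysrq   # enable (as root)
echo t > /proc/sysrq-trigger      # 't': dump task states to the kernel log
dmesg | tail                      # read the result
```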