On 2/21/07, Alastair McKinstry <[EMAIL PROTECTED]> wrote:

 Hi,


While testing 1.5.97 on kernel 2.6.18, I had the following Oops on a client.
Has this been seen before?



Unable to handle kernel paging request at ffff88000b17fff8 RIP:
 [<ffffffff8843951a>] :osc:osc_brw_prep_request+0x3e3/0x9e2
PGD 1675067 PUD 1676067 PMD 16cf067 PTE 0
Oops: 0000 [1] SMP
CPU 0
Modules linked in: xt_tcpudp xt_physdev iptable_filter ip_tables x_tables
osc mgc lustre lov lquota mdc ksocklnd ptlrpc obdclass lnet lvfs libcfs
bridge netloop ipv6 dm_snapshot dm_mirror dm_mod usbhid usbkbd ipmi_watchdog
ipmi_devintf ipmi_poweroff ipmi_si ipmi_msghandler dummy loop ide_generic
ide_disk evdev psmouse shpchp pci_hotplug pcspkr serio_raw i2c_piix4
i2c_core ext3 jbd mbcache sd_mod ide_cd cdrom sata_svw libata scsi_mod tg3
ehci_hcd generic ohci_hcd serverworks ide_core fan
Pid: 3638, comm: cat Not tainted 2.6.18-3-xen-amd64 #1
RIP: e030:[<ffffffff8843951a>] [<ffffffff8843951a>] :osc:osc_brw_prep_request+0x3e3/0x9e2
RSP: e02b:ffff88000fbf1798 EFLAGS: 00010287
RAX: 0000000000000001 RBX: ffff88000eb79ce0 RCX: 0000000000000000
RDX: ffff88000b180000 RSI: ffff88000b180000 RDI: ffff880026c53cb0
RBP: ffff88005c6e4170 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff88005c6e4088
R13: ffff88005c701a00 R14: ffff88005c58e498 R15: 0000000000000000
FS: 00002b0e147c86d0(0000) GS:ffffffff804c3000(0000) knlGS:0000000000000000
CS: e033 DS: 0000 ES: 0000
Process cat (pid: 3638, threadinfo ffff88000fbf0000, task ffff8800626cc7f0)
Stack: ffff88000b180000 00000100802a6c2b ffff880026c53cb0 ffff88005c58e498
 0000000000000001 0000000000000010 0000001062948240 ffff880056946000
 0000000102a13e28 0000000400000000
Call Trace:
 [<ffffffff8843d611>] :osc:osc_send_oap_rpc+0x80a/0xe6d
 [<ffffffff8843ddc0>] :osc:osc_check_rpcs+0x14c/0x29b
 [<ffffffff88445751>] :osc:osc_queue_async_io+0xc2b/0xd0a
 [<ffffffff8820e41e>] :libcfs:libcfs_debug_vmsg2+0x600/0x897
 [<ffffffff883a0d38>] :lov:lov_queue_async_io+0x2f1/0x3a1
 [<ffffffff883f996f>] :lustre:queue_or_sync_write+0x2b6/0xc43
 [<ffffffff883fcc99>] :lustre:ll_commit_write+0x269/0x5df
 [<ffffffff80210ba1>] generic_file_buffered_write+0x438/0x646
 [<ffffffff883b742f>] :lov:lov_update_enqueue_set+0x345/0x3a1
 [<ffffffff8020f13c>] current_fs_time+0x3b/0x40
 [<ffffffff884424b5>] :osc:osc_enqueue+0xfb/0x48e
 [<ffffffff8021646f>] __generic_file_aio_write_nolock+0x2e4/0x32f
 [<ffffffff883ef54b>] :lustre:ll_inode_size_unlock+0x81/0xd6
 [<ffffffff802a2809>] __generic_file_write_nolock+0x8f/0xa8
 [<ffffffff882e4886>] :ptlrpc:ldlm_completion_ast+0x0/0x5e2
 [<ffffffff80290415>] autoremove_wake_function+0x0/0x2e
 [<ffffffff88408367>] :lustre:lt_get_mmap_locks+0x2b1/0x3a5
 [<ffffffff8024529d>] generic_file_write+0x49/0xa7
 [<ffffffff883ea8f3>] :lustre:ll_file_write+0x669/0x7e7
 [<ffffffff80216b9b>] vfs_write+0xce/0x174
 [<ffffffff802173be>] sys_write+0x45/0x6e
 [<ffffffff8025c81a>] system_call+0x86/0x8b
 [<ffffffff8025c794>] system_call+0x0/0x8b


Note: the client was running on a Xen dom0 with fairly small memory, so
Lustre is not necessarily at fault.
Further details on request.

Yeah, I've found running Lustre in Xen is kind of difficult when trying to
partition out the memory. I use at least 512M for all Lustre-involved
components of my Xen/Lustre systems.
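For reference, a minimal Xen domain config sketch along those lines (the domain name, kernel path, and disk path here are placeholders for illustration, not taken from this thread) would pin the memory allocation like so:

```
# Hypothetical Xen domU config for a Lustre client; paths/names are examples.
name    = "lustre-client"
memory  = 512          # at least 512M for Lustre-involved domains, per above
kernel  = "/boot/vmlinuz-2.6.18-3-xen-amd64"
disk    = [ 'phy:/dev/vg0/lustre-client,xvda,w' ]
vif     = [ 'bridge=xenbr0' ]
```

Dom0 memory can likewise be reserved at boot with a hypervisor option such as `dom0_mem=512M`, so the balloon driver does not shrink it under the amount Lustre needs.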

- David Brown

_______________________________________________
Lustre-discuss mailing list
[email protected]
https://mail.clusterfs.com/mailman/listinfo/lustre-discuss
