At 23:36 on Monday, 21 November 2005, Dave Kleikamp wrote:
>
> A very similar problem was reported before, but I never did figure out
> what the cause was:
> http://www.mail-archive.com/[email protected]/msg00250.html
>
> I run gentoo myself and haven't seen it firsthand. I've tried
> remounting between read-only & read-write under a number of different
> circumstances, but I still can't recreate the problem.
>
> What modules do you have loaded? That seemed to make a difference.
Well, the list of loaded modules is in the report below. I've also noticed
that a similar oops (or BUG) often happens at boot, so it may be related to
log replay. On this machine I change kernels quite often (at least at every
-mm release), and I've noticed that a reboot is a very "dangerous" moment for
my jfs volumes. For example: several reboots with rc4-mm1, no problems. I
compiled rc4-mm2 and booted into a BUG. I rebooted back into rc4-mm1: BUG
again. There is no pattern that I can find, but it seems related to some
strange state of the written data that confuses the jfs code.
The last time, the problem did not occur during boot but while unmerging
after a glibc merge (so after heavy disk load): the unmerge got stuck
executing rm (D state), and after the reboot all files related to glibc were
corrupted or damaged (even the lib*so symlinks). Maybe jfs was hit hard and
was unable to write the data out correctly.
This happens with the -mm series kernels; I have no data for vanilla kernels.
Today I found the following BUG in the logs of the same machine, but the
system is still up and running (more or less :) ), so if needed I can
collect more data.
------------[ cut here ]------------
kernel BUG at fs/jfs/jfs_logmgr.c:1622!
invalid operand: 0000 [#1]
PREEMPT SMP
last sysfs file: /class/net/tun1/ifindex
Modules linked in: iptable_nat ip_nat ipt_state ip_conntrack ipt_REJECT
iptable_filter ip_tables parport_pc parport floppy pcspkr rtc ohci1394
ieee1394 i2c_i801 xfs nvidia tun crc32 rfcomm l2cap bluetooth snd_pcm_oss
snd_mixer_oss snd_emu10k1_synth snd_emux_synth snd_seq_virmidi
snd_seq_midi_event snd_seq_midi_emul snd_seq snd_emu10k1 snd_rawmidi
snd_seq_device snd_ac97_codec snd_pcm snd_timer snd_ac97_bus snd_page_alloc
snd_util_mem snd_hwdep snd soundcore ide_cd i2c_dev w83627hf hwmon_vid
i2c_isa uhci_hcd ehci_hcd usbcore isofs zlib_inflate e1000
CPU: 0
EIP: 0060:[<c01eeae6>] Tainted: P VLI
EFLAGS: 00010292 (2.6.15-rc1-mm2)
EIP is at jfs_flush_journal+0x1dc/0x27e
eax: 00000044 ebx: 000000c8 ecx: 00000001 edx: 00000202
esi: f7d72c34 edi: f7707e7c ebp: f7d72b80 esp: f7707e44
ds: 007b es: 007b ss: 0068
Process mount (pid: 2408, threadinfo=f7706000 task=c193e030)
Stack: c0349a6e c0349e97 00000656 c0349e7e f7d72c04 f7d72c18 f880e110 f880e0c8
00000002 00000000 c193e030 c0117a34 00000000 00000000 c1906bac 00000000
f7d12a80 00000000 c193e030 c0117a34 00100100 00200200 00000000 00000000
Call Trace:
[<c0117a34>] default_wake_function+0x0/0xc
[<c0117a34>] default_wake_function+0x0/0xc
[<c01d746f>] jfs_umount_rw+0x23/0x6c
[<c01d3789>] jfs_remount+0x14f/0x17e
[<c01607a4>] do_remount_sb+0xc1/0x13e
[<c01751de>] do_remount+0x82/0xd5
[<c0175b50>] do_mount+0x1c0/0x1fc
[<c013ff27>] __alloc_pages+0x45/0x28b
[<c01758af>] exact_copy_from_user+0x26/0x5b
[<c017593a>] copy_mount_options+0x56/0xac
[<c0175ecf>] sys_mount+0x79/0xb5
[<c0102c3f>] sysenter_past_esp+0x54/0x75
Code: ff ff e9 95 fe ff ff c7 44 24 0c 7e 9e 34 c0 c7 44 24 08 56 06 00 00 c7
44 24 04 97 9e 34 c0 c7 04 24 6e 9a 34 c0 e8 ea de f2 ff <0f> 0b 56 06 97 9e
34 c0 e9 4f ff ff ff 8d 54 24 24 fc b9 05 00
Badness in do_exit at kernel/exit.c:795
[<c011f0f4>] do_exit+0x42f/0x434
[<c0103f53>] do_divide_error+0x0/0x96
[<c0104145>] do_invalid_op+0x0/0x99
[<c01041d5>] do_invalid_op+0x90/0x99
[<c01eeae6>] jfs_flush_journal+0x1dc/0x27e
[<c011cd96>] release_console_sem+0xb5/0xb7
[<c011cb95>] vprintk+0x1aa/0x285
[<c0125347>] try_to_del_timer_sync+0x45/0x4d
[<c0125359>] del_timer_sync+0xa/0x10
[<c0103837>] error_code+0x4f/0x54
[<c01eeae6>] jfs_flush_journal+0x1dc/0x27e
[<c0117a34>] default_wake_function+0x0/0xc
[<c0117a34>] default_wake_function+0x0/0xc
[<c01d746f>] jfs_umount_rw+0x23/0x6c
[<c01d3789>] jfs_remount+0x14f/0x17e
[<c01607a4>] do_remount_sb+0xc1/0x13e
[<c01751de>] do_remount+0x82/0xd5
[<c0175b50>] do_mount+0x1c0/0x1fc
[<c013ff27>] __alloc_pages+0x45/0x28b
[<c01758af>] exact_copy_from_user+0x26/0x5b
[<c017593a>] copy_mount_options+0x56/0xac
[<c0175ecf>] sys_mount+0x79/0xb5
[<c0102c3f>] sysenter_past_esp+0x54/0x75
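As an aside, the call path can be pulled out of a trace like the one above
with a few lines of Python. This is only a rough sketch: the regular
expression is my assumption about the trace line format used in this report
("[<address>] symbol+0xoffset/0xsize"), and call_chain is a hypothetical
helper name, not part of any existing tool.

```python
import re

# Assumed trace line format, e.g. " [<c01d746f>] jfs_umount_rw+0x23/0x6c"
TRACE_RE = re.compile(r"\[<[0-9a-f]+>\]\s+(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")

def call_chain(oops_text):
    """Return the symbol names from the trace lines, in the order printed
    (innermost frame first)."""
    return [m.group(1) for m in TRACE_RE.finditer(oops_text)]

# A few lines copied from the trace in this report.
sample = """\
 [<c01d746f>] jfs_umount_rw+0x23/0x6c
 [<c01d3789>] jfs_remount+0x14f/0x17e
 [<c01607a4>] do_remount_sb+0xc1/0x13e
 [<c01751de>] do_remount+0x82/0xd5
 [<c0175b50>] do_mount+0x1c0/0x1fc
"""

print(" <- ".join(call_chain(sample)))
# prints: jfs_umount_rw <- jfs_remount <- do_remount_sb <- do_remount <- do_mount
```

Read right to left, that is a mount call going through the remount path into
jfs_umount_rw, which is where the journal flush is triggered.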
Modules loaded during previous BUG:
iptable_nat
ip_nat
ipt_state
ip_conntrack
ipt_REJECT
iptable_filter
ip_tables
parport_pc
parport
floppy
pcspkr
rtc
ohci1394
ieee1394
i2c_i801
xfs
nvidia
tun
crc32
rfcomm
l2cap
bluetooth
snd_pcm_oss
snd_mixer_oss
snd_emu10k1_synth
snd_emux_synth
snd_seq_virmidi
snd_seq_midi_event
snd_seq_midi_emul
snd_seq
snd_emu10k1
snd_rawmidi
snd_seq_device
snd_ac97_codec
snd_pcm
snd_timer
snd_ac97_bus
snd_page_alloc
snd_util_mem
snd_hwdep
snd
soundcore
ide_cd
i2c_dev
w83627hf
hwmon_vid
i2c_isa
uhci_hcd
ehci_hcd
usbcore
isofs
zlib_inflate
e1000
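Since the loaded modules seemed to make a difference, comparing the module
lists of two reports may help. Below is a minimal sketch, assuming the
"Modules linked in:" line wraps over several lines and ends at the "CPU:"
line, as it does in this report. The function name modules_linked_in and the
second module set are made up for illustration only.

```python
def modules_linked_in(oops_text):
    """Collect module names from the 'Modules linked in:' line and its
    wrapped continuation lines (assumed to end at the 'CPU:' line)."""
    names = []
    collecting = False
    for line in oops_text.splitlines():
        line = line.lstrip("> ").strip()  # tolerate quoted mail text
        if line.startswith("Modules linked in:"):
            collecting = True
            line = line[len("Modules linked in:"):]
        elif collecting and line.startswith("CPU:"):
            break
        if collecting:
            names.extend(line.split())
    return names

# Shortened sample in the same layout as the report above.
sample = """\
Modules linked in: iptable_nat ip_nat xfs
nvidia tun e1000
CPU: 0
"""

first = set(modules_linked_in(sample))
# Hypothetical module set from another report, for comparison only.
second = {"iptable_nat", "ip_nat", "xfs", "tun", "e1000"}

print(sorted(first - second))  # modules present only in the first report
# prints: ['nvidia']
```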
>
> > At boot I've got this:
> >
> > BUG at fs/jfs/jfs_logmgr.c:1622 assert(list_empty(&log->cqueue))
> > ------------[ cut here ]------------
> > kernel BUG at fs/jfs/jfs_logmgr.c:1622!
> > invalid operand: 0000 [#1]
> > PREEMPT SMP
> > last sysfs file: /class/net/tun1/ifindex
> > Modules linked in: iptable_nat ip_nat ipt_state ip_conntrack ipt_REJECT
> > iptable_filter ip_tables parport_pc parport floppy pcspkr rtc ohci1394
> > ieee1394 i2c_i801 xfs nvidia tun crc32 rfcomm l2cap bluetooth snd_pcm_oss
> > snd_mixer_oss snd_emu10k1_synth snd_emux_synth snd_seq_virmidi
> > snd_seq_midi_event snd_seq_midi_emul snd_seq snd_emu10k1 snd_rawmidi
> > snd_seq_device snd_ac97_codec snd_pcm snd_timer snd_ac97_bus
> > snd_page_alloc snd_util_mem snd_hwdep snd soundcore ide_cd i2c_dev
> > w83627hf hwmon_vid i2c_isa uhci_hcd ehci_hcd usbcore isofs zlib_inflate
> > e1000
> > CPU: 0
> > EIP: 0060:[<c01eeae6>] Tainted: P VLI
> > EFLAGS: 00010292 (2.6.15-rc1-mm2)
> > EIP is at jfs_flush_journal+0x1dc/0x27e
> > eax: 00000044 ebx: 000000c8 ecx: 00000001 edx: 00000202
> > esi: f7d72c34 edi: f7707e7c ebp: f7d72b80 esp: f7707e44
> > ds: 007b es: 007b ss: 0068
> > Process mount (pid: 2408, threadinfo=f7706000 task=c193e030)
> > Stack: c0349a6e c0349e97 00000656 c0349e7e f7d72c04 f7d72c18 f880e110 f880e0c8
> > 00000002 00000000 c193e030 c0117a34 00000000 00000000 c1906bac 00000000
> > f7d12a80 00000000 c193e030 c0117a34 00100100 00200200 00000000 00000000
> > Call Trace:
> > [<c0117a34>] default_wake_function+0x0/0xc
> > [<c0117a34>] default_wake_function+0x0/0xc
> > [<c01d746f>] jfs_umount_rw+0x23/0x6c
> > [<c01d3789>] jfs_remount+0x14f/0x17e
> > [<c01607a4>] do_remount_sb+0xc1/0x13e
> > [<c01751de>] do_remount+0x82/0xd5
> > [<c0175b50>] do_mount+0x1c0/0x1fc
> > [<c013ff27>] __alloc_pages+0x45/0x28b
> > [<c01758af>] exact_copy_from_user+0x26/0x5b
> > [<c017593a>] copy_mount_options+0x56/0xac
> > [<c0175ecf>] sys_mount+0x79/0xb5
> > [<c0102c3f>] sysenter_past_esp+0x54/0x75
> > Code: ff ff e9 95 fe ff ff c7 44 24 0c 7e 9e 34 c0 c7 44 24 08 56 06 00
> > 00 c7 44 24 04 97 9e 34 c0 c7 04 24 6e 9a 34 c0 e8 ea de f2 ff <0f> 0b 56
> > 06 97 9e 34 c0 e9 4f ff ff ff 8d 54 24 24 fc b9 05 00
> > Badness in do_exit at kernel/exit.c:795
> > [<c011f0f4>] do_exit+0x42f/0x434
> > [<c0103f53>] do_divide_error+0x0/0x96
> > [<c0104145>] do_invalid_op+0x0/0x99
> > [<c01041d5>] do_invalid_op+0x90/0x99
> > [<c01eeae6>] jfs_flush_journal+0x1dc/0x27e
> > [<c011cd96>] release_console_sem+0xb5/0xb7
> > [<c011cb95>] vprintk+0x1aa/0x285
> > [<c0125347>] try_to_del_timer_sync+0x45/0x4d
> > [<c0125359>] del_timer_sync+0xa/0x10
> > [<c0103837>] error_code+0x4f/0x54
> > [<c01eeae6>] jfs_flush_journal+0x1dc/0x27e
> > [<c0117a34>] default_wake_function+0x0/0xc
> > [<c0117a34>] default_wake_function+0x0/0xc
> > [<c01d746f>] jfs_umount_rw+0x23/0x6c
>
> This confused me when the problem was first reported. The stack
> indicates that the partition is being mounted read-only after being
> mounted read-write. Normally, the root should be mounted read-only (if
> grub.conf is set up right) and after fsck is called, it should be
> remounted read-write. It would not normally be mounted back to
> read-only until the system is shut down.
>
> > [<c01d3789>] jfs_remount+0x14f/0x17e
> > [<c01607a4>] do_remount_sb+0xc1/0x13e
> > [<c01751de>] do_remount+0x82/0xd5
> > [<c0175b50>] do_mount+0x1c0/0x1fc
> > [<c013ff27>] __alloc_pages+0x45/0x28b
> > [<c01758af>] exact_copy_from_user+0x26/0x5b
> > [<c017593a>] copy_mount_options+0x56/0xac
> > [<c0175ecf>] sys_mount+0x79/0xb5
> > [<c0102c3f>] sysenter_past_esp+0x54/0x75
--
Fabio "Cova" Coatti http://members.ferrara.linux.it/cova
Ferrara Linux Users Group http://ferrara.linux.it
GnuPG fp:9765 A5B6 6843 17BC A646 BE8C FA56 373A 5374 C703
Old SysOps never die... they simply forget their password.
_______________________________________________
Jfs-discussion mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/jfs-discussion