Hi
Daniel wrote:
> Got an oops this morning on one of our production servers:
>
> SMP Box, stock 2.4.10 kernel all modules compiled in except ethernets.
> ksymoops output follows:
>
> ksymoops 2.4.0 on i686 2.4.10. Options used
> -v /usr/src/linux/vmlinux (specified)
> -k /proc/ksyms (default)
> -l /proc/modules (default)
> -o /lib/modules/2.4.10/ (default)
> -m /boot/System.map-2.4.10 (default)
>
> Unable to handle kernel NULL pointer dereference at virtual address 00000018
> c019fddf
> *pde = 00000000
> Oops: 0000
> CPU: 0
> EIP: 0010:[<c019fddf>]
> Using defaults from ksymoops -t elf32-i386 -a i386
> EFLAGS: 00010246
> eax: f7e74800 ebx: 00000000 ecx: 00000dd3 edx: 6100a443
> esi: 00000001 edi: f6e667b8 ebp: 0001edcc esp: db5b3a78
> ds: 0018 es: 0018 ss: 0018
> Process urchin (pid: 4067, stackpage=db5b3000)
> Stack: 00000000 f73952a0 00000079 c0183bde f7e74800 000067b8 c0183b7f
> f7e74800
> 00000000 00000001 db5b3e1c f7e74800 f6e667b8 f745b000 f73ee380
> f73e8000
> f6e667b8 00000000 cdbdd8dc 000001c8 c019913a db5b3e1c f6e667b8
> ffffffff
> Call Trace: [<c0183bde>] [<c0183b7f>] [<c019913a>] [<c0190756>]
> [<c0199c4a>]
> [<c019a322>] [<c018c14c>] [<c0128147>] [<c018d1c0>] [<c018d1cc>]
> [<c0126082>]
> [<c014da36>] [<c018d2e6>] [<c014dbce>] [<c0142c20>] [<c0135917>]
> [<c0144089>]
> [<c0136776>] [<c0136ac4>] [<c0106f1b>]
> Code: 8b 43 18 a9 00 00 10 00 0f 84 b3 ff ff ff 85 f6 74 0a f6 43
>
> >>EIP; c019fddf <reiserfs_prepare_for_journal+5f/90> <=====
> Trace; c0183bde <reiserfs_free_block+be/d0>
> Trace; c0183b7f <reiserfs_free_block+5f/d0>
> Trace; c019913a <prepare_for_delete_or_cut+71a/800>
> Trace; c0190756 <fix_nodes+226/450>
> Trace; c0199c4a <reiserfs_cut_from_item+ba/410>
> Trace; c019a322 <reiserfs_do_truncate+322/440>
> Trace; c018c14c <reiserfs_truncate_file+ac/1e0>
> Trace; c0128147 <truncate_inode_pages+67/80>
> Trace; c018d1c0 <reiserfs_vfs_truncate_file+0/10>
> Trace; c018d1cc <reiserfs_vfs_truncate_file+c/10>
> Trace; c0126082 <vmtruncate+172/1a0>
> Trace; c014da36 <inode_setattr+26/e0>
> Trace; c018d2e6 <reiserfs_setattr+56/60>
> Trace; c014dbce <notify_change+6e/f0>
> Trace; c0142c20 <cached_lookup+10/50>
> Trace; c0135917 <do_truncate+47/60>
> Trace; c0144089 <open_namei+419/540>
> Trace; c0136776 <filp_open+36/60>
> Trace; c0136ac4 <sys_open+34/c0>
> Trace; c0106f1b <system_call+33/38>
> Code; c019fddf <reiserfs_prepare_for_journal+5f/90>
> 00000000 <_EIP>:
> Code; c019fddf <reiserfs_prepare_for_journal+5f/90> <=====
> 0: 8b 43 18 mov 0x18(%ebx),%eax <=====
> Code; c019fde2 <reiserfs_prepare_for_journal+62/90>
> 3: a9 00 00 10 00 test $0x100000,%eax
> Code; c019fde7 <reiserfs_prepare_for_journal+67/90>
> 8: 0f 84 b3 ff ff ff je ffffffc1 <_EIP+0xffffffc1>
> c019fda0 <reiserfs_prepare_for_journal+20/90>
> Code; c019fded <reiserfs_prepare_for_journal+6d/90>
> e: 85 f6 test %esi,%esi
> Code; c019fdef <reiserfs_prepare_for_journal+6f/90>
> 10: 74 0a je 1c <_EIP+0x1c> c019fdfb
> <reiserfs_prepare_for_journal+7b/90>
> Code; c019fdf1 <reiserfs_prepare_for_journal+71/90>
> 12: f6 43 00 00 testb $0x0,0x0(%ebx)
>
> Any ideas?
>
From the dump you provided, it looks like reiserfs oopsed because it did not
expect an attempt to free a block that does not exist on the device. ebx is 0
at the faulting instruction, mov 0x18(%ebx),%eax, which is why the fault
address is 00000018.
This can happen if the file being deleted or truncated had corrupted pointers to
data blocks, so you should check your filesystem.
The attached patch should prevent the oops in the future.
Thanks,
vs
>
> --
> Daniel Bohling
> NewsFactor Network
--- bitmap.c.orig Tue Oct 9 11:45:11 2001
+++ bitmap.c Tue Oct 9 11:45:11 2001
@@ -123,6 +123,13 @@
   get_bit_address (s, block, &nr, &offset);
+  if (nr >= SB_BMAP_NR (s)) {
+    reiserfs_warning ("vs-4075: reiserfs_free_block: "
+                      "blocknr is out of device (%04x:%lu)[dev:blocknr]\n",
+                      s->s_dev, block);
+    return;
+  }
+
   /* mark it before we clear it, just in case */
   journal_mark_freed(th, s, block) ;
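
For anyone who wants to see what the new check guards against, here is a
minimal user-space sketch of the same idea. It is not the kernel code: the
struct and the sketch_* names are hypothetical, and it assumes that each
bitmap block tracks blocksize * 8 blocks, so a block number splits into a
bitmap index and a bit offset. A block whose bitmap index is past the last
bitmap block cannot be on the device and must be rejected before the bitmap
is touched.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the superblock fields the check needs. */
struct sketch_sb {
    unsigned long blocksize; /* bytes per filesystem block */
    unsigned long bmap_nr;   /* number of bitmap blocks on the device */
};

/* Split a block number into (bitmap index, bit offset within that bitmap).
 * Assumption: one bitmap block covers blocksize * 8 blocks. */
static void sketch_bit_address(const struct sketch_sb *s, unsigned long block,
                               unsigned long *nr, unsigned long *offset)
{
    unsigned long bits_per_bitmap;

    assert(s->blocksize > 0); /* avoid dividing by zero below */
    bits_per_bitmap = s->blocksize * 8;
    *nr = block / bits_per_bitmap;
    *offset = block % bits_per_bitmap;
}

/* Return 0 if the block lies inside the device, -1 (with a warning) if its
 * bitmap index is out of range. This mirrors the shape of the patched check. */
static int sketch_can_free(const struct sketch_sb *s, unsigned long block)
{
    unsigned long nr, offset;

    sketch_bit_address(s, block, &nr, &offset);
    if (nr >= s->bmap_nr) {
        fprintf(stderr, "sketch: blocknr %lu is out of device\n", block);
        return -1;
    }
    return 0;
}
```

With a 4096-byte block size and two bitmap blocks, sketch_can_free() accepts
blocks 0 through 65535 and rejects 65536 and above. Without such a check, an
out-of-range index would be used to look up a bitmap that does not exist.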