--On Monday, December 06, 2004 07:48:52 PM +0100 Jeffrey Altman
<[EMAIL PROTECTED]> wrote:

> The thing which is preventing the release of 1.3.7x as a stable 1.4
> tree is lack of deployment and testing by users.  There has been very
> little feedback both positive or negative on the existing releases.
> Without this feedback it is very difficult for us to know whether or
> not it is ready.

  I'd been holding back our feedback because 1.3.75 was imminent and we
thought some of the listed fixes might resolve our problems.  We've done testing
with 1.3.74 and 1.3.75.  The clients are all Fedora Core 3 w/ patched
kernels to provide sys_call_table[].  We are experiencing the following
problems:


  * Inability to unmount /usr/vice/cache (or / if it's not a separate
partition).  This is 100% repeatable on all FC3 machines.  The following
steps will always create this problem:

      - Stop all processes and log out all users of AFS
      - Stop all AFS processes and unload libafs kernel module
      - lsof | grep -i afs reports nothing open
      - umount /usr/vice/cache

This will always result in an error that /usr/vice/cache is busy:

      # umount /usr/vice/cache
      umount: /usr/vice/cache: device is busy
      umount: /usr/vice/cache: device is busy

  * Accessing an AFS volume over our VPN results in an immediate kernel
panic.  The panic message reports many "Unable to handle kernel NULL
pointer dereference at virtual address" errors followed by "Recursive die()
failure, output suppressed" and "<0>Kernel panic - not syncing: Fatal
exception in interrupt".  This is present only on 1 of 2 laptops running
FC3, but is 100% repeatable on the failing laptop.

  * Copying large files (~450 MB) into AFS from non-AFS partitions results
in a kernel oops.  The error reported is:

   rxi_Start: xmit list overflowed<1>Unable to handle kernel paging request
at virtual address ffffffff

This problem is also 100% repeatable.  'fs getcache' does not report that
the cache is full.  I've attached a file, gti-largefile-copy-oops.txt, that
contains the "soft" kernel oops.

  * Random cache consistency problems.  A file will be present in the
filesystem and viewable on other machines but not on the FC3 host.  fs
flush does not always solve this problem; however, another client operating
on the same directory (e.g. touch hi) seems to unstick the client.  We do
have one test case that seems to always generate this problem, but it's not
very portable for others to test as it requires our internal package
management software.  Rudy Maceyko is going to test this with 1.3.75
shortly.

  These are our current problems with the 1.3.7x series.  We have not
tested 1.3.7x on any other Linux release because we're focusing on moving
forward with Fedora 3 and RHEL 4 preparations.  So I can't speak to these
problems existing on, for example, FC1.

  We are building the RPMs with a modified specfile.  We're working to
merge our changes back into the mainline spec file and provide that to the
community.  I've attached all of the patches we're applying to the source
tree since they're all small.  Their descriptions are:

  openafs-1.2.11-no_old_gid_t.patch - Support for AMD64

  openafs-1.2.11-res_search.patch - resolver patch

  openafs-1.3.75-afskvers-autoconf-fix.patch - Fix --with-afs-system

  26syscall.patch - Hard-sets the build process to use sys_call_table

  afs.initd.patch - Removes modload logic in favor of symlinks 
                    to /lib/modules

  openafs-krb5-2.0-afsconf.patch - Fixes call to afsconf_AddKey() 
                                   for afs-krb5

I've held off reporting this for a little bit because I've not had time to
properly test or debug any of these.  Let me know what we can do to further
debug these problems.
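
As a possible first step on the unmount failure, here is a minimal Python
sketch (not part of OpenAFS) that lists what the kernel still considers
mounted at or below the cache directory, plus any filesystem of type "afs".
It only parses /proc/mounts-style text.  The function names and the SAMPLE
data are made up for illustration, and mount points containing escaped
spaces are not handled.  This would only rule out a nested or leftover mount
as the cause; it would not show references held inside the kernel.

```python
def parse_mounts(text):
    """Return (device, mount_point, fstype) tuples from /proc/mounts-style text."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            entries.append((fields[0], fields[1], fields[2]))
    return entries


def mounts_at_or_below(text, target):
    """List mount points equal to, or nested under, the target directory."""
    target = target.rstrip("/") or "/"
    prefix = "/" if target == "/" else target + "/"
    return [mp for _dev, mp, _fs in parse_mounts(text)
            if mp == target or mp.startswith(prefix)]


def afs_mounts(text):
    """List mount points whose filesystem type is 'afs'."""
    return [mp for _dev, mp, fs in parse_mounts(text) if fs == "afs"]


# Hypothetical sample; on a real client pass open("/proc/mounts").read() instead.
SAMPLE = """\
/dev/hda2 / ext3 rw 0 0
/dev/hda5 /usr/vice/cache ext3 rw 0 0
AFS /afs afs rw 0 0
"""

print(mounts_at_or_below(SAMPLE, "/usr/vice/cache"))
print(afs_mounts(SAMPLE))
```

If the second list is non-empty after the AFS processes have been stopped,
that would suggest /afs itself was never unmounted.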

-- 
Jason McCormick
CERT Infrastructure Team
[EMAIL PROTECTED] ** 412-268-7961
Dec 10 14:48:48 gti kernel: rxi_Start: xmit list overflowed<1>Unable to handle 
kernel paging request at virtual address ffffffff
Dec 10 14:48:48 gti kernel:  printing eip:
Dec 10 14:48:48 gti kernel: 12fac54c
Dec 10 14:48:48 gti kernel: *pde = 00002067
Dec 10 14:48:48 gti kernel: Oops: 0002 [#1]
Dec 10 14:48:48 gti kernel: Modules linked in: libafs(U) cisco_ipsec(U) i2c_dev 
i2c_core ipt_REJECT ipt_LOG ipt_state ip_conntrack iptable_filter ip_tables 
orinoco_cs orinoco hermes ds microcode dm_mod button battery ac ohci1394 
ieee1394 yenta_socket pcmcia_core uhci_hcd snd_intel8x0m snd_intel8x0 
snd_ac97_codec snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc 
gameport snd_mpu401_uart snd_rawmidi snd_seq_device snd soundcore 3c59x floppy 
ext3 jbd
Dec 10 14:48:48 gti kernel: CPU:    0
Dec 10 14:48:48 gti kernel: EIP:    0060:[<12fac54c>]    Tainted: P   VLI
Dec 10 14:48:48 gti kernel: EFLAGS: 00010212   (2.6.9-1.681.CERT) 
Dec 10 14:48:48 gti kernel: EIP is at osi_Panic+0x17/0x23 [libafs]
Dec 10 14:48:48 gti kernel: eax: 0000001f   ebx: 12fc731e   ecx: 12fc6fbc   
edx: 06a9ea5c
Dec 10 14:48:48 gti kernel: esi: 0326fc80   edi: 12ff81b0   ebp: 00000007   
esp: 06a9ea58
Dec 10 14:48:48 gti kernel: ds: 007b   es: 007b   ss: 0068
Dec 10 14:48:48 gti kernel: Process cp (pid: 3640, threadinfo=06a9e000 
task=0f24f7b0)
Dec 10 14:48:48 gti kernel: Stack: 12fc6fbc 00000020 12fd9420 00000000 12fe6ce0 
12fa9119 00000000 100a83c0 
Dec 10 14:48:48 gti kernel:        00000007 41b9fda0 00052259 41b9fd9f 000efdc9 
0326fc80 06865d94 12ffa840 
Dec 10 14:48:48 gti kernel:        12ffa1e0 12fab212 00000000 00001000 12fb596c 
00000574 00001000 0000026c 
Dec 10 14:48:48 gti kernel: Call Trace:
Dec 10 14:48:48 gti kernel:  [<12fa9119>] rxi_Start+0x2dc/0x4f4 [libafs]
Dec 10 14:48:48 gti kernel:  [<12fab212>] rxi_WriteProc+0x15c/0x350 [libafs]
Dec 10 14:48:48 gti kernel:  [<12fb596c>] afs_osi_Read+0x4b/0x8f [libafs]
Dec 10 14:48:48 gti kernel:  [<12f7eb30>] afs_UFSCacheStoreProc+0xe6/0x185 
[libafs]
Dec 10 14:48:48 gti kernel:  [<0218564b>] iget_locked+0x167/0x206
Dec 10 14:48:48 gti kernel:  [<12f887dd>] afs_StoreAllSegments+0x8b3/0x1843 
[libafs]
Dec 10 14:48:48 gti kernel:  [<1286014e>] ext3_file_write+0x19/0x8b [ext3]
Dec 10 14:48:48 gti kernel:  [<12fba9ad>] afs_linux_writepage_sync+0xb0/0x1b7 
[libafs]
Dec 10 14:48:48 gti kernel:  [<12fbaa28>] afs_linux_writepage_sync+0x12b/0x1b7 
[libafs]
Dec 10 14:48:48 gti kernel:  [<0215222e>] follow_page_pte+0xec/0xfd
Dec 10 14:48:48 gti kernel:  [<12fbaac3>] afs_linux_updatepage+0xf/0x11 [libafs]
Dec 10 14:48:48 gti kernel:  [<12fbab94>] afs_linux_commit_write+0xcf/0x167 
[libafs]
Dec 10 14:48:48 gti kernel:  [<02144825>] 
generic_file_buffered_write+0x301/0x48e
Dec 10 14:48:48 gti kernel:  [<02128c29>] update_wall_time+0x9/0x31
Dec 10 14:48:48 gti kernel:  [<02108bf4>] free_irq+0xf/0x1a0
Dec 10 14:48:48 gti kernel:  [<0215222e>] follow_page_pte+0xec/0xfd
Dec 10 14:48:48 gti kernel:  [<0215e907>] rw_vm+0x3ef/0x47a
Dec 10 14:48:48 gti kernel:  [<02144ce8>] 
generic_file_aio_write_nolock+0x336/0x364
Dec 10 14:48:48 gti kernel:  [<02144d9a>] generic_file_write_nolock+0x84/0x99
Dec 10 14:48:48 gti kernel:  [<021c3fc2>] avc_has_perm+0x3b/0x45
Dec 10 14:48:48 gti kernel:  [<12f9112b>] afs_CopyOutAttrs+0x1df/0x1e5 [libafs]
Dec 10 14:48:48 gti kernel:  [<12fb70ec>] vcache2inode+0x21/0x27 [libafs]
Dec 10 14:48:48 gti kernel:  [<0211d26f>] autoremove_wake_function+0x0/0x2d
Dec 10 14:48:48 gti kernel:  [<02144ed6>] generic_file_write+0x5a/0xbb
Dec 10 14:48:48 gti kernel:  [<12fb789b>] afs_linux_write+0x48b/0x5b1 [libafs]
Dec 10 14:48:48 gti kernel:  [<02145b31>] mempool_free+0x169/0x16d
Dec 10 14:48:48 gti kernel:  [<02165c82>] vfs_write+0xb6/0xe2
Dec 10 14:48:48 gti kernel:  [<02165d4c>] sys_write+0x3c/0x62
Dec 10 14:48:48 gti kernel: Code: <3>Debug: sleeping function called from 
invalid context at include/linux/rwsem.h:43
Dec 10 14:48:48 gti kernel: in_atomic():0[expected: 0], irqs_disabled():1
Dec 10 14:48:48 gti kernel:  [<0211cbcb>] __might_sleep+0x7d/0x8a
Dec 10 14:48:48 gti kernel:  [<0215e726>] rw_vm+0x20e/0x47a
Dec 10 14:48:48 gti kernel:  [<12fac521>] rxi_GetHostUDPSocket+0x19/0x23 
[libafs]
Dec 10 14:48:48 gti kernel:  [<12fac521>] rxi_GetHostUDPSocket+0x19/0x23 
[libafs]
Dec 10 14:48:48 gti kernel:  [<0215ee70>] get_user_size+0x30/0x57
Dec 10 14:48:48 gti kernel:  [<12fac521>] rxi_GetHostUDPSocket+0x19/0x23 
[libafs]
Dec 10 14:48:48 gti kernel:  [<0210682b>] show_registers+0x109/0x15e
Dec 10 14:48:48 gti kernel:  [<02106a2f>] die+0x14a/0x241
Dec 10 14:48:48 gti kernel:  [<0211937e>] do_page_fault+0x0/0x511
Dec 10 14:48:48 gti kernel:  [<0211937e>] do_page_fault+0x0/0x511
Dec 10 14:48:48 gti kernel:  [<02119733>] do_page_fault+0x3b5/0x511
Dec 10 14:48:48 gti kernel:  [<12fac54c>] osi_Panic+0x17/0x23 [libafs]
Dec 10 14:48:48 gti kernel:  [<0211b15f>] activate_task+0x53/0x5f
Dec 10 14:48:48 gti kernel:  [<0211d27c>] autoremove_wake_function+0xd/0x2d
Dec 10 14:48:48 gti kernel:  [<0211bbeb>] __wake_up_common+0x36/0x51
Dec 10 14:48:48 gti kernel:  [<0211bc93>] __wake_up+0x8d/0xf2
Dec 10 14:48:48 gti kernel:  [<0211937e>] do_page_fault+0x0/0x511
Dec 10 14:48:48 gti kernel:  [<12fac54c>] osi_Panic+0x17/0x23 [libafs]
Dec 10 14:48:48 gti kernel:  [<12fa9119>] rxi_Start+0x2dc/0x4f4 [libafs]
Dec 10 14:48:48 gti kernel:  [<12fab212>] rxi_WriteProc+0x15c/0x350 [libafs]
Dec 10 14:48:48 gti kernel:  [<12fb596c>] afs_osi_Read+0x4b/0x8f [libafs]
Dec 10 14:48:48 gti kernel:  [<12f7eb30>] afs_UFSCacheStoreProc+0xe6/0x185 
[libafs]
Dec 10 14:48:48 gti kernel:  [<0218564b>] iget_locked+0x167/0x206
Dec 10 14:48:48 gti kernel:  [<12f887dd>] afs_StoreAllSegments+0x8b3/0x1843 
[libafs]
Dec 10 14:48:48 gti kernel:  [<1286014e>] ext3_file_write+0x19/0x8b [ext3]
Dec 10 14:48:48 gti kernel:  [<12fba9ad>] afs_linux_writepage_sync+0xb0/0x1b7 
[libafs]
Dec 10 14:48:48 gti kernel:  [<12fbaa28>] afs_linux_writepage_sync+0x12b/0x1b7 
[libafs]
Dec 10 14:48:48 gti kernel:  [<0215222e>] follow_page_pte+0xec/0xfd
Dec 10 14:48:48 gti kernel:  [<12fbaac3>] afs_linux_updatepage+0xf/0x11 [libafs]
Dec 10 14:48:48 gti kernel:  [<12fbab94>] afs_linux_commit_write+0xcf/0x167 
[libafs]
Dec 10 14:48:48 gti kernel:  [<02144825>] 
generic_file_buffered_write+0x301/0x48e
Dec 10 14:48:48 gti kernel:  [<02128c29>] update_wall_time+0x9/0x31
Dec 10 14:48:48 gti kernel:  [<02108bf4>] free_irq+0xf/0x1a0
Dec 10 14:48:48 gti kernel:  [<0215222e>] follow_page_pte+0xec/0xfd
Dec 10 14:48:48 gti kernel:  [<0215e907>] rw_vm+0x3ef/0x47a
Dec 10 14:48:48 gti kernel:  [<02144ce8>] 
generic_file_aio_write_nolock+0x336/0x364
Dec 10 14:48:48 gti kernel:  [<02144d9a>] generic_file_write_nolock+0x84/0x99
Dec 10 14:48:48 gti kernel:  [<021c3fc2>] avc_has_perm+0x3b/0x45
Dec 10 14:48:48 gti kernel:  [<12f9112b>] afs_CopyOutAttrs+0x1df/0x1e5 [libafs]
Dec 10 14:48:48 gti kernel:  [<12fb70ec>] vcache2inode+0x21/0x27 [libafs]
Dec 10 14:48:48 gti kernel:  [<0211d26f>] autoremove_wake_function+0x0/0x2d
Dec 10 14:48:48 gti kernel:  [<02144ed6>] generic_file_write+0x5a/0xbb
Dec 10 14:48:48 gti kernel:  [<12fb789b>] afs_linux_write+0x48b/0x5b1 [libafs]
Dec 10 14:48:48 gti kernel:  [<02145b31>] mempool_free+0x169/0x16d
Dec 10 14:48:48 gti kernel:  [<02165c82>] vfs_write+0xb6/0xe2
Dec 10 14:48:48 gti kernel:  [<02165d4c>] sys_write+0x3c/0x62
Dec 10 14:48:48 gti kernel:  Bad EIP value.
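
For anyone comparing traces like the one above, the following is a minimal
Python sketch (an illustration only, not an OpenAFS or kernel tool) that
pulls the symbol and module name out of each call-trace frame.  It assumes
frames look like "[<addr>] symbol+0xoff/0xlen [module]", as they do in this
log, and it collapses whitespace first so that frames wrapped across lines
by the mailer are still recognized.  The names FRAME_RE, frames and
module_frames are made up for this example.

```python
import re

# One frame: "[<12fa9119>] rxi_Start+0x2dc/0x4f4 [libafs]"; the module is optional.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<sym>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<mod>\w+)\])?"
)


def frames(log_text):
    """Return (symbol, module-or-None) for every frame found in the log text."""
    # Collapse all whitespace so mail-wrapped frames become contiguous again.
    flat = " ".join(log_text.split())
    return [(m.group("sym"), m.group("mod")) for m in FRAME_RE.finditer(flat)]


def module_frames(log_text, module="libafs"):
    """Return only the symbols that belong to the given kernel module."""
    return [sym for sym, mod in frames(log_text) if mod == module]


# A few lines copied from the log above, including two wrapped frames.
SAMPLE = """\
Dec 10 14:48:48 gti kernel:  [<12fa9119>] rxi_Start+0x2dc/0x4f4 [libafs]
Dec 10 14:48:48 gti kernel:  [<12fab212>] rxi_WriteProc+0x15c/0x350 [libafs]
Dec 10 14:48:48 gti kernel:  [<12f7eb30>] afs_UFSCacheStoreProc+0xe6/0x185
[libafs]
Dec 10 14:48:48 gti kernel:  [<0218564b>] iget_locked+0x167/0x206
Dec 10 14:48:48 gti kernel:  [<02144825>]
generic_file_buffered_write+0x301/0x48e
"""

for sym, mod in frames(SAMPLE):
    print(sym, mod or "(kernel)")
```

Calling module_frames() on the full log would list only the libafs entries,
which may make it easier to compare this oops against other reports.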

Attachment: 26syscall.patch
Description: Binary data

Attachment: afs.initd.patch
Description: Binary data

Attachment: openafs-1.2.11-no_old_gid_t.patch
Description: Binary data

Attachment: openafs-1.2.11-res_search.patch
Description: Binary data

Attachment: openafs-1.3.74-admin_tools.klog.patch
Description: Binary data

Attachment: openafs-krb5-2.0-afsconf.patch
Description: Binary data

Attachment: openafs-1.3.75-afskvers-autoconf-fix.patch
Description: Binary data
