Hi Bill,
This is really bad. I wish I had a system on which to reproduce your setup.
Is there something special in your kernel .config?
(PREEMPT for example)
What distro is this on btw?
thanks,
Murali
On 8/24/07, Bill Wichser <[EMAIL PROTECTED]> wrote:
> We have been experiencing frequent crashes in the PVFS2 kernel module
> when applications use standard system I/O to write to PVFS2 files. We
> are running the Linux 2.6.9-55.0.2 smp kernel, and PVFS2 v2.6.3.
> The general protection fault almost always occurs at
> pvfs2_devreq_writev+351.
>
> In our build, the invalid reference specifically occurs in the
> qhash_del() operation, within the inline qhash_search_and_remove()
> function called by pvfs2_devreq_writev(). See excerpts below:
>
> vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
> static ssize_t pvfs2_devreq_writev(
> struct file *file,
> const struct iovec *iov,
> unsigned long count,
> loff_t * offset)
> {
> .
> .
> .
> /* lookup (and remove) the op based on the tag */
> hash_link = qhash_search_and_remove(htable_ops_in_progress, &(tag));
> if (hash_link)
> {
> .
> .
> .
> }
>
> /* qhash_search_and_remove()
> *
> * searches for and removes a link in the hash table
> * that matches the given key
> *
> * returns pointer to link on success, NULL on failure (or item
> * not found). On success, link is removed from hashtable.
> */
> static inline struct qhash_head *qhash_search_and_remove(
> struct qhash_table *table,
> void *key)
> {
> int index = 0;
> struct qhash_head *tmp_link = NULL;
>
> /* find the hash value */
> index = table->hash(key, table->table_size);
>
> /* linear search at index to find match */
> qhash_lock(&table->lock);
> qhash_for_each(tmp_link, &(table->array[index]))
> {
> if (table->compare(key, tmp_link))
> {
> qhash_del(tmp_link);
> qhash_unlock(&table->lock);
> return (tmp_link);
> }
> }
> qhash_unlock(&table->lock);
> return (NULL);
> }
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
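To make it clearer why a damaged link would blow up right at the delete, here is a minimal sketch of the unlink step. It assumes qhash_del() is an ordinary kernel-style doubly linked list unlink; I have not checked that against your build. The names node, node_init, node_add and node_del are hypothetical stand-ins, not the PVFS2 definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a hash chain link (assumed to be a
 * circular doubly linked list node, as in kernel-style lists). */
struct node {
    struct node *next;
    struct node *prev;
};

/* An empty chain is a head that points at itself. */
static inline void node_init(struct node *head)
{
    head->next = head;
    head->prev = head;
}

/* Insert entry right after head. */
static inline void node_add(struct node *entry, struct node *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/* Unlink entry. Both stores go THROUGH the neighbour pointers held in
 * entry, so a garbage next/prev value is dereferenced exactly here. */
static inline void node_del(struct node *entry)
{
    assert(entry->next != NULL && entry->prev != NULL);
    entry->next->prev = entry->prev;
    entry->prev->next = entry->next;
}
```

If that assumption holds, the search loop can walk onto a link whose next or prev has been overwritten and the first bad dereference happens inside the delete. That would fit memory corruption of the link itself, but the sketch alone does not say who overwrote it.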
>
> We have since run pvfs2-fsck on the file system and have found some
> corruption. So we're not sure if what we're seeing is just a
> second-order effect of the corruption, or is the actual cause of the
> corruption.
>
> So we're passing this along to you to see if you've had any similar
> reports, or can point us in the right direction to help find the
> problem.
>
> The crash file sys and bt info follows. Please let us know if you need
> more information.
>
> Thanks,
> Bill
>
> vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
> crash> sys
> SYSTEM MAP: /boot/System.map-2.6.9-55.0.2.ELsmp
> DEBUG KERNEL: /home/jsbillin/vmlinux-2.6.9-55.ELsmp (2.6.9-55.ELsmp)
> DUMPFILE: /var/crash/172.18.0.85-2007-08-06-07:47/vmcore
> CPUS: 4
> DATE: Fri Aug 17 11:53:17 2007
> UPTIME: 11 days, 04:08:05
> LOAD AVERAGE: 2.49, 2.10, 1.63
> TASKS: 96
> NODENAME: woodhen-085
> RELEASE: 2.6.9-55.0.2.ELsmp
> VERSION: #1 SMP Mon Jun 25 14:12:33 EDT 2007
> MACHINE: x86_64 (2660 Mhz)
> MEMORY: 9 GB
> PANIC: ""
> crash> bt
> PID: 3454 TASK: 10236b63030 CPU: 0 COMMAND: "pvfs2-client-co"
> #0 [10232fbbc60] netpoll_start_netdump at ffffffffa0249366
> #1 [10232fbbc90] die at ffffffff80111c00
> #2 [10232fbbcb0] do_general_protection at ffffffff801124e5
> #3 [10232fbbcf0] error_exit at ffffffff80110d91
> [exception RIP: pvfs2_devreq_writev+351]
> RIP: ffffffffa0226948 RSP: 0000010232fbbda8 RFLAGS: 00010246
> RAX: 0000000000000000 RBX: 40903a138d84f800 RCX: 0000000000000000
> RDX: 40903a138d84f800 RSI: 00000101aeab1bd8 RDI: 0000010232fbbdc0
> RBP: 0000010006bccd40 R8: 0000000000000000 R9: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> R13: 000001020557e600 R14: 000001020557e5f0 R15: 0000010232fbbe88
> ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
> #4 [10232fbbda0] pvfs2_devreq_writev at ffffffffa022693e
> #5 [10232fbbe00] sock_readv_writev at ffffffff802a91f9
> #6 [10232fbbe60] do_readv_writev at ffffffff8017a45f
> #7 [10232fbbf40] sys_writev at ffffffff8017a631
> #8 [10232fbbf80] system_call at ffffffff8011026a
> RIP: 00000035854bfcdb RSP: 0000007fbffff228 RFLAGS: 00010202
> RAX: 0000000000000014 RBX: ffffffff8011026a RCX: 00000035854bf1e9
> RDX: 0000000000000004 RSI: 0000007fbffff120 RDI: 0000000000000005
> RBP: 0000000000000000 R8: 0000000000000001 R9: 0000000000000004
> R10: 0000000000000001 R11: 0000000000000206 R12: 0000000000000005
> R13: 0000007fbffff120 R14: 0000000000000004 R15: 0000000000000000
> ORIG_RAX: 0000000000000014 CS: 0033 SS: 002b
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
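One thing that stands out in the exception frame: RBX and RDX both hold 40903a138d84f800. Assuming the usual 48-bit virtual addressing on this x86_64 machine, that value is not a canonical address, and dereferencing a non-canonical address raises a general protection fault instead of a normal page fault. I can't tell from the trace alone which register the faulting instruction used, so treat this as a hint only. A small check, with is_canonical_48 and check_trace_values being names I made up for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if addr is canonical under 48-bit x86_64 virtual addressing,
 * i.e. bits 63..47 are all zeros or all ones. */
static inline bool is_canonical_48(uint64_t addr)
{
    uint64_t top = addr >> 47;          /* the upper 17 bits */
    return top == 0 || top == 0x1FFFF;
}

/* Register values copied from the quoted backtrace. */
static inline void check_trace_values(void)
{
    assert(!is_canonical_48(0x40903a138d84f800ULL)); /* RBX / RDX */
    assert(is_canonical_48(0x00000101aeab1bd8ULL));  /* RSI */
    assert(is_canonical_48(0xffffffffa0226948ULL));  /* exception RIP */
}
```

Disassembling around pvfs2_devreq_writev+351 in crash would show whether RBX or RDX is actually the pointer being followed at that instruction.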
> _______________________________________________
> Pvfs2-users mailing list
> [email protected]
> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
>