Zoe,

I just wanted to confirm that I can reproduce the problem. I'm not familiar
with the code, but it appears to be related to kernel paging mishaps. I'll
attach a couple of the oopses I got in case anyone who is familiar feels like
taking a look.

I've also attached the test code I'm using, based on the man page you
reference below (no guarantees on correctness). These oopses were with the
CVS trunk PVFS on a 32-bit 2.6.18-164.6.1 kernel.

Michael

On Tue, Feb 23, 2010 at 01:55:19PM +0200, Zoe Sebepou wrote:
> Hello,
> 
> I'm trying to use libaio but the client module crashes resulting in kernel
> panic both in pvfs2 versions 2.8.1 and 2.8.2.
> The libaio version used is 0.3.107-3, and my kernel version is 2.6.18.8
> x86_64.
> 
> You can reproduce the problem using a simple copy test from the following
> man page: http://man.cx/io(3)
> Also, I have noticed that writing asynchronously works without any problem.
> The crash appears when I attempt to issue read requests asynchronously.
> In my code, the calls used are the io_submit and the io_getevents.
> 
> Thank you in advance,
> -- 
> Zoe Sebepou

> _______________________________________________
> Pvfs2-users mailing list
> [email protected]
> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users

------------[ cut here ]------------
kernel BUG at lib/list_debug.c:65!
invalid opcode: 0000 [#1]
SMP 
last sysfs file: /class/pvfs2/pvfs2-req/dev
Modules linked in: pvfs2(U) ipv6 xfrm_nalgo crypto_api vboxvfs(U) dm_multipath 
scsi_dh video hwmon backlight sbs i2c_ec button battery asus_acpi ac lp floppy 
pcspkr pcnet32 mii vboxadd(U) i2c_piix4 i2c_core ide_cd cdrom parport_pc 
parport serio_raw dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot 
dm_zero dm_mirror dm_log dm_mod ata_piix libata sd_mod scsi_mod ext3 jbd 
uhci_hcd ohci_hcd ehci_hcd
CPU:    0
EIP:    0060:[<c04eefd8>]    Tainted: G      VLI
EFLAGS: 00000046   (2.6.18-164.6.1.el5 #1) 
EIP is at list_del+0x18/0x5c
eax: 00000048   ebx: d936d000   ecx: 00000094   edx: 00000000
esi: dff62300   edi: d936d0c0   ebp: dff31e80   esp: dffdff0c
ds: 007b   es: 007b   ss: 0068
Process events/0 (pid: 5, ti=dffdf000 task=dfca5550 task.ti=dffdf000)
Stack: c0646e4d d936d000 00000000 d936d000 c04704db dff30c00 00000008 00000000 
dff30c14 dff30c14 00000008 dff30c00 dff62300 c04705c8 00000000 dff31e80 
dff62300 dff31e80 dfc8d540 00000286 c0471961 00000000 00000000 c1406100 
Call Trace:
[<c04704db>] free_block+0x6a/0xe3
[<c04705c8>] drain_array+0x74/0x95
[<c0471961>] cache_reap+0x45/0x100
[<c0431e8a>] run_workqueue+0x78/0xb5
[<c047191c>] cache_reap+0x0/0x100
[<c043273e>] worker_thread+0xd9/0x10b
[<c041e727>] default_wake_function+0x0/0xc
[<c0432665>] worker_thread+0x0/0x10b
[<c0434b55>] kthread+0xc0/0xeb
[<c0434a95>] kthread+0x0/0xeb
[<c0405c53>] kernel_thread_helper+0x7/0x10
=======================
Code: 51 04 8d 46 0c 5b 5e 5f e9 62 00 00 00 89 c3 eb eb 90 90 53 89 c3 8b 40 
04 8b 00 39 d8 74 17 50 53 68 4d 6e 64 c0 e8 1b 5e f3 ff <0f> 0b 41 00 8a 6e 64 
c0 83 c4 0c 8b 03 8b 40 04 39 d8 74 17 50 
EIP: [<c04eefd8>] list_del+0x18/0x5c SS:ESP 0068:dffdff0c
<0>Kernel panic - not syncing: Fatal exception

swap_dup: Bad swap file entry c0418376
VM: killing process pvfs2-client-co
Bad pte = e9b89f27, process = ???, vm_flags = 75, vaddr = 800000
[<c0461cdf>] vm_normal_page+0x5d/0x72
[<c046266e>] unmap_vmas+0x1d5/0x5cc
[<c046554e>] exit_mmap+0x77/0xee
[<c0422431>] mmput+0x25/0x69
[<c042703d>] do_exit+0x20c/0x794
[<c0619836>] do_page_fault+0x46a/0x4e1
[<c06193cc>] do_page_fault+0x0/0x4e1
[<c0405a89>] error_code+0x39/0x40
=======================
Bad pte = fc67783b, process = ???, vm_flags = 75, vaddr = 802000
[<c0461cdf>] vm_normal_page+0x5d/0x72
[<c046266e>] unmap_vmas+0x1d5/0x5cc
[<c046554e>] exit_mmap+0x77/0xee
[<c0422431>] mmput+0x25/0x69
[<c042703d>] do_exit+0x20c/0x794
[<c0619836>] do_page_fault+0x46a/0x4e1
[<c06193cc>] do_page_fault+0x0/0x4e1
[<c0405a89>] error_code+0x39/0x40
=======================
Bad pte = 2c78c1bc, process = ???, vm_flags = 75, vaddr = 803000
[<c0461cdf>] vm_normal_page+0x5d/0x72
[<c046266e>] unmap_vmas+0x1d5/0x5cc
[<c046554e>] exit_mmap+0x77/0xee
[<c0422431>] mmput+0x25/0x69
[<c042703d>] do_exit+0x20c/0x794
[<c0619836>] do_page_fault+0x46a/0x4e1
[<c06193cc>] do_page_fault+0x0/0x4e1
[<c0405a89>] error_code+0x39/0x40
=======================
Bad pte = da49edb0, process = ???, vm_flags = 75, vaddr = 804000
[<c0461cdf>] vm_normal_page+0x5d/0x72
[<c046266e>] unmap_vmas+0x1d5/0x5cc
[<c046554e>] exit_mmap+0x77/0xee
[<c0422431>] mmput+0x25/0x69
[<c042703d>] do_exit+0x20c/0x794
[<c0619836>] do_page_fault+0x46a/0x4e1
[<c06193cc>] do_page_fault+0x0/0x4e1
[<c0405a89>] error_code+0x39/0x40
=======================
Bad pte = 2df52a2b, process = ???, vm_flags = 75, vaddr = 805000
[<c0461cdf>] vm_normal_page+0x5d/0x72
[<c046266e>] unmap_vmas+0x1d5/0x5cc
[<c046554e>] exit_mmap+0x77/0xee
[<c0422431>] mmput+0x25/0x69
[<c042703d>] do_exit+0x20c/0x794
[<c0619836>] do_page_fault+0x46a/0x4e1
[<c06193cc>] do_page_fault+0x0/0x4e1
[<c0405a89>] error_code+0x39/0x40
=======================
Bad pte = 77adbb6b, process = ???, vm_flags = 75, vaddr = 807000
[<c0461cdf>] vm_normal_page+0x5d/0x72
[<c046266e>] unmap_vmas+0x1d5/0x5cc
[<c046554e>] exit_mmap+0x77/0xee
[<c0422431>] mmput+0x25/0x69
[<c042703d>] do_exit+0x20c/0x794
[<c0619836>] do_page_fault+0x46a/0x4e1
[<c06193cc>] do_page_fault+0x0/0x4e1
[<c0405a89>] error_code+0x39/0x40
=======================
swap_free: Bad swap file entry 40617324
Bad pte = d6bbfef0, process = ???, vm_flags = 75, vaddr = 809000
[<c0461cdf>] vm_normal_page+0x5d/0x72
[<c046266e>] unmap_vmas+0x1d5/0x5cc
[<c046554e>] exit_mmap+0x77/0xee
[<c0422431>] mmput+0x25/0x69
[<c042703d>] do_exit+0x20c/0x794
[<c0619836>] do_page_fault+0x46a/0x4e1
[<c06193cc>] do_page_fault+0x0/0x4e1
[<c0405a89>] error_code+0x39/0x40
=======================
Bad pte = 705edb88, process = ???, vm_flags = 75, vaddr = 80a000
[<c0461cdf>] vm_normal_page+0x5d/0x72
[<c046266e>] unmap_vmas+0x1d5/0x5cc
[<c046554e>] exit_mmap+0x77/0xee
[<c0422431>] mmput+0x25/0x69
[<c042703d>] do_exit+0x20c/0x794
[<c0619836>] do_page_fault+0x46a/0x4e1
[<c06193cc>] do_page_fault+0x0/0x4e1
[<c0405a89>] error_code+0x39/0x40
=======================
Bad pte = 4b7ab684, process = ???, vm_flags = 75, vaddr = 80b000
[<c0461cdf>] vm_normal_page+0x5d/0x72
[<c046266e>] unmap_vmas+0x1d5/0x5cc
[<c046554e>] exit_mmap+0x77/0xee
[<c0422431>] mmput+0x25/0x69
[<c042703d>] do_exit+0x20c/0x794
[<c0619836>] do_page_fault+0x46a/0x4e1
[<c06193cc>] do_page_fault+0x0/0x4e1
[<c0405a89>] error_code+0x39/0x40
=======================
Bad pte = f48f0821, process = ???, vm_flags = 75, vaddr = 80c000
[<c0461cdf>] vm_normal_page+0x5d/0x72
[<c046266e>] unmap_vmas+0x1d5/0x5cc
[<c046554e>] exit_mmap+0x77/0xee
[<c0422431>] mmput+0x25/0x69
[<c042703d>] do_exit+0x20c/0x794
[<c0619836>] do_page_fault+0x46a/0x4e1
[<c06193cc>] do_page_fault+0x0/0x4e1
[<c0405a89>] error_code+0x39/0x40
=======================
Bad pte = 8d537d5f, process = ???, vm_flags = 75, vaddr = 80e000
[<c0461cdf>] vm_normal_page+0x5d/0x72
[<c046266e>] unmap_vmas+0x1d5/0x5cc
[<c046554e>] exit_mmap+0x77/0xee
[<c0422431>] mmput+0x25/0x69
[<c042703d>] do_exit+0x20c/0x794
[<c0619836>] do_page_fault+0x46a/0x4e1
[<c06193cc>] do_page_fault+0x0/0x4e1
[<c0405a89>] error_code+0x39/0x40
=======================
Eeek! page_mapcount(page) went negative! (-1)
page->flags = 80000004
page->count = 0
page->mapping = 00000000
------------[ cut here ]------------
kernel BUG at mm/rmap.c:589!
invalid opcode: 0000 [#1]
SMP 
last sysfs file: /class/pvfs2/pvfs2-req/dev
Modules linked in: pvfs2(U) ipv6 xfrm_nalgo crypto_api vboxvfs(U) dm_multipath 
scsi_dh video hwmon backlight sbs i2c_ec button battery asus_acpi ac lp floppy 
pcspkr i2c_piix4 i2c_core ide_cd cdrom serio_raw parport_pc parport pcnet32 mii 
vboxadd(U) dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero 
dm_mirror dm_log dm_mod ata_piix libata sd_mod scsi_mod ext3 jbd uhci_hcd 
ohci_hcd ehci_hcd
CPU:    0
EIP:    0060:[<c04682ea>]    Tainted: G      VLI
EFLAGS: 00210246   (2.6.18-164.6.1.el5 #1) 
EIP is at page_remove_rmap+0x66/0xc0
eax: 0000001e   ebx: c12daf00   ecx: 00200094   edx: 00200000
esi: c12daf00   edi: 0080f000   ebp: d9f3b03c   esp: d9f27eb8
ds: 007b   es: 007b   ss: 0069
Process pvfs2-client-co (pid: 1494, ti=d9f27000 task=df84a000 task.ti=d9f27000)
Stack: c063caf7 00000000 c063cae0 00000000 16d78a20 c12daf00 c046278c 00000000 
de503614 d9f27f3c 00000000 00000001 de55b900 00888000 d9f0d008 de55b900 
c1404580 00000000 ffffffff de55b948 d9f0d008 002f1f7f 00888000 00000000 
Call Trace:
[<c046278c>] unmap_vmas+0x2f3/0x5cc
[<c046554e>] exit_mmap+0x77/0xee
[<c0422431>] mmput+0x25/0x69
[<c042703d>] do_exit+0x20c/0x794
[<c0619836>] do_page_fault+0x46a/0x4e1
[<c06193cc>] do_page_fault+0x0/0x4e1
[<c0405a89>] error_code+0x39/0x40
=======================
Code: 40 02 00 83 c4 10 3d 00 40 02 00 75 03 8b 53 0c 8b 42 04 50 68 e0 ca 63 
c0 e8 16 cb fb ff ff 73 10 68 f7 ca 63 c0 e8 09 cb fb ff <0f> 0b 4d 02 8c ca 63 
c0 83 c4 10 8b 53 10 89 d8 83 f2 01 83 e2 
EIP: [<c04682ea>] page_remove_rmap+0x66/0xc0 SS:ESP 0069:d9f27eb8
<0>Kernel panic - not syncing: Fatal exception


BUG: unable to handle kernel paging request at virtual address 38863d8b
printing eip:
c04eefc6
*pde = 00000000
Oops: 0000 [#1]
SMP 
last sysfs file: /class/pvfs2/pvfs2-req/dev
Modules linked in: pvfs2(U) ipv6 xfrm_nalgo crypto_api vboxvfs(U) dm_multipath 
scsi_dh video hwmon backlight sbs i2c_ec button battery asus_acpi ac lp floppy 
pcspkr serio_raw ide_cd cdrom pcnet32 mii vboxadd(U) i2c_piix4 i2c_core 
parport_pc parport dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot 
dm_zero dm_mirror dm_log dm_mod ata_piix libata sd_mod scsi_mod ext3 jbd 
uhci_hcd ohci_hcd ehci_hcd
CPU:    0
EIP:    0060:[<c04eefc6>]    Tainted: G      VLI
EFLAGS: 00010092   (2.6.18-164.6.1.el5 #1) 
EIP is at list_del+0x6/0x5c
eax: 38863d8b   ebx: db401000   ecx: 00000005   edx: 00000000
esi: dfcc5d00   edi: db401118   ebp: dfcc8180   esp: dffdff18
ds: 007b   es: 007b   ss: 0068
Process events/0 (pid: 5, ti=dffdf000 task=dfca5550 task.ti=dffdf000)
Stack: db401000 c04704db dfcc2800 00000005 00000000 dfcc2814 dfcc2814 00000005 
dfcc2800 dfcc5d00 c04705c8 00000000 dfcc8180 dfcc5d00 dfcc8180 dfc8d540 
00000286 c0471961 00000000 00000000 c1406100 c1406104 c0431e8a c047191c 
Call Trace:
[<c04704db>] free_block+0x6a/0xe3
[<c04705c8>] drain_array+0x74/0x95
[<c0471961>] cache_reap+0x45/0x100
[<c0431e8a>] run_workqueue+0x78/0xb5
[<c047191c>] cache_reap+0x0/0x100
[<c043273e>] worker_thread+0xd9/0x10b
[<c041e727>] default_wake_function+0x0/0xc
[<c0432665>] worker_thread+0x0/0x10b
[<c0434b55>] kthread+0xc0/0xeb
[<c0434a95>] kthread+0x0/0xeb
[<c0405c53>] kernel_thread_helper+0x7/0x10
=======================
Code: 8d 4b 04 8b 51 04 8d 46 04 e8 73 00 00 00 8d 4b 0c 8b 51 04 8d 46 0c 5b 
5e 5f e9 62 00 00 00 89 c3 eb eb 90 90 53 89 c3 8b 40 04 <8b> 00 39 d8 74 17 50 
53 68 4d 6e 64 c0 e8 1b 5e f3 ff 0f 0b 41 
EIP: [<c04eefc6>] list_del+0x6/0x5c SS:ESP 0068:dffdff18
<0>Kernel panic - not syncing: Fatal exception

#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/param.h>
#include <fcntl.h>
#include <errno.h>
#include <string.h>           // memset, strerror

#include <libaio.h>

#define AIO_BLKSIZE (64*1024)
#define AIO_MAXIO   32

static int busy = 0;          // # of I/O's in flight
static int tocopy = 0;        // # of blocks left to copy
static int dstfd = -1;        // destination file descriptor
static const char *dstname = NULL;
static const char *srcname = NULL;


/* Fatal error handler: rc is a negative errno as returned by libaio */
static void io_error(const char *func, int rc)
{
    if (rc == -ENOSYS)
        fprintf(stderr, "AIO not in this kernel\n");
    else if (rc < 0)
        fprintf(stderr, "%s: %s\n", func, strerror(-rc));
    else
        fprintf(stderr, "%s: error %d\n", func, rc);

    if (dstfd >= 0)
        close(dstfd);
    if (dstname)
        unlink(dstname);
    exit(1);
}

/*
* Write complete callback.
* Adjust counts and free resources
*/
static void wr_done(io_context_t ctx, struct iocb *iocb, long res, long res2)
{
    if (res2 != 0) {
        io_error("aio write", res2);
    }
    if (res != (long) iocb->u.c.nbytes) {
        fprintf(stderr, "write missed bytes expect %lu got %ld\n",
                iocb->u.c.nbytes, res);
        exit(1);
    }
    --tocopy;
    --busy;
    free(iocb->u.c.buf);

    memset(iocb, 0xff, sizeof(*iocb));  // paranoia: poison the whole struct
    free(iocb);
    write(2, "w", 1);
}

/*
* Read complete callback.
* Change read iocb into a write iocb and start it.
*/
static void rd_done(io_context_t ctx, struct iocb *iocb, long res, long res2)
{
    /* library needs accessors to look at iocb? */
    int iosize = iocb->u.c.nbytes;
    char *buf = iocb->u.c.buf;
    off_t offset = iocb->u.c.offset;
    
    if (res2 != 0)
        io_error("aio read", res2);
    if (res != iosize)
    {
        fprintf(stderr, "read missing bytes expect %d got %ld\n",
                iosize, res);
        exit(1);
    }


    /* turn read into write */
    io_prep_pwrite(iocb, dstfd, buf, iosize, offset);

    io_set_callback(iocb, wr_done);

    if (1 != (res = io_submit(ctx, 1, &iocb)))
        io_error("io_submit write", res);
    write(2, "r", 1);
}


int main(int argc, char *const *argv)
{
    int srcfd;
    struct stat st;
    off_t length = 0, offset = 0;
    io_context_t myctx;

    if (argc != 3 || argv[1][0] == '-') {
        fprintf(stderr, "Usage: aiocp SOURCE DEST\n");
        exit(1);
    }
    if ((srcfd = open(srcname = argv[1], O_RDONLY)) < 0) {
        perror(srcname);
        exit(1);
    }
    if (fstat(srcfd, &st) < 0) {
        perror("fstat");
        exit(1);
    }
    length = st.st_size;

    if ((dstfd = open(dstname = argv[2], O_WRONLY | O_CREAT, 0666)) < 0)
    {
        close(srcfd);
        perror(dstname);
        exit(1);
    }

    /* initialize state machine */
    memset(&myctx, 0, sizeof(myctx));
    int init_rc = io_queue_init(AIO_MAXIO, &myctx);
    if (init_rc < 0)
        io_error("io_queue_init", init_rc);
    tocopy = howmany(length, AIO_BLKSIZE);

    while (tocopy > 0) 
    {
        int i, rc;
        /* Submit as many reads at once as possible, up to AIO_MAXIO */
        int n = MIN(MIN(AIO_MAXIO - busy, AIO_MAXIO / 2),
                    howmany(length - offset, AIO_BLKSIZE));
        if (n > 0) 
        {
            struct iocb *ioq[n];
            for (i = 0; i < n; i++) 
            {
                struct iocb *io = (struct iocb *) malloc(sizeof(struct iocb));
                int iosize = MIN(length - offset, AIO_BLKSIZE);
                char *buf = (char *) malloc(iosize);

                if (NULL == buf || NULL == io) 
                {
                    fprintf(stderr, "out of memory\n");
                    exit(1);
                }

                io_prep_pread(io, srcfd, buf, iosize, offset);
                io_set_callback(io, rd_done);
                ioq[i] = io;
                offset += iosize;
            }

            rc = io_submit(myctx, n, ioq);
            if (rc < 0)
                io_error("io_submit", rc);

            busy += n;
        }

        // Handle IO's that have completed
        rc = io_queue_run(myctx);
        if (rc < 0)
            io_error("io_queue_run", rc);

        // if we have maximum number of i/o's in flight
        // then wait for one to complete
        if (busy == AIO_MAXIO) {
            rc = io_queue_wait(myctx, NULL);
            if (rc < 0)
                io_error("io_queue_wait", rc);
        }

    }

    close(srcfd);
    close(dstfd);
    exit(0);
}
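
Since the original report says the crash only shows up on asynchronous
reads issued with io_submit and collected with io_getevents, a narrower
reproducer than the copy program above may be useful. What follows is only
a minimal sketch under some assumptions: it uses the raw Linux syscalls and
<linux/aio_abi.h> instead of libaio so that it needs no extra library, and
the helper name aio_read_once is made up for illustration. It submits a
single read at offset 0 and waits for its completion event.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/aio_abi.h>   /* aio_context_t, struct iocb, struct io_event */
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of libaio or PVFS): issue one asynchronous
 * read of up to len bytes at offset 0 of path.
 * Returns the result of the request (bytes read, or a negative errno),
 * or -1 if the request could not be set up, submitted, or reaped.
 */
static long aio_read_once(const char *path, void *buf, size_t len)
{
    aio_context_t ctx = 0;           /* must be zero before io_setup */
    struct iocb cb;
    struct iocb *list[1] = { &cb };
    struct io_event ev;
    long got = -1;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    if (syscall(SYS_io_setup, 1, &ctx) < 0) {
        close(fd);
        return -1;
    }

    /* Describe a single pread-style request. */
    memset(&cb, 0, sizeof(cb));
    cb.aio_fildes = fd;
    cb.aio_lio_opcode = IOCB_CMD_PREAD;
    cb.aio_buf = (uint64_t)(uintptr_t)buf;
    cb.aio_nbytes = len;
    cb.aio_offset = 0;

    /* Submit it, then block until exactly one completion arrives. */
    if (syscall(SYS_io_submit, ctx, 1L, list) == 1 &&
        syscall(SYS_io_getevents, ctx, 1L, 1L, &ev, NULL) == 1)
        got = (long)ev.res;

    syscall(SYS_io_destroy, ctx);
    close(fd);
    return got;
}
```

Calling aio_read_once() on a file that lives on the PVFS mount would be the
case of interest here; on a local filesystem it should simply return the
number of bytes read.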
