Zoe,
I just wanted to confirm that I can reproduce the problem. I'm not familiar
with the code, but it appears to be related to kernel paging mishaps (the
oopses show bad page table entries and a corrupted list in the slab
reaper). I'll attach a couple of the oopses I got in case anyone who is
familiar feels like taking a look.
I've also attached the test code I'm using, based on the man page you
reference below (no guarantees on correctness). These oopses were with
CVS trunk PVFS on a 32-bit 2.6.18-164.6.1 kernel.
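
Since the original report says only asynchronous reads crash, a smaller
read-only test may help isolate the problem. What follows is a minimal
sketch and is not taken from the attached test. It assumes a Linux system
with <linux/aio_abi.h> and issues the raw io_setup, io_submit and
io_getevents system calls directly, so it does not link against libaio.
The helper name aio_read_once is hypothetical. Whether this smaller case
triggers the same oops on PVFS is untested.

```c
#include <errno.h>
#include <fcntl.h>
#include <linux/aio_abi.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: submit one asynchronous read of up to len bytes
 * from offset 0 of path and wait for it to complete.
 * Returns the number of bytes read, or a negative errno value. */
static long aio_read_once(const char *path, void *buf, size_t len)
{
    aio_context_t ctx = 0;              /* must be zero before io_setup */
    struct iocb cb;
    struct iocb *list[1] = { &cb };
    struct io_event ev;
    long rc;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -errno;

    /* Create an AIO context that can hold a single in-flight request. */
    if (syscall(SYS_io_setup, 1, &ctx) < 0) {
        rc = -errno;
        close(fd);
        return rc;
    }

    /* Describe the read request. */
    memset(&cb, 0, sizeof(cb));
    cb.aio_fildes = fd;
    cb.aio_lio_opcode = IOCB_CMD_PREAD;
    cb.aio_buf = (uint64_t)(uintptr_t)buf;
    cb.aio_nbytes = len;
    cb.aio_offset = 0;

    if (syscall(SYS_io_submit, ctx, 1, list) != 1)
        rc = -errno;                    /* submission failed */
    else if (syscall(SYS_io_getevents, ctx, 1, 1, &ev, NULL) != 1)
        rc = -errno;                    /* waiting for completion failed */
    else
        rc = (long)ev.res;              /* bytes read, or negative errno */

    syscall(SYS_io_destroy, ctx);
    close(fd);
    return rc;
}
```

Calling aio_read_once() on a file under the PVFS mount point (for example
a hypothetical /mnt/pvfs2/testfile) with a 64 KB buffer would exercise the
read path on its own, without the follow-up write that the full copy test
performs.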
Michael
On Tue, Feb 23, 2010 at 01:55:19PM +0200, Zoe Sebepou wrote:
> Hello,
>
> I'm trying to use libaio, but the client module crashes, resulting in a
> kernel panic, in both pvfs2 versions 2.8.1 and 2.8.2.
> The libaio version used is 0.3.107-3, and my kernel version is 2.6.18.8
> x86_64.
>
> You can reproduce the problem using a simple copy test from the following
> man page: http://man.cx/io(3)
> Also, I have noticed that writing asynchronously works without any problem.
> The crash appears when I attempt to issue read requests asynchronously.
> In my code, the calls used are io_submit and io_getevents.
>
> Thank you in advance,
> --
> Zoe Sebepou
> _______________________________________________
> Pvfs2-users mailing list
> [email protected]
> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
------------[ cut here ]------------
kernel BUG at lib/list_debug.c:65!
invalid opcode: 0000 [#1]
SMP
last sysfs file: /class/pvfs2/pvfs2-req/dev
Modules linked in: pvfs2(U) ipv6 xfrm_nalgo crypto_api vboxvfs(U) dm_multipath
scsi_dh video hwmon backlight sbs i2c_ec button battery asus_acpi ac lp floppy
pcspkr pcnet32 mii vboxadd(U) i2c_piix4 i2c_core ide_cd cdrom parport_pc
parport serio_raw dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot
dm_zero dm_mirror dm_log dm_mod ata_piix libata sd_mod scsi_mod ext3 jbd
uhci_hcd ohci_hcd ehci_hcd
CPU: 0
EIP: 0060:[<c04eefd8>] Tainted: G VLI
EFLAGS: 00000046 (2.6.18-164.6.1.el5 #1)
EIP is at list_del+0x18/0x5c
eax: 00000048 ebx: d936d000 ecx: 00000094 edx: 00000000
esi: dff62300 edi: d936d0c0 ebp: dff31e80 esp: dffdff0c
ds: 007b es: 007b ss: 0068
Process events/0 (pid: 5, ti=dffdf000 task=dfca5550 task.ti=dffdf000)
Stack: c0646e4d d936d000 00000000 d936d000 c04704db dff30c00 00000008 00000000
dff30c14 dff30c14 00000008 dff30c00 dff62300 c04705c8 00000000 dff31e80
dff62300 dff31e80 dfc8d540 00000286 c0471961 00000000 00000000 c1406100
Call Trace:
[<c04704db>] free_block+0x6a/0xe3
[<c04705c8>] drain_array+0x74/0x95
[<c0471961>] cache_reap+0x45/0x100
[<c0431e8a>] run_workqueue+0x78/0xb5
[<c047191c>] cache_reap+0x0/0x100
[<c043273e>] worker_thread+0xd9/0x10b
[<c041e727>] default_wake_function+0x0/0xc
[<c0432665>] worker_thread+0x0/0x10b
[<c0434b55>] kthread+0xc0/0xeb
[<c0434a95>] kthread+0x0/0xeb
[<c0405c53>] kernel_thread_helper+0x7/0x10
=======================
Code: 51 04 8d 46 0c 5b 5e 5f e9 62 00 00 00 89 c3 eb eb 90 90 53 89 c3 8b 40
04 8b 00 39 d8 74 17 50 53 68 4d 6e 64 c0 e8 1b 5e f3 ff <0f> 0b 41 00 8a 6e 64
c0 83 c4 0c 8b 03 8b 40 04 39 d8 74 17 50
EIP: [<c04eefd8>] list_del+0x18/0x5c SS:ESP 0068:dffdff0c
<0>Kernel panic - not syncing: Fatal exception
swap_dup: Bad swap file entry c0418376
VM: killing process pvfs2-client-co
Bad pte = e9b89f27, process = ???, vm_flags = 75, vaddr = 800000
[<c0461cdf>] vm_normal_page+0x5d/0x72
[<c046266e>] unmap_vmas+0x1d5/0x5cc
[<c046554e>] exit_mmap+0x77/0xee
[<c0422431>] mmput+0x25/0x69
[<c042703d>] do_exit+0x20c/0x794
[<c0619836>] do_page_fault+0x46a/0x4e1
[<c06193cc>] do_page_fault+0x0/0x4e1
[<c0405a89>] error_code+0x39/0x40
=======================
Bad pte = fc67783b, process = ???, vm_flags = 75, vaddr = 802000
[<c0461cdf>] vm_normal_page+0x5d/0x72
[<c046266e>] unmap_vmas+0x1d5/0x5cc
[<c046554e>] exit_mmap+0x77/0xee
[<c0422431>] mmput+0x25/0x69
[<c042703d>] do_exit+0x20c/0x794
[<c0619836>] do_page_fault+0x46a/0x4e1
[<c06193cc>] do_page_fault+0x0/0x4e1
[<c0405a89>] error_code+0x39/0x40
=======================
Bad pte = 2c78c1bc, process = ???, vm_flags = 75, vaddr = 803000
[<c0461cdf>] vm_normal_page+0x5d/0x72
[<c046266e>] unmap_vmas+0x1d5/0x5cc
[<c046554e>] exit_mmap+0x77/0xee
[<c0422431>] mmput+0x25/0x69
[<c042703d>] do_exit+0x20c/0x794
[<c0619836>] do_page_fault+0x46a/0x4e1
[<c06193cc>] do_page_fault+0x0/0x4e1
[<c0405a89>] error_code+0x39/0x40
=======================
Bad pte = da49edb0, process = ???, vm_flags = 75, vaddr = 804000
[<c0461cdf>] vm_normal_page+0x5d/0x72
[<c046266e>] unmap_vmas+0x1d5/0x5cc
[<c046554e>] exit_mmap+0x77/0xee
[<c0422431>] mmput+0x25/0x69
[<c042703d>] do_exit+0x20c/0x794
[<c0619836>] do_page_fault+0x46a/0x4e1
[<c06193cc>] do_page_fault+0x0/0x4e1
[<c0405a89>] error_code+0x39/0x40
=======================
Bad pte = 2df52a2b, process = ???, vm_flags = 75, vaddr = 805000
[<c0461cdf>] vm_normal_page+0x5d/0x72
[<c046266e>] unmap_vmas+0x1d5/0x5cc
[<c046554e>] exit_mmap+0x77/0xee
[<c0422431>] mmput+0x25/0x69
[<c042703d>] do_exit+0x20c/0x794
[<c0619836>] do_page_fault+0x46a/0x4e1
[<c06193cc>] do_page_fault+0x0/0x4e1
[<c0405a89>] error_code+0x39/0x40
=======================
Bad pte = 77adbb6b, process = ???, vm_flags = 75, vaddr = 807000
[<c0461cdf>] vm_normal_page+0x5d/0x72
[<c046266e>] unmap_vmas+0x1d5/0x5cc
[<c046554e>] exit_mmap+0x77/0xee
[<c0422431>] mmput+0x25/0x69
[<c042703d>] do_exit+0x20c/0x794
[<c0619836>] do_page_fault+0x46a/0x4e1
[<c06193cc>] do_page_fault+0x0/0x4e1
[<c0405a89>] error_code+0x39/0x40
=======================
swap_free: Bad swap file entry 40617324
Bad pte = d6bbfef0, process = ???, vm_flags = 75, vaddr = 809000
[<c0461cdf>] vm_normal_page+0x5d/0x72
[<c046266e>] unmap_vmas+0x1d5/0x5cc
[<c046554e>] exit_mmap+0x77/0xee
[<c0422431>] mmput+0x25/0x69
[<c042703d>] do_exit+0x20c/0x794
[<c0619836>] do_page_fault+0x46a/0x4e1
[<c06193cc>] do_page_fault+0x0/0x4e1
[<c0405a89>] error_code+0x39/0x40
=======================
Bad pte = 705edb88, process = ???, vm_flags = 75, vaddr = 80a000
[<c0461cdf>] vm_normal_page+0x5d/0x72
[<c046266e>] unmap_vmas+0x1d5/0x5cc
[<c046554e>] exit_mmap+0x77/0xee
[<c0422431>] mmput+0x25/0x69
[<c042703d>] do_exit+0x20c/0x794
[<c0619836>] do_page_fault+0x46a/0x4e1
[<c06193cc>] do_page_fault+0x0/0x4e1
[<c0405a89>] error_code+0x39/0x40
=======================
Bad pte = 4b7ab684, process = ???, vm_flags = 75, vaddr = 80b000
[<c0461cdf>] vm_normal_page+0x5d/0x72
[<c046266e>] unmap_vmas+0x1d5/0x5cc
[<c046554e>] exit_mmap+0x77/0xee
[<c0422431>] mmput+0x25/0x69
[<c042703d>] do_exit+0x20c/0x794
[<c0619836>] do_page_fault+0x46a/0x4e1
[<c06193cc>] do_page_fault+0x0/0x4e1
[<c0405a89>] error_code+0x39/0x40
=======================
Bad pte = f48f0821, process = ???, vm_flags = 75, vaddr = 80c000
[<c0461cdf>] vm_normal_page+0x5d/0x72
[<c046266e>] unmap_vmas+0x1d5/0x5cc
[<c046554e>] exit_mmap+0x77/0xee
[<c0422431>] mmput+0x25/0x69
[<c042703d>] do_exit+0x20c/0x794
[<c0619836>] do_page_fault+0x46a/0x4e1
[<c06193cc>] do_page_fault+0x0/0x4e1
[<c0405a89>] error_code+0x39/0x40
=======================
Bad pte = 8d537d5f, process = ???, vm_flags = 75, vaddr = 80e000
[<c0461cdf>] vm_normal_page+0x5d/0x72
[<c046266e>] unmap_vmas+0x1d5/0x5cc
[<c046554e>] exit_mmap+0x77/0xee
[<c0422431>] mmput+0x25/0x69
[<c042703d>] do_exit+0x20c/0x794
[<c0619836>] do_page_fault+0x46a/0x4e1
[<c06193cc>] do_page_fault+0x0/0x4e1
[<c0405a89>] error_code+0x39/0x40
=======================
Eeek! page_mapcount(page) went negative! (-1)
page->flags = 80000004
page->count = 0
page->mapping = 00000000
------------[ cut here ]------------
kernel BUG at mm/rmap.c:589!
invalid opcode: 0000 [#1]
SMP
last sysfs file: /class/pvfs2/pvfs2-req/dev
Modules linked in: pvfs2(U) ipv6 xfrm_nalgo crypto_api vboxvfs(U) dm_multipath
scsi_dh video hwmon backlight sbs i2c_ec button battery asus_acpi ac lp floppy
pcspkr i2c_piix4 i2c_core ide_cd cdrom serio_raw parport_pc parport pcnet32 mii
vboxadd(U) dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero
dm_mirror dm_log dm_mod ata_piix libata sd_mod scsi_mod ext3 jbd uhci_hcd
ohci_hcd ehci_hcd
CPU: 0
EIP: 0060:[<c04682ea>] Tainted: G VLI
EFLAGS: 00210246 (2.6.18-164.6.1.el5 #1)
EIP is at page_remove_rmap+0x66/0xc0
eax: 0000001e ebx: c12daf00 ecx: 00200094 edx: 00200000
esi: c12daf00 edi: 0080f000 ebp: d9f3b03c esp: d9f27eb8
ds: 007b es: 007b ss: 0069
Process pvfs2-client-co (pid: 1494, ti=d9f27000 task=df84a000 task.ti=d9f27000)
Stack: c063caf7 00000000 c063cae0 00000000 16d78a20 c12daf00 c046278c 00000000
de503614 d9f27f3c 00000000 00000001 de55b900 00888000 d9f0d008 de55b900
c1404580 00000000 ffffffff de55b948 d9f0d008 002f1f7f 00888000 00000000
Call Trace:
[<c046278c>] unmap_vmas+0x2f3/0x5cc
[<c046554e>] exit_mmap+0x77/0xee
[<c0422431>] mmput+0x25/0x69
[<c042703d>] do_exit+0x20c/0x794
[<c0619836>] do_page_fault+0x46a/0x4e1
[<c06193cc>] do_page_fault+0x0/0x4e1
[<c0405a89>] error_code+0x39/0x40
=======================
Code: 40 02 00 83 c4 10 3d 00 40 02 00 75 03 8b 53 0c 8b 42 04 50 68 e0 ca 63
c0 e8 16 cb fb ff ff 73 10 68 f7 ca 63 c0 e8 09 cb fb ff <0f> 0b 4d 02 8c ca 63
c0 83 c4 10 8b 53 10 89 d8 83 f2 01 83 e2
EIP: [<c04682ea>] page_remove_rmap+0x66/0xc0 SS:ESP 0069:d9f27eb8
<0>Kernel panic - not syncing: Fatal exception
BUG: unable to handle kernel paging request at virtual address 38863d8b
printing eip:
c04eefc6
*pde = 00000000
Oops: 0000 [#1]
SMP
last sysfs file: /class/pvfs2/pvfs2-req/dev
Modules linked in: pvfs2(U) ipv6 xfrm_nalgo crypto_api vboxvfs(U) dm_multipath
scsi_dh video hwmon backlight sbs i2c_ec button battery asus_acpi ac lp floppy
pcspkr serio_raw ide_cd cdrom pcnet32 mii vboxadd(U) i2c_piix4 i2c_core
parport_pc parport dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot
dm_zero dm_mirror dm_log dm_mod ata_piix libata sd_mod scsi_mod ext3 jbd
uhci_hcd ohci_hcd ehci_hcd
CPU: 0
EIP: 0060:[<c04eefc6>] Tainted: G VLI
EFLAGS: 00010092 (2.6.18-164.6.1.el5 #1)
EIP is at list_del+0x6/0x5c
eax: 38863d8b ebx: db401000 ecx: 00000005 edx: 00000000
esi: dfcc5d00 edi: db401118 ebp: dfcc8180 esp: dffdff18
ds: 007b es: 007b ss: 0068
Process events/0 (pid: 5, ti=dffdf000 task=dfca5550 task.ti=dffdf000)
Stack: db401000 c04704db dfcc2800 00000005 00000000 dfcc2814 dfcc2814 00000005
dfcc2800 dfcc5d00 c04705c8 00000000 dfcc8180 dfcc5d00 dfcc8180 dfc8d540
00000286 c0471961 00000000 00000000 c1406100 c1406104 c0431e8a c047191c
Call Trace:
[<c04704db>] free_block+0x6a/0xe3
[<c04705c8>] drain_array+0x74/0x95
[<c0471961>] cache_reap+0x45/0x100
[<c0431e8a>] run_workqueue+0x78/0xb5
[<c047191c>] cache_reap+0x0/0x100
[<c043273e>] worker_thread+0xd9/0x10b
[<c041e727>] default_wake_function+0x0/0xc
[<c0432665>] worker_thread+0x0/0x10b
[<c0434b55>] kthread+0xc0/0xeb
[<c0434a95>] kthread+0x0/0xeb
[<c0405c53>] kernel_thread_helper+0x7/0x10
=======================
Code: 8d 4b 04 8b 51 04 8d 46 04 e8 73 00 00 00 8d 4b 0c 8b 51 04 8d 46 0c 5b
5e 5f e9 62 00 00 00 89 c3 eb eb 90 90 53 89 c3 8b 40 04 <8b> 00 39 d8 74 17 50
53 68 4d 6e 64 c0 e8 1b 5e f3 ff 0f 0b 41
EIP: [<c04eefc6>] list_del+0x6/0x5c SS:ESP 0068:dffdff18
<0>Kernel panic - not syncing: Fatal exception
#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/param.h>
#include <fcntl.h>
#include <errno.h>
#include <libaio.h>

#define AIO_BLKSIZE (64*1024)
#define AIO_MAXIO 32

static int busy = 0;            // # of I/O's in flight
static int tocopy = 0;          // # of blocks left to copy
static int dstfd = -1;          // destination file descriptor
static const char *dstname = NULL;
static const char *srcname = NULL;

/* Fatal error handler: report, remove the partial destination, exit */
static void io_error(const char *func, int rc)
{
    if (rc == -ENOSYS)
        fprintf(stderr, "AIO not in this kernel\n");
    else if (rc < 0)
        fprintf(stderr, "%s: %s\n", func, strerror(-rc));
    else
        fprintf(stderr, "%s: error %d\n", func, rc);

    if (dstfd >= 0)
        close(dstfd);
    if (dstname)
        unlink(dstname);
    exit(1);
}

/*
 * Write complete callback.
 * Adjust counts and free resources
 */
static void wr_done(io_context_t ctx, struct iocb *iocb, long res, long res2)
{
    if (res2 != 0) {
        io_error("aio write", res2);
    }
    if (res != iocb->u.c.nbytes) {
        fprintf(stderr, "write missed bytes expect %lu got %ld\n",
                iocb->u.c.nbytes, res);
        exit(1);
    }
    --tocopy;
    --busy;
    free(iocb->u.c.buf);

    memset(iocb, 0xff, sizeof(*iocb));  // paranoia: poison the whole iocb
    free(iocb);
    write(2, "w", 1);
}

/*
 * Read complete callback.
 * Change read iocb into a write iocb and start it.
 */
static void rd_done(io_context_t ctx, struct iocb *iocb, long res, long res2)
{
    /* library needs accessors to look at iocb? */
    int iosize = iocb->u.c.nbytes;
    char *buf = iocb->u.c.buf;
    off_t offset = iocb->u.c.offset;

    if (res2 != 0)
        io_error("aio read", res2);
    if (res != iosize)
    {
        fprintf(stderr, "read missing bytes expect %d got %ld\n",
                iosize, res);
        exit(1);
    }

    /* turn read into write */
    io_prep_pwrite(iocb, dstfd, buf, iosize, offset);
    io_set_callback(iocb, wr_done);
    if (1 != (res = io_submit(ctx, 1, &iocb)))
        io_error("io_submit write", res);
    write(2, "r", 1);
}

int main(int argc, char *const *argv)
{
    int srcfd;
    int initrc;
    struct stat st;
    off_t length = 0, offset = 0;
    io_context_t myctx;

    if (argc != 3 || argv[1][0] == '-') {
        fprintf(stderr, "Usage: aiocp SOURCE DEST\n");
        exit(1);
    }
    if ((srcfd = open(srcname = argv[1], O_RDONLY)) < 0) {
        perror(srcname);
        exit(1);
    }
    if (fstat(srcfd, &st) < 0) {
        perror("fstat");
        exit(1);
    }
    length = st.st_size;

    if ((dstfd = open(dstname = argv[2], O_WRONLY | O_CREAT, 0666)) < 0)
    {
        close(srcfd);
        perror(dstname);
        exit(1);
    }

    /* initialize state machine */
    memset(&myctx, 0, sizeof(myctx));
    initrc = io_queue_init(AIO_MAXIO, &myctx);
    if (initrc < 0)
        io_error("io_queue_init", initrc);
    tocopy = howmany(length, AIO_BLKSIZE);

    while (tocopy > 0)
    {
        int i, rc;
        /* Submit as many reads at once as possible, up to AIO_MAXIO */
        int n = MIN(MIN(AIO_MAXIO - busy, AIO_MAXIO / 2),
                    howmany(length - offset, AIO_BLKSIZE));
        if (n > 0)
        {
            struct iocb *ioq[n];

            for (i = 0; i < n; i++)
            {
                struct iocb *io = (struct iocb *) malloc(sizeof(struct iocb));
                int iosize = MIN(length - offset, AIO_BLKSIZE);
                char *buf = (char *) malloc(iosize);

                if (NULL == buf || NULL == io)
                {
                    fprintf(stderr, "out of memory\n");
                    exit(1);
                }

                io_prep_pread(io, srcfd, buf, iosize, offset);
                io_set_callback(io, rd_done);
                ioq[i] = io;
                offset += iosize;
            }

            rc = io_submit(myctx, n, ioq);
            if (rc < 0)
                io_error("io_submit", rc);

            busy += n;
        }

        // Handle I/Os that have completed
        rc = io_queue_run(myctx);
        if (rc < 0)
            io_error("io_queue_run", rc);

        // If we have the maximum number of I/Os in flight,
        // then wait for one to complete
        if (busy == AIO_MAXIO) {
            rc = io_queue_wait(myctx, NULL);
            if (rc < 0)
                io_error("io_queue_wait", rc);
        }
    }

    close(srcfd);
    close(dstfd);
    exit(0);
}