On Fri, Aug 03, 2012 at 03:57:20PM +0000, Blue Swirl wrote:
> >> > +static void gluster_finish_aiocb(struct glfs_fd *fd, ssize_t ret, void
> >> > *arg)
> >> > +{
> >> > +    GlusterAIOCB *acb = (GlusterAIOCB *)arg;
> >> > +    BDRVGlusterState *s = acb->common.bs->opaque;
> >> > +
> >> > +    acb->ret = ret;
> >> > +    if (qemu_gluster_send_pipe(s, acb) < 0) {
> >> > +        error_report("Could not complete read/write/flush from
> >> > gluster");
> >> > +        abort();
> >>
> >> Aborting is a bit drastic, it would be nice to save and exit gracefully.
> >
> > I am not sure if there is an easy way to recover sanely and exit from this
> > kind of error.
> >
> > Here the non-QEMU thread (gluster thread) failed to notify the QEMU thread
> > on the read side of the pipe about the IO completion. So essentially
> > bdrv_read or bdrv_write will never complete if this error happens.
> >
> > Do you have any suggestions on how to exit gracefully here ?
>
> Ignore but set the callback return to -EIO, see for example curl.c:249.

I see the precedent for how I am handling this in
posix-aio-compat.c:posix_aio_notify_event().

So instead of aborting, I could do acb->common.cb(acb->common.opaque, -EIO)
as you suggest, but that would not help because the thread at the read side
of the pipe is still waiting, and the user will see the read/write failure
as a hang.

[root@bharata qemu]# gdb ./x86_64-softmmu/qemu-system-x86_64
Starting program: ./x86_64-softmmu/qemu-system-x86_64 --enable-kvm --nographic -m 1024 -smp 4 -drive file=gluster://bharata/test/F16,if=virtio,cache=none
[New Thread 0x7ffff4c7f700 (LWP 6537)]
[New Thread 0x7ffff447e700 (LWP 6538)]
[New Thread 0x7ffff3420700 (LWP 6539)]
[New Thread 0x7ffff1407700 (LWP 6540)]
qemu-system-x86_64: -drive file=gluster://bharata/test/F16,if=virtio,cache=none: Could not complete read/write/flush from gluster
^C
Program received signal SIGINT, Interrupt.
0x00007ffff60e9403 in select () from /lib64/libc.so.6
(gdb) bt
#0  0x00007ffff60e9403 in select () from /lib64/libc.so.6
#1  0x00005555555baee3 in qemu_aio_wait () at aio.c:158
#2  0x00005555555cf57b in bdrv_rw_co (bs=0x5555564cfa50, sector_num=0, buf=
    0x7fffffffb640 "\353c\220", nb_sectors=4, is_write=false) at block.c:1623
#3  0x00005555555cf5e1 in bdrv_read (bs=0x5555564cfa50, sector_num=0, buf=
    0x7fffffffb640 "\353c\220", nb_sectors=4) at block.c:1633
#4  0x00005555555cf9d0 in bdrv_pread (bs=0x5555564cfa50, offset=0,
    buf=0x7fffffffb640, count1=2048) at block.c:1720
#5  0x00005555555cc8d4 in find_image_format (filename=
    0x5555564cc290 "gluster://bharata/test/F16", pdrv=0x7fffffffbe60)
    at block.c:529
#6  0x00005555555cd303 in bdrv_open (bs=0x5555564cef20, filename=
    0x5555564cc290 "gluster://bharata/test/F16", flags=98, drv=0x0)
    at block.c:800
#7  0x0000555555609f69 in drive_init (opts=0x5555564cf900, default_to_scsi=0)
    at blockdev.c:608
#8  0x0000555555711b6c in drive_init_func (opts=0x5555564cc1e0,
    opaque=0x555555c357a0) at vl.c:775
#9  0x000055555574ceda in qemu_opts_foreach (list=0x555555c319e0, func=
    0x555555711b31 <drive_init_func>, opaque=0x555555c357a0, abort_on_failure=1)
    at qemu-option.c:1094
#10 0x0000555555719d78 in main (argc=9, argv=0x7fffffffe468,
    envp=0x7fffffffe4b8) at vl.c:3430
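
To make the hang above concrete, here is a minimal standalone sketch of the same pattern. It is not QEMU or gluster code: the demo_* names and struct demo_acb are hypothetical stand-ins, the pipe-write failure is simulated with a flag, and the select() timeout exists only so the demo terminates. It tries to show that recording -EIO on the completion side does not wake a waiter that is blocked in select() on the read end of the pipe.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for GlusterAIOCB; only the result field matters. */
struct demo_acb {
    ssize_t ret;
};

/*
 * Completion side (the role of the gluster thread): record the result and
 * notify the waiter by writing the acb pointer into the pipe.
 * simulate_failure models the case where the pipe write fails.
 */
static int demo_complete(int wfd, struct demo_acb *acb, ssize_t ret,
                         bool simulate_failure)
{
    ssize_t n;

    acb->ret = ret;
    if (simulate_failure) {
        acb->ret = -EIO;    /* an error code alone wakes nobody */
        return -1;
    }
    do {
        n = write(wfd, &acb, sizeof(acb));
    } while (n < 0 && errno == EINTR);
    return n == (ssize_t)sizeof(acb) ? 0 : -1;
}

/*
 * Waiter side (the role qemu_aio_wait plays in the backtrace): block in
 * select() on the read end. Returns 1 if a completion arrived, 0 otherwise.
 * Without the timeout, a missed notification would block indefinitely.
 */
static int demo_wait(int rfd, int timeout_ms, struct demo_acb **out)
{
    fd_set rfds;
    struct timeval tv;

    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    FD_ZERO(&rfds);
    FD_SET(rfd, &rfds);
    if (select(rfd + 1, &rfds, NULL, NULL, &tv) <= 0) {
        return 0;           /* timed out (or failed): nothing was written */
    }
    return read(rfd, out, sizeof(*out)) == (ssize_t)sizeof(*out);
}

/* Runs one request; returns 1 if the waiter was woken, 0 if it timed out. */
static int demo_run(bool simulate_failure)
{
    int fds[2];
    struct demo_acb acb = { 0 };
    struct demo_acb *done = NULL;
    int woke;

    if (pipe(fds) < 0) {
        return -1;
    }
    demo_complete(fds[1], &acb, 2048, simulate_failure);
    woke = demo_wait(fds[0], 100, &done);
    if (woke) {
        assert(done == &acb);   /* the waiter received the same request */
    }
    close(fds[0]);
    close(fds[1]);
    return woke;
}
```

With these definitions, demo_run(false) returns 1 because the waiter sees the completion, while demo_run(true) returns 0 after the 100 ms timeout even though acb.ret was set to -EIO. In the real code there is no such timeout, which matches the backtrace sitting in select().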