On Mon, 25 Mar 2013 08:42:57 +0100 Gerd Hoffmann <kra...@redhat.com> wrote:
> On 03/22/13 22:39, Luiz Capitulino wrote: > > On Fri, 22 Mar 2013 16:50:39 -0400 > > Luiz Capitulino <lcapitul...@redhat.com> wrote: > > > >> On Fri, 22 Mar 2013 10:17:58 +0100 > >> KONRAD Frédéric <fred.kon...@greensocs.com> wrote: > >> > >>> Hi, > >>> > >>> Seems there is an issue with the current git (found by toddf on IRC). > >>> > >>> To reproduce: > >>> > >>> ./qemu-system-x86_64 --monitor stdio --nographic > >>> > >>> and put "?" it should abort. > >>> > >>> Here is the backtrace: > >>> > >>> #0 0x00007f77cd347935 in raise () from /lib64/libc.so.6 > >>> #1 0x00007f77cd3490e8 in abort () from /lib64/libc.so.6 > >>> #2 0x00007f77cd3406a2 in __assert_fail_base () from /lib64/libc.so.6 > >>> #3 0x00007f77cd340752 in __assert_fail () from /lib64/libc.so.6 > >>> #4 0x00007f77d1c1f226 in monitor_puts (mon=<optimized out>, > >>> str=<optimized out>) at > >> > >> Yes, it's easy to reproduce. Bisect says: > >> > >> f628926bb423fa8a7e0b114511400ea9df38b76a is the first bad commit > >> commit f628926bb423fa8a7e0b114511400ea9df38b76a > >> Author: Gerd Hoffmann <kra...@redhat.com> > >> Date: Tue Mar 19 10:57:56 2013 +0100 > >> > >> fix monitor > >> > >> chardev flow control broke monitor, fix it by adding watch support. > >> > >> Signed-off-by: Anthony Liguori <aligu...@us.ibm.com> > >> > >> My impression is that monitor_puts() in being called in parallel. > > > > Not all. > > > > What's happening is that qemu_chr_fe_write() is returning < 0, > > mon->outbuf_index is not reset and is full, this causes the assert in > > monitor_puts() to trig. > > > > The previous version of monitor_flush() ignores errors, and everything > > works, so doing the same thing here fixes the problem :) > > No, ignoring errors breaks qmp because the output isn't valid json any > more when you cut off something ... What you mean "when you cut off"? When the other side disconnects? Do we care? > > For some reason I'm unable to see what the error code is. Gerd, do you think > > the patch below is reasonable? If it's not, how should we handle errors > > here? > > No, it's not. > > Ignoring the error for errno = EAGAIN breaks flow control. > > Ignoring the error for errno != EAGAIN (and maybe logging a debug > message) would be ok, but I suspect it's actually EAGAIN you get here. > > Just go for a larger buffer? That's simple, but it's not a real fix. We hit that problem because the help output is a large one. I'd guess that this is easily reproduced with something like QIDL, which (iirc) generates long json output on QMP. Looks like we need a dynamic buffer there. Other ideas?