Re: [Qemu-devel] qemu-ga behavior on virtio-serial unplug

2013-08-13 Thread Amit Shah
Hi,

(Sorry for the late reply!)

On (Thu) 20 Jun 2013 [10:20:01], mdroth wrote:
 On Thu, Jun 20, 2013 at 10:12:30AM -0500, mdroth wrote:
  On Wed, Jun 19, 2013 at 01:17:57PM +0200, Laszlo Ersek wrote:
   Hello Michael,
   
   this is with reference to
   https://bugzilla.redhat.com/show_bug.cgi?id=907733.
   
   Ever since the initial qemu-ga commit AFAICS an exception for
   virtio-serial has existed, when reading EOF from the channel.
   
   For isa-serial, EOF results in the client connection being closed. I
   assume this exits the glib main loop somehow (otherwise qemu-ga would
   just sit there and do nothing, as no further connections are accepted I
   think).
  
  I think it would actually do the latter, unfortunately. It's distinct
  from virtio-serial handling in that we remove the GSource by return false
  in the event handler (qga/main.c:channel_event_cb), but I think we'd
  need to drop in a g_main_loop_quit() to actually get qemu-ga to exit in
  that scenario.
  
  This doesn't normally get triggered though, as isa-serial does not send
  EOF when nothing is connected to the chardev backend, but instead just
  blocks. Might till make sense to make qemu-ga exit in this case though
  since it won't be doing anything useful and wrapper scripts would at
  least have some indication that something is up.
  
   
   For a unix domain socket, we can continue accepting new connections
   after reading EOF.
   
   For virtio-serial, EOF means no host-side process is connected. In
   this case we sleep a bit and go back to reading from the same fd (and
   this turns into a sleep loop until the next host-side process connects).
   
   
   Can we tell virtio-serial port unplug from no host-side process?
   Because in the former case qemu-ga should really close the channel (and
   maybe exit (*)), otherwise the unplug won't complete in the guest kernel.

Port unplug will give an error return, and 'no host-side process' will
give EOF.  A bug was fixed recently in 3.11-rc5, and that fix is
queued for the -stable kernels as well.

Upstream kernel commit 96f97a83910cdb9d89d127c5ee523f8fc040a804

   According to Amit's comments in the RHBZ, at unplug a SIGIO is
   delivered, and a POLLHUP event is reported. However,
   
   (a) I think the glib iochannel abstraction doesn't allow us to tell
   POLLHUP apart from reading EOF;
  
  AFAICT we can actually access the POLLHUP event via the 'condition' param
  that gets passed to the cb, but the issue is we also get POLLUP when
  the chardev backend isn't open.
  
   
   (b) delivery of an unhandled SIGIO seems to terminate the victim
   process. qemu-ga doesn't seem to either catch or block SIGIO, which is
   why I think a SIGIO signal is not sent in reality (maybe we should ask
   for it first?)
   
   ... Actually I'm confused about this as well. The virtio-serial port
   *is* opened with O_ASYNC (and on Solaris, it is replaced with an
   I_SETSIG ioctl()). What's the reason for this? g_io_channel_unix_new()
   doesn't seem to list it as a requirement, and qemu-ga doesn't seem to
   handle SIGIO.
  
  At some point I played around with trying to use SIGIO to handle channel
  resets and whatnot (since we're also supposed to get one when someone
  opens the chardev backend and causes VIRTIO_CONSOLE_PORT_OPEN to get
  sent). I don't think I ever got it working, SIGIO doesn't seem to get
  sent, so that O_ASYNC might just be a relic from that.
  
  I tried installing a handler retested host-connect as well as hot
  unplug and still don't seem to be getting the signal. Not sure if i'm
  doing something wrong or if there's an issue with the guest driver.
  
  I did notice something interesting though:
  
  1371740628.596505: debug: cb: condition: 17, status: 2
  1371740628.596541: debug: received EOF
  1371740628.395726: debug: cb: condition: 17, status: 2
  1371740628.395760: debug: received EOF
  1371740628.496035: debug: cb: condition: 17, status: 2
  1371740628.496072: debug: received EOF
  1371740628.596505: debug: cb: condition: 17, status: 2
  1371740628.596541: debug: received EOF
  
  host opened chardev backend
  
  1371740634.195524: debug: cb: condition: 1, status: 1
  1371740634.195556: debug: read data, count: 25, data:
  {'execute':'guest-ping'}
  
  1371740634.195634: debug: process_event: called
  1371740634.195660: debug: processing command
  1371740634.196007: debug: sending data, count: 15
  
  virtio-serial unplugged
  
  1371740644.113346: debug: cb: condition: 16, status: 2
  1371740644.113379: debug: received EOF
  1371740644.213694: debug: cb: condition: 16, status: 2
  1371740644.213725: debug: received EOF
  1371740644.314041: debug: cb: condition: 16, status: 2
  1371740644.314168: debug: received EOF
  
  i.e. we got the POLLHUP if we read from an
  unconnected-but-present port, and we *don't* get the POLLHUP
  if the port has been unplugged.
 
 Er...silly me, POLLHUP=16, POLLIN=1 for glib, so I mixed them
 up. For unplugged case we get 

Re: [Qemu-devel] qemu-ga behavior on virtio-serial unplug

2013-06-20 Thread Amit Shah
On (Wed) 19 Jun 2013 [13:17:57], Laszlo Ersek wrote:
 Hello Michael,
 
 this is with reference to
 https://bugzilla.redhat.com/show_bug.cgi?id=907733.
 
 Ever since the initial qemu-ga commit AFAICS an exception for
 virtio-serial has existed, when reading EOF from the channel.
 
 For isa-serial, EOF results in the client connection being closed. I
 assume this exits the glib main loop somehow (otherwise qemu-ga would
 just sit there and do nothing, as no further connections are accepted I
 think).
 
 For a unix domain socket, we can continue accepting new connections
 after reading EOF.
 
 For virtio-serial, EOF means no host-side process is connected. In
 this case we sleep a bit and go back to reading from the same fd (and
 this turns into a sleep loop until the next host-side process connects).

There's also another bug here:
https://bugzilla.redhat.com/show_bug.cgi?id=975661
virtio_console: read() returns 0 after port unplug

write() calls return an error if a port has been unplugged, but read()
calls don't (in all cases -- see the link above for details).  I'm
going to fix this in kernel 3.11 (and possibly mark it for stable@).

 Can we tell virtio-serial port unplug from no host-side process?
 Because in the former case qemu-ga should really close the channel (and
 maybe exit (*)), otherwise the unplug won't complete in the guest kernel.

Yes, after the above fix, this will be the case.  read()s will return
error.

Also, poll(2) should return POLLHUP|POLLERR instead of just POLLHUP in
this case -- I'll fix that as well.

 According to Amit's comments in the RHBZ, at unplug a SIGIO is
 delivered, and a POLLHUP event is reported. However,
 
 (a) I think the glib iochannel abstraction doesn't allow us to tell
 POLLHUP apart from reading EOF;
 
 (b) delivery of an unhandled SIGIO seems to terminate the victim
 process. qemu-ga doesn't seem to either catch or block SIGIO, which is
 why I think a SIGIO signal is not sent in reality (maybe we should ask
 for it first?)

Bah, I saw another bug in the kernel, and you're right: the signal
isn't actually sent on unplug (it's only sent on host connecting and
disconnecting).  Fix on the way as well.

 ... Actually I'm confused about this as well. The virtio-serial port
 *is* opened with O_ASYNC (and on Solaris, it is replaced with an
 I_SETSIG ioctl()). What's the reason for this? g_io_channel_unix_new()
 doesn't seem to list it as a requirement, and qemu-ga doesn't seem to
 handle SIGIO.
 
 In any case we'd need a way to tell host side close from port unplug.

Will POLLHUP|POLLERR help, along with error returns on read() and
write()?

 (*) Then, there's the question what to do after unplug. Should we keep
 reopening the same virtio-serial port (and sleeping a bit in-between)?
 Or exit and let udev / systemd restart qemu-ga on any new virtio-serial
 port, passing -p accordingly?

I like the idea to defer to udev or systemd activation.

Amit



Re: [Qemu-devel] qemu-ga behavior on virtio-serial unplug

2013-06-20 Thread mdroth
On Wed, Jun 19, 2013 at 01:17:57PM +0200, Laszlo Ersek wrote:
 Hello Michael,
 
 this is with reference to
 https://bugzilla.redhat.com/show_bug.cgi?id=907733.
 
 Ever since the initial qemu-ga commit AFAICS an exception for
 virtio-serial has existed, when reading EOF from the channel.
 
 For isa-serial, EOF results in the client connection being closed. I
 assume this exits the glib main loop somehow (otherwise qemu-ga would
 just sit there and do nothing, as no further connections are accepted I
 think).

I think it would actually do the latter, unfortunately. It's distinct
from virtio-serial handling in that we remove the GSource by return false
in the event handler (qga/main.c:channel_event_cb), but I think we'd
need to drop in a g_main_loop_quit() to actually get qemu-ga to exit in
that scenario.

This doesn't normally get triggered though, as isa-serial does not send
EOF when nothing is connected to the chardev backend, but instead just
blocks. Might till make sense to make qemu-ga exit in this case though
since it won't be doing anything useful and wrapper scripts would at
least have some indication that something is up.

 
 For a unix domain socket, we can continue accepting new connections
 after reading EOF.
 
 For virtio-serial, EOF means no host-side process is connected. In
 this case we sleep a bit and go back to reading from the same fd (and
 this turns into a sleep loop until the next host-side process connects).
 
 
 Can we tell virtio-serial port unplug from no host-side process?
 Because in the former case qemu-ga should really close the channel (and
 maybe exit (*)), otherwise the unplug won't complete in the guest kernel.
 
 According to Amit's comments in the RHBZ, at unplug a SIGIO is
 delivered, and a POLLHUP event is reported. However,
 
 (a) I think the glib iochannel abstraction doesn't allow us to tell
 POLLHUP apart from reading EOF;

AFAICT we can actually access the POLLHUP event via the 'condition' param
that gets passed to the cb, but the issue is we also get POLLUP when
the chardev backend isn't open.

 
 (b) delivery of an unhandled SIGIO seems to terminate the victim
 process. qemu-ga doesn't seem to either catch or block SIGIO, which is
 why I think a SIGIO signal is not sent in reality (maybe we should ask
 for it first?)
 
 ... Actually I'm confused about this as well. The virtio-serial port
 *is* opened with O_ASYNC (and on Solaris, it is replaced with an
 I_SETSIG ioctl()). What's the reason for this? g_io_channel_unix_new()
 doesn't seem to list it as a requirement, and qemu-ga doesn't seem to
 handle SIGIO.

At some point I played around with trying to use SIGIO to handle channel
resets and whatnot (since we're also supposed to get one when someone
opens the chardev backend and causes VIRTIO_CONSOLE_PORT_OPEN to get
sent). I don't think I ever got it working, SIGIO doesn't seem to get
sent, so that O_ASYNC might just be a relic from that.

I tried installing a handler retested host-connect as well as hot
unplug and still don't seem to be getting the signal. Not sure if i'm
doing something wrong or if there's an issue with the guest driver.

I did notice something interesting though:

1371740628.596505: debug: cb: condition: 17, status: 2
1371740628.596541: debug: received EOF
1371740628.395726: debug: cb: condition: 17, status: 2
1371740628.395760: debug: received EOF
1371740628.496035: debug: cb: condition: 17, status: 2
1371740628.496072: debug: received EOF
1371740628.596505: debug: cb: condition: 17, status: 2
1371740628.596541: debug: received EOF

host opened chardev backend

1371740634.195524: debug: cb: condition: 1, status: 1
1371740634.195556: debug: read data, count: 25, data:
{'execute':'guest-ping'}

1371740634.195634: debug: process_event: called
1371740634.195660: debug: processing command
1371740634.196007: debug: sending data, count: 15

virtio-serial unplugged

1371740644.113346: debug: cb: condition: 16, status: 2
1371740644.113379: debug: received EOF
1371740644.213694: debug: cb: condition: 16, status: 2
1371740644.213725: debug: received EOF
1371740644.314041: debug: cb: condition: 16, status: 2
1371740644.314168: debug: received EOF

i.e. we got the POLLHUP if we read from an
unconnected-but-present port, and we *don't* get the POLLHUP
if the port has been unplugged.

And in none of these cases do the SIGIO seem to be sent.

Here's the debug stuff i added:

diff --git a/qga/main.c b/qga/main.c
index 0e04e73..7f9a628 100644
--- a/qga/main.c
+++ b/qga/main.c
@@ -140,6 +140,11 @@ static void quit_handler(int sig)
 }
 }
 
+static void sigio_handler(int sig)
+{
+g_debug(got sigio: %d, sig);
+}
+
 #ifndef _WIN32
 static gboolean register_signal_handlers(void)
 {
@@ -158,6 +163,13 @@ static gboolean register_signal_handlers(void)
 g_error(error configuring signal handler: %s, strerror(errno));
 }
 
+memset(sigact, 0, sizeof(struct sigaction));
+sigact.sa_handler = sigio_handler;
+ret = sigaction(SIGIO, sigact, 

Re: [Qemu-devel] qemu-ga behavior on virtio-serial unplug

2013-06-20 Thread mdroth
On Thu, Jun 20, 2013 at 10:12:30AM -0500, mdroth wrote:
 On Wed, Jun 19, 2013 at 01:17:57PM +0200, Laszlo Ersek wrote:
  Hello Michael,
  
  this is with reference to
  https://bugzilla.redhat.com/show_bug.cgi?id=907733.
  
  Ever since the initial qemu-ga commit AFAICS an exception for
  virtio-serial has existed, when reading EOF from the channel.
  
  For isa-serial, EOF results in the client connection being closed. I
  assume this exits the glib main loop somehow (otherwise qemu-ga would
  just sit there and do nothing, as no further connections are accepted I
  think).
 
 I think it would actually do the latter, unfortunately. It's distinct
 from virtio-serial handling in that we remove the GSource by return false
 in the event handler (qga/main.c:channel_event_cb), but I think we'd
 need to drop in a g_main_loop_quit() to actually get qemu-ga to exit in
 that scenario.
 
 This doesn't normally get triggered though, as isa-serial does not send
 EOF when nothing is connected to the chardev backend, but instead just
 blocks. Might till make sense to make qemu-ga exit in this case though
 since it won't be doing anything useful and wrapper scripts would at
 least have some indication that something is up.
 
  
  For a unix domain socket, we can continue accepting new connections
  after reading EOF.
  
  For virtio-serial, EOF means no host-side process is connected. In
  this case we sleep a bit and go back to reading from the same fd (and
  this turns into a sleep loop until the next host-side process connects).
  
  
  Can we tell virtio-serial port unplug from no host-side process?
  Because in the former case qemu-ga should really close the channel (and
  maybe exit (*)), otherwise the unplug won't complete in the guest kernel.
  
  According to Amit's comments in the RHBZ, at unplug a SIGIO is
  delivered, and a POLLHUP event is reported. However,
  
  (a) I think the glib iochannel abstraction doesn't allow us to tell
  POLLHUP apart from reading EOF;
 
 AFAICT we can actually access the POLLHUP event via the 'condition' param
 that gets passed to the cb, but the issue is we also get POLLUP when
 the chardev backend isn't open.
 
  
  (b) delivery of an unhandled SIGIO seems to terminate the victim
  process. qemu-ga doesn't seem to either catch or block SIGIO, which is
  why I think a SIGIO signal is not sent in reality (maybe we should ask
  for it first?)
  
  ... Actually I'm confused about this as well. The virtio-serial port
  *is* opened with O_ASYNC (and on Solaris, it is replaced with an
  I_SETSIG ioctl()). What's the reason for this? g_io_channel_unix_new()
  doesn't seem to list it as a requirement, and qemu-ga doesn't seem to
  handle SIGIO.
 
 At some point I played around with trying to use SIGIO to handle channel
 resets and whatnot (since we're also supposed to get one when someone
 opens the chardev backend and causes VIRTIO_CONSOLE_PORT_OPEN to get
 sent). I don't think I ever got it working, SIGIO doesn't seem to get
 sent, so that O_ASYNC might just be a relic from that.
 
 I tried installing a handler retested host-connect as well as hot
 unplug and still don't seem to be getting the signal. Not sure if i'm
 doing something wrong or if there's an issue with the guest driver.
 
 I did notice something interesting though:
 
 1371740628.596505: debug: cb: condition: 17, status: 2
 1371740628.596541: debug: received EOF
 1371740628.395726: debug: cb: condition: 17, status: 2
 1371740628.395760: debug: received EOF
 1371740628.496035: debug: cb: condition: 17, status: 2
 1371740628.496072: debug: received EOF
 1371740628.596505: debug: cb: condition: 17, status: 2
 1371740628.596541: debug: received EOF
 
 host opened chardev backend
 
 1371740634.195524: debug: cb: condition: 1, status: 1
 1371740634.195556: debug: read data, count: 25, data:
 {'execute':'guest-ping'}
 
 1371740634.195634: debug: process_event: called
 1371740634.195660: debug: processing command
 1371740634.196007: debug: sending data, count: 15
 
 virtio-serial unplugged
 
 1371740644.113346: debug: cb: condition: 16, status: 2
 1371740644.113379: debug: received EOF
 1371740644.213694: debug: cb: condition: 16, status: 2
 1371740644.213725: debug: received EOF
 1371740644.314041: debug: cb: condition: 16, status: 2
 1371740644.314168: debug: received EOF
 
 i.e. we got the POLLHUP if we read from an
 unconnected-but-present port, and we *don't* get the POLLHUP
 if the port has been unplugged.

Er...silly me, POLLHUP=16, POLLIN=1 for glib, so I mixed them
up. For unplugged case we get POLLHUP, for unconnected case we get
POLLIN | POLLHUP, so that might actually be enough to distinguish
unplug if this is the intended behavior.

Amit, can you confirm?

 
 And in none of these cases do the SIGIO seem to be sent.
 
 Here's the debug stuff i added:
 
 diff --git a/qga/main.c b/qga/main.c
 index 0e04e73..7f9a628 100644
 --- a/qga/main.c
 +++ b/qga/main.c
 @@ -140,6 +140,11 @@ static void quit_handler(int sig)
  

Re: [Qemu-devel] qemu-ga behavior on virtio-serial unplug

2013-06-20 Thread Laszlo Ersek
On 06/20/13 15:31, Amit Shah wrote:
 On (Wed) 19 Jun 2013 [13:17:57], Laszlo Ersek wrote:

 In any case we'd need a way to tell host side close from port unplug.
 
 Will POLLHUP|POLLERR help, along with error returns on read() and
 write()?

I think so:
- read() == 0  -- host side disconnected,
- read() == -1, equivalently POLLERR -- unplug (or other error)
- write() == -1 / errno == EPIPE: host side disconnected,
- write() == -1 / errno == EIO (or ENXIO or EINVAL): unplug

I think the current code could be adapted to such a scheme gracefully.
(Regarding the error codes on write(), I just made them up, but you get
the idea.)

On hot-unplug we could bail out to a more external loop that tries to
reopen the same device, or just exit and leave the restart to udev/systemd.


If possible I would like to avoid SIGIO. SIGIO is basically a non-queued
(= can be pending or not pending; any realtime queueing variant is
limited in depth hence useless) and edge triggered readiness
notification. It isn't portable and requires extra hoops to jump through
just to get the file descriptor and to tell a read event from a write event.

Although it's possible to base level triggered readiness on top of it
(by setting read  write readiness booleans on SIGIO and clearing them
on the respective -1/EAGAIN, while blocking SIGIO carefully), I think
that's quite distant from our current event loop. Hence we should remove
O_ASYNC (and the Solaris equivalent too) with the same fell swoop.

Thanks
Laszlo



[Qemu-devel] qemu-ga behavior on virtio-serial unplug

2013-06-19 Thread Laszlo Ersek
Hello Michael,

this is with reference to
https://bugzilla.redhat.com/show_bug.cgi?id=907733.

Ever since the initial qemu-ga commit AFAICS an exception for
virtio-serial has existed, when reading EOF from the channel.

For isa-serial, EOF results in the client connection being closed. I
assume this exits the glib main loop somehow (otherwise qemu-ga would
just sit there and do nothing, as no further connections are accepted I
think).

For a unix domain socket, we can continue accepting new connections
after reading EOF.

For virtio-serial, EOF means no host-side process is connected. In
this case we sleep a bit and go back to reading from the same fd (and
this turns into a sleep loop until the next host-side process connects).


Can we tell virtio-serial port unplug from no host-side process?
Because in the former case qemu-ga should really close the channel (and
maybe exit (*)), otherwise the unplug won't complete in the guest kernel.

According to Amit's comments in the RHBZ, at unplug a SIGIO is
delivered, and a POLLHUP event is reported. However,

(a) I think the glib iochannel abstraction doesn't allow us to tell
POLLHUP apart from reading EOF;

(b) delivery of an unhandled SIGIO seems to terminate the victim
process. qemu-ga doesn't seem to either catch or block SIGIO, which is
why I think a SIGIO signal is not sent in reality (maybe we should ask
for it first?)

... Actually I'm confused about this as well. The virtio-serial port
*is* opened with O_ASYNC (and on Solaris, it is replaced with an
I_SETSIG ioctl()). What's the reason for this? g_io_channel_unix_new()
doesn't seem to list it as a requirement, and qemu-ga doesn't seem to
handle SIGIO.

In any case we'd need a way to tell host side close from port unplug.

(*) Then, there's the question what to do after unplug. Should we keep
reopening the same virtio-serial port (and sleeping a bit in-between)?
Or exit and let udev / systemd restart qemu-ga on any new virtio-serial
port, passing -p accordingly?

Thanks
Laszlo