Re: [libvirt] [PATCH v3 2/4] qemu: support passing pre-opened UNIX socket listen FD

2018-06-05 Thread Daniel P . Berrangé
On Thu, May 24, 2018 at 04:50:36PM -0500, Eric Blake wrote:
> On 05/17/2018 08:40 AM, Daniel P. Berrangé wrote:
> > There is a race condition when spawning QEMU where libvirt has spawned
> > QEMU but the monitor socket is not yet open. Libvirt has to repeatedly
> > try to connect() to QEMU's monitor until eventually it succeeds, or
> > times out. We use kill() to check if QEMU is still alive so we avoid
> > waiting a long time if QEMU exited, but having a timeout at all is still
> > unpleasant.
> > 
> > With QEMU 2.12 we can pass in a pre-opened FD for UNIX domain or TCP
> > sockets. If libvirt has called bind() and listen() on this FD, then we
> > have a guarantee that libvirt can immediately call connect() and
> > succeed without any race.
> > 
> > Although we only really care about this for the monitor socket and agent
> > socket, this patch does FD passing for all UNIX socket based character
> > devices since there appears to be no downside to it.
> > 
> > We don't do FD passing for TCP sockets, however, because it is only
> > possible to pass a single FD, while some hostnames may require listening
> > on multiple FDs to cover IPv4 and IPv6 concurrently.
> > 
> 
> > +++ b/tests/qemuxml2argvmock.c
> 
> > +int
> > +qemuOpenChrChardevUNIXSocket(const virDomainChrSourceDef *dev 
> > ATTRIBUTE_UNUSED)
> > +
> > +{
> > +/* We need to return an FD number for a UNIX listener socket,
> > + * which will be given to QEMU via a CLI arg. We need a fixed
> > + * number to get stable tests. This is obviously not a real
> > + * FD number, so when virCommand closes the FD in the parent
> > + * it will get EINVAL, but that's (hopefully) not going to
> > + * be a problem
> > + */
> > +return 1729;
> 
> Is it worth asserting that 1729 is not an open fd (perhaps by fcntl(1729,
> F_GETFD) == -1) because of something strange in the test environment?

Sure I'll add

diff --git a/tests/qemuxml2argvmock.c b/tests/qemuxml2argvmock.c
index 36d25dfc3f..56a6d4892b 100644
--- a/tests/qemuxml2argvmock.c
+++ b/tests/qemuxml2argvmock.c
@@ -228,5 +228,7 @@ qemuOpenChrChardevUNIXSocket(const virDomainChrSourceDef 
*dev ATTRIBUTE_UNUSED)
  * it will get EINVAL, but that's (hopefully) not going to
  * be a problem
  */
+if (fcntl(1729, F_GETFD) != -1)
+abort();
 return 1729;
 }

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [PATCH v3 2/4] qemu: support passing pre-opened UNIX socket listen FD

2018-05-24 Thread Eric Blake

On 05/17/2018 08:40 AM, Daniel P. Berrangé wrote:

There is a race condition when spawning QEMU where libvirt has spawned
QEMU but the monitor socket is not yet open. Libvirt has to repeatedly
try to connect() to QEMU's monitor until eventually it succeeds, or
times out. We use kill() to check if QEMU is still alive so we avoid
waiting a long time if QEMU exited, but having a timeout at all is still
unpleasant.

With QEMU 2.12 we can pass in a pre-opened FD for UNIX domain or TCP
sockets. If libvirt has called bind() and listen() on this FD, then we
have a guarantee that libvirt can immediately call connect() and
succeed without any race.

Although we only really care about this for the monitor socket and agent
socket, this patch does FD passing for all UNIX socket based character
devices since there appears to be no downside to it.

We don't do FD passing for TCP sockets, however, because it is only
possible to pass a single FD, while some hostnames may require listening
on multiple FDs to cover IPv4 and IPv6 concurrently.




+++ b/tests/qemuxml2argvmock.c



+int
+qemuOpenChrChardevUNIXSocket(const virDomainChrSourceDef *dev ATTRIBUTE_UNUSED)
+
+{
+/* We need to return an FD number for a UNIX listener socket,
+ * which will be given to QEMU via a CLI arg. We need a fixed
+ * number to get stable tests. This is obviously not a real
+ * FD number, so when virCommand closes the FD in the parent
+ * it will get EINVAL, but that's (hopefully) not going to
+ * be a problem
+ */
+return 1729;


Is it worth asserting that 1729 is not an open fd (perhaps by 
fcntl(1729, F_GETFD) == -1) because of something strange in the test 
environment?


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [PATCH v3 2/4] qemu: support passing pre-opened UNIX socket listen FD

2018-05-22 Thread John Ferlan


On 05/17/2018 09:40 AM, Daniel P. Berrangé wrote:
> There is a race condition when spawning QEMU where libvirt has spawned
> QEMU but the monitor socket is not yet open. Libvirt has to repeatedly
> try to connect() to QEMU's monitor until eventually it succeeds, or
> times out. We use kill() to check if QEMU is still alive so we avoid
> waiting a long time if QEMU exited, but having a timeout at all is still
> unpleasant.
> 
> With QEMU 2.12 we can pass in a pre-opened FD for UNIX domain or TCP
> sockets. If libvirt has called bind() and listen() on this FD, then we
> have a guarantee that libvirt can immediately call connect() and
> succeed without any race.
> 
> Although we only really care about this for the monitor socket and agent
> socket, this patch does FD passing for all UNIX socket based character
> devices since there appears to be no downside to it.
> 
> We don't do FD passing for TCP sockets, however, because it is only
> possible to pass a single FD, while some hostnames may require listening
> on multiple FDs to cover IPv4 and IPv6 concurrently.
> 
> Signed-off-by: Daniel P. Berrangé 
> ---
>  src/qemu/qemu_command.c   | 64 ++-
>  src/qemu/qemu_command.h   |  4 ++
>  .../disk-drive-write-cache.x86_64-latest.args |  3 +-
>  ...irtio-scsi-reservations.x86_64-latest.args |  3 +-
>  tests/qemuxml2argvmock.c  | 16 +
>  5 files changed, 84 insertions(+), 6 deletions(-)
> 

Using a mocked socket number seems to be a reasonable mechanism to
achieve the goal. There's certainly other tests that used mocked paths
or results to get a standard result/answer.

Reviewed-by: John Ferlan 

John


--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

[libvirt] [PATCH v3 2/4] qemu: support passing pre-opened UNIX socket listen FD

2018-05-17 Thread Daniel P . Berrangé
There is a race condition when spawning QEMU where libvirt has spawned
QEMU but the monitor socket is not yet open. Libvirt has to repeatedly
try to connect() to QEMU's monitor until eventually it succeeds, or
times out. We use kill() to check if QEMU is still alive so we avoid
waiting a long time if QEMU exited, but having a timeout at all is still
unpleasant.

With QEMU 2.12 we can pass in a pre-opened FD for UNIX domain or TCP
sockets. If libvirt has called bind() and listen() on this FD, then we
have a guarantee that libvirt can immediately call connect() and
succeed without any race.

Although we only really care about this for the monitor socket and agent
socket, this patch does FD passing for all UNIX socket based character
devices since there appears to be no downside to it.

We don't do FD passing for TCP sockets, however, because it is only
possible to pass a single FD, while some hostnames may require listening
on multiple FDs to cover IPv4 and IPv6 concurrently.

Signed-off-by: Daniel P. Berrangé 
---
 src/qemu/qemu_command.c   | 64 ++-
 src/qemu/qemu_command.h   |  4 ++
 .../disk-drive-write-cache.x86_64-latest.args |  3 +-
 ...irtio-scsi-reservations.x86_64-latest.args |  3 +-
 tests/qemuxml2argvmock.c  | 16 +
 5 files changed, 84 insertions(+), 6 deletions(-)

diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
index c4237339bf..6834480e1f 100644
--- a/src/qemu/qemu_command.c
+++ b/src/qemu/qemu_command.c
@@ -4913,6 +4913,56 @@ qemuBuildChrChardevReconnectStr(virBufferPtr buf,
 }
 
 
+int
+qemuOpenChrChardevUNIXSocket(const virDomainChrSourceDef *dev)
+{
+struct sockaddr_un addr;
+socklen_t addrlen = sizeof(addr);
+int fd;
+
+if ((fd = socket(AF_UNIX, SOCK_STREAM, 0)) < 0) {
+virReportSystemError(errno, "%s",
+ _("Unable to create UNIX socket"));
+goto error;
+}
+
+memset(, 0, sizeof(addr));
+addr.sun_family = AF_UNIX;
+if (virStrcpyStatic(addr.sun_path, dev->data.nix.path) == NULL) {
+virReportError(VIR_ERR_INTERNAL_ERROR,
+   _("UNIX socket path '%s' too long"),
+   dev->data.nix.path);
+goto error;
+}
+
+if (unlink(dev->data.nix.path) < 0 && errno != ENOENT) {
+virReportSystemError(errno,
+ _("Unable to unlink %s"),
+ dev->data.nix.path);
+goto error;
+}
+
+if (bind(fd, (struct sockaddr *), addrlen) < 0) {
+virReportSystemError(errno,
+ _("Unable to bind to UNIX socket path '%s'"),
+ dev->data.nix.path);
+goto error;
+}
+
+if (listen(fd, 1) < 0) {
+virReportSystemError(errno,
+ _("Unable to listen to UNIX socket path '%s'"),
+ dev->data.nix.path);
+goto error;
+}
+
+return fd;
+
+ error:
+VIR_FORCE_CLOSE(fd);
+return -1;
+}
+
 /* This function outputs a -chardev command line option which describes only 
the
  * host side of the character device */
 static char *
@@ -5042,8 +5092,18 @@ qemuBuildChrChardevStr(virLogManagerPtr logManager,
 break;
 
 case VIR_DOMAIN_CHR_TYPE_UNIX:
-virBufferAsprintf(, "socket,id=%s,path=", charAlias);
-virQEMUBuildBufferEscapeComma(, dev->data.nix.path);
+if (virQEMUCapsGet(qemuCaps, QEMU_CAPS_CHARDEV_FD_PASS)) {
+int fd = qemuOpenChrChardevUNIXSocket(dev);
+if (fd < 0)
+goto cleanup;
+
+virBufferAsprintf(, "socket,id=%s,fd=%d", charAlias, fd);
+
+virCommandPassFD(cmd, fd, VIR_COMMAND_PASS_FD_CLOSE_PARENT);
+} else {
+virBufferAsprintf(, "socket,id=%s,path=", charAlias);
+virQEMUBuildBufferEscapeComma(, dev->data.nix.path);
+}
 if (dev->data.nix.listen)
 virBufferAdd(, nowait ? ",server,nowait" : ",server", -1);
 
diff --git a/src/qemu/qemu_command.h b/src/qemu/qemu_command.h
index 28bc33558b..f722b1be72 100644
--- a/src/qemu/qemu_command.h
+++ b/src/qemu/qemu_command.h
@@ -70,6 +70,10 @@ int qemuBuildTLSx509BackendProps(const char *tlspath,
  virQEMUCapsPtr qemuCaps,
  virJSONValuePtr *propsret);
 
+/* Open a UNIX socket for chardev FD passing */
+int
+qemuOpenChrChardevUNIXSocket(const virDomainChrSourceDef *dev);
+
 /* Generate '-device' string for chardev device */
 int
 qemuBuildChrDeviceStr(char **deviceStr,
diff --git a/tests/qemuxml2argvdata/disk-drive-write-cache.x86_64-latest.args 
b/tests/qemuxml2argvdata/disk-drive-write-cache.x86_64-latest.args
index a63c5b7477..9e5b611351 100644
--- a/tests/qemuxml2argvdata/disk-drive-write-cache.x86_64-latest.args
+++ b/tests/qemuxml2argvdata/disk-drive-write-cache.x86_64-latest.args
@@ -17,8 +17,7