On Thu, Apr 02, 2015 at 06:21:01PM +0200, Michal Privoznik wrote: > https://bugzilla.redhat.com/show_bug.cgi?id=1200149 > > Even though we have a mutex mechanism so that two clients don't spawn > two daemons, it's not strong enough. It can happen that while one > client is spawning the daemon, the other one fails to connect. > Basically two possible errors can happen: > > error: Failed to connect socket to > '/home/mprivozn/.cache/libvirt/libvirt-sock': Connection refused > > or: > > error: Failed to connect socket to > '/home/mprivozn/.cache/libvirt/libvirt-sock': No such file or directory > > The problem in both cases is, the daemon is only starting up, while we > are trying to connect (and fail). We should postpone the connecting > phase until the daemon is started (by the other thread that is > spawning it). In order to do that, create a file lock 'libvirt-lock' > in the directory where session daemon would create its socket. So even > when called from multiple processes, spawning a daemon will serialize > on the file lock. So only the first to come will spawn the daemon. > > Signed-off-by: Michal Privoznik <[email protected]> > --- > > diff to v1: > -hopefully this one is race free > > src/rpc/virnetsocket.c | 148 > ++++++++++++------------------------------------- > 1 file changed, 36 insertions(+), 112 deletions(-)
ACK, with a few changes to the lock file path
>
> diff --git a/src/rpc/virnetsocket.c b/src/rpc/virnetsocket.c
> index 6b019cc..2b891ae 100644
> --- a/src/rpc/virnetsocket.c
> +++ b/src/rpc/virnetsocket.c
> @@ -543,10 +539,10 @@ int virNetSocketNewConnectUNIX(const char *path,
> const char *binary,
> virNetSocketPtr *retsock)
> {
> - char *binname = NULL;
> - char *pidpath = NULL;
> - int fd, passfd = -1;
> - int pidfd = -1;
> + char *lockpath = NULL;
> + int lockfd = -1;
> + int fd = -1;
> + int retries = 100;
> virSocketAddr localAddr;
> virSocketAddr remoteAddr;
>
> @@ -561,6 +557,22 @@ int virNetSocketNewConnectUNIX(const char *path,
> return -1;
> }
>
> + if (spawnDaemon) {
> + if (virPidFileConstructPath(false, NULL, "libvirt-lock", &lockpath)
> < 0)
> + goto error;
Here the lock gets named: .cache/libvirt/libvirt-lock.pid, even though it does
not
contain a pid and we could be using this function to spawn virlockd too, not
just libvirtd.
Can you move the last_component call here and name it "%s.lock",
binname?
Jan
signature.asc
Description: Digital signature
-- libvir-list mailing list [email protected] https://www.redhat.com/mailman/listinfo/libvir-list
