Re: [PATCH 2 of 3] chg: start server at a unique address
Excerpts from timeless's message of 2016-12-18 23:52:05 -0500: > Jun Wu wrote: > > Pid namespace breaks mercurial lock, which is a symlink with the content > > "host:pid". > > Would adjusting the format to: > host:pid:starttime:boottime fix this? No. A pid number in a pid namespace is highly likely to be invalid in another pid ns. So hg just thinks the process who held the lock is dead, and ignores the lock. > > Sadly, it seems like process start time is actually stored as jiffies > since system boot on Linux, and NTP can make canonicalizing that value > /somewhat/ flaky [1]. Hopefully, there'd be some way to make this work > > [1] > http://linuxcommando.blogspot.ca/2008/09/how-to-get-process-start-date-and-time.html > ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: [PATCH 2 of 3] chg: start server at a unique address
Jun Wu wrote: > Pid namespace breaks mercurial lock, which is a symlink with the content > "host:pid". Would adjusting the format to: host:pid:starttime:boottime fix this? Sadly, it seems like process start time is actually stored as jiffies since system boot on Linux, and NTP can make canonicalizing that value /somewhat/ flaky [1]. Hopefully, there'd be some way to make this work [1] http://linuxcommando.blogspot.ca/2008/09/how-to-get-process-start-date-and-time.html ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: [PATCH 2 of 3] chg: start server at a unique address
Excerpts from Gregory Szorc's message of 2016-12-17 16:40:12 -0800: > It's probably not worth worrying about just yet, but with Linux PID > namespaces, multiple processes may think they have the same PID, even if that > PID maps to different values inside the kernel. > > Mozilla has encountered problems with multiple hg processes running in > separate containers and PID namespaces acquiring the same lock from a shared > filesystem simultaneously because multiple hg processes share the same PID > and hostname in different "containers." Pid namespace breaks mercurial lock, which is a symlink with the content "host:pid". Note that chg lock is not related to correctness. Without lock, chg also behaves correctly, with the downside: 1. More redirections. 2. Spawn servers with a same config hash in parallel unnecessarily. This series resolves 1. But that's just "nice-to-have". Even if we just drop locks today, chg won't have correctness issues. > > > On Dec 16, 2016, at 17:49, Jun Wuwrote: > > > > # HG changeset patch > > # User Jun Wu > > # Date 1481938472 0 > > # Sat Dec 17 01:34:32 2016 + > > # Node ID 6c9ce8399350d8287599cd802b91adf73db08759 > > # Parent 69d25b06467d65bf6d1e85d34d8fc57ec321b51d > > # Available At https://bitbucket.org/quark-zju/hg-draft > > # hg pull https://bitbucket.org/quark-zju/hg-draft -r > > 6c9ce8399350 > > chg: start server at a unique address > > > > See the previous patch for motivation. Previously, the server is started at > > a globally shared address, which requires a lock. This patch appends pid to > > the address so it becomes unique. > > > > diff --git a/contrib/chg/chg.c b/contrib/chg/chg.c > > --- a/contrib/chg/chg.c > > +++ b/contrib/chg/chg.c > > @@ -32,4 +32,5 @@ > > struct cmdserveropts { > >char sockname[UNIX_PATH_MAX]; > > +char initsockname[UNIX_PATH_MAX]; > >char redirectsockname[UNIX_PATH_MAX]; > >char lockfile[UNIX_PATH_MAX]; > > @@ -164,4 +165,8 @@ static void setcmdserveropts(struct cmds > >if (r < 0 || (size_t)r >= sizeof(opts->lockfile)) > >abortmsg("too long TMPDIR or CHGSOCKNAME (r = %d)", r); > > +r = snprintf(opts->initsockname, sizeof(opts->initsockname), > > +"%s.%u", opts->sockname, (unsigned)getpid()); > > +if (r < 0 || (size_t)r >= sizeof(opts->initsockname)) > > +abortmsg("too long TMPDIR or CHGSOCKNAME (r = %d)", r); > > } > > > > @@ -224,5 +229,5 @@ static void execcmdserver(const struct c > >"serve", > >"--cmdserver", "chgunix", > > -"--address", opts->sockname, > > +"--address", opts->initsockname, > >"--daemon-postexec", "chdir:/", > >}; > > @@ -248,5 +253,5 @@ static hgclient_t *retryconnectcmdserver > >int pst = 0; > > > > -debugmsg("try connect to %s repeatedly", opts->sockname); > > +debugmsg("try connect to %s repeatedly", opts->initsockname); > > > >unsigned int timeoutsec = 60; /* default: 60 seconds */ > > @@ -256,7 +261,13 @@ static hgclient_t *retryconnectcmdserver > > > >for (unsigned int i = 0; !timeoutsec || i < timeoutsec * 100; i++) { > > -hgclient_t *hgc = hgc_open(opts->sockname); > > -if (hgc) > > +hgclient_t *hgc = hgc_open(opts->initsockname); > > +if (hgc) { > > +debugmsg("rename %s to %s", opts->initsockname, > > +opts->sockname); > > +int r = rename(opts->initsockname, opts->sockname); > > +if (r != 0) > > +abortmsgerrno("cannot rename"); > >return hgc; > > +} > > > >if (pid > 0) { > > @@ -270,5 +281,5 @@ static hgclient_t *retryconnectcmdserver > >} > > > > -abortmsg("timed out waiting for cmdserver %s", opts->sockname); > > +abortmsg("timed out waiting for cmdserver %s", opts->initsockname); > >return NULL; > > > > @@ -313,5 +324,5 @@ static hgclient_t *connectcmdserver(stru > >unlink(opts->sockname); > > > > -debugmsg("start cmdserver at %s", opts->sockname); > > +debugmsg("start cmdserver at %s", opts->initsockname); > > > >pid_t pid = fork(); > > https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: [PATCH 2 of 3] chg: start server at a unique address
On Sat, 17 Dec 2016 16:40:12 -0800, Gregory Szorc wrote: > It's probably not worth worrying about just yet, but with Linux PID > namespaces, multiple processes may think they have the same PID, even if that > PID maps to different values inside the kernel. > > Mozilla has encountered problems with multiple hg processes running in > separate containers and PID namespaces acquiring the same lock from a shared > filesystem simultaneously because multiple hg processes share the same PID > and hostname in different "containers." We wouldn't need to care much about that for chg, since there would be little benefit to mount /tmp as a shared filesystem (and we have a somewhat similar issue in the current code which uses flock.) The overall changes look good to me. ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: [PATCH 2 of 3] chg: start server at a unique address
It's probably not worth worrying about just yet, but with Linux PID namespaces, multiple processes may think they have the same PID, even if that PID maps to different values inside the kernel. Mozilla has encountered problems with multiple hg processes running in separate containers and PID namespaces acquiring the same lock from a shared filesystem simultaneously because multiple hg processes share the same PID and hostname in different "containers." > On Dec 16, 2016, at 17:49, Jun Wuwrote: > > # HG changeset patch > # User Jun Wu > # Date 1481938472 0 > # Sat Dec 17 01:34:32 2016 + > # Node ID 6c9ce8399350d8287599cd802b91adf73db08759 > # Parent 69d25b06467d65bf6d1e85d34d8fc57ec321b51d > # Available At https://bitbucket.org/quark-zju/hg-draft > # hg pull https://bitbucket.org/quark-zju/hg-draft -r > 6c9ce8399350 > chg: start server at a unique address > > See the previous patch for motivation. Previously, the server is started at > a globally shared address, which requires a lock. This patch appends pid to > the address so it becomes unique. > > diff --git a/contrib/chg/chg.c b/contrib/chg/chg.c > --- a/contrib/chg/chg.c > +++ b/contrib/chg/chg.c > @@ -32,4 +32,5 @@ > struct cmdserveropts { >char sockname[UNIX_PATH_MAX]; > +char initsockname[UNIX_PATH_MAX]; >char redirectsockname[UNIX_PATH_MAX]; >char lockfile[UNIX_PATH_MAX]; > @@ -164,4 +165,8 @@ static void setcmdserveropts(struct cmds >if (r < 0 || (size_t)r >= sizeof(opts->lockfile)) >abortmsg("too long TMPDIR or CHGSOCKNAME (r = %d)", r); > +r = snprintf(opts->initsockname, sizeof(opts->initsockname), > +"%s.%u", opts->sockname, (unsigned)getpid()); > +if (r < 0 || (size_t)r >= sizeof(opts->initsockname)) > +abortmsg("too long TMPDIR or CHGSOCKNAME (r = %d)", r); > } > > @@ -224,5 +229,5 @@ static void execcmdserver(const struct c >"serve", >"--cmdserver", "chgunix", > -"--address", opts->sockname, > +"--address", opts->initsockname, >"--daemon-postexec", "chdir:/", >}; > @@ -248,5 +253,5 @@ static hgclient_t *retryconnectcmdserver >int pst = 0; > > -debugmsg("try connect to %s repeatedly", opts->sockname); > +debugmsg("try connect to %s repeatedly", opts->initsockname); > >unsigned int timeoutsec = 60; /* default: 60 seconds */ > @@ -256,7 +261,13 @@ static hgclient_t *retryconnectcmdserver > >for (unsigned int i = 0; !timeoutsec || i < timeoutsec * 100; i++) { > -hgclient_t *hgc = hgc_open(opts->sockname); > -if (hgc) > +hgclient_t *hgc = hgc_open(opts->initsockname); > +if (hgc) { > +debugmsg("rename %s to %s", opts->initsockname, > +opts->sockname); > +int r = rename(opts->initsockname, opts->sockname); > +if (r != 0) > +abortmsgerrno("cannot rename"); >return hgc; > +} > >if (pid > 0) { > @@ -270,5 +281,5 @@ static hgclient_t *retryconnectcmdserver >} > > -abortmsg("timed out waiting for cmdserver %s", opts->sockname); > +abortmsg("timed out waiting for cmdserver %s", opts->initsockname); >return NULL; > > @@ -313,5 +324,5 @@ static hgclient_t *connectcmdserver(stru >unlink(opts->sockname); > > -debugmsg("start cmdserver at %s", opts->sockname); > +debugmsg("start cmdserver at %s", opts->initsockname); > >pid_t pid = fork(); > ___ > Mercurial-devel mailing list > Mercurial-devel@mercurial-scm.org > https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
[PATCH 2 of 3] chg: start server at a unique address
# HG changeset patch # User Jun Wu# Date 1481938472 0 # Sat Dec 17 01:34:32 2016 + # Node ID 6c9ce8399350d8287599cd802b91adf73db08759 # Parent 69d25b06467d65bf6d1e85d34d8fc57ec321b51d # Available At https://bitbucket.org/quark-zju/hg-draft # hg pull https://bitbucket.org/quark-zju/hg-draft -r 6c9ce8399350 chg: start server at a unique address See the previous patch for motivation. Previously, the server is started at a globally shared address, which requires a lock. This patch appends pid to the address so it becomes unique. diff --git a/contrib/chg/chg.c b/contrib/chg/chg.c --- a/contrib/chg/chg.c +++ b/contrib/chg/chg.c @@ -32,4 +32,5 @@ struct cmdserveropts { char sockname[UNIX_PATH_MAX]; + char initsockname[UNIX_PATH_MAX]; char redirectsockname[UNIX_PATH_MAX]; char lockfile[UNIX_PATH_MAX]; @@ -164,4 +165,8 @@ static void setcmdserveropts(struct cmds if (r < 0 || (size_t)r >= sizeof(opts->lockfile)) abortmsg("too long TMPDIR or CHGSOCKNAME (r = %d)", r); + r = snprintf(opts->initsockname, sizeof(opts->initsockname), + "%s.%u", opts->sockname, (unsigned)getpid()); + if (r < 0 || (size_t)r >= sizeof(opts->initsockname)) + abortmsg("too long TMPDIR or CHGSOCKNAME (r = %d)", r); } @@ -224,5 +229,5 @@ static void execcmdserver(const struct c "serve", "--cmdserver", "chgunix", - "--address", opts->sockname, + "--address", opts->initsockname, "--daemon-postexec", "chdir:/", }; @@ -248,5 +253,5 @@ static hgclient_t *retryconnectcmdserver int pst = 0; - debugmsg("try connect to %s repeatedly", opts->sockname); + debugmsg("try connect to %s repeatedly", opts->initsockname); unsigned int timeoutsec = 60; /* default: 60 seconds */ @@ -256,7 +261,13 @@ static hgclient_t *retryconnectcmdserver for (unsigned int i = 0; !timeoutsec || i < timeoutsec * 100; i++) { - hgclient_t *hgc = hgc_open(opts->sockname); - if (hgc) + hgclient_t *hgc = hgc_open(opts->initsockname); + if (hgc) { + debugmsg("rename %s to %s", opts->initsockname, + opts->sockname); + int r = rename(opts->initsockname, opts->sockname); + if (r != 0) + abortmsgerrno("cannot rename"); return hgc; + } if (pid > 0) { @@ -270,5 +281,5 @@ static hgclient_t *retryconnectcmdserver } - abortmsg("timed out waiting for cmdserver %s", opts->sockname); + abortmsg("timed out waiting for cmdserver %s", opts->initsockname); return NULL; @@ -313,5 +324,5 @@ static hgclient_t *connectcmdserver(stru unlink(opts->sockname); - debugmsg("start cmdserver at %s", opts->sockname); + debugmsg("start cmdserver at %s", opts->initsockname); pid_t pid = fork(); ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel