On Thu, 24.04.14 23:51, Zbigniew Jędrzejewski-Szmek ([email protected]) wrote:
> On Wed, Apr 23, 2014 at 08:50:34PM +0200, Lennart Poettering wrote:
> > On Wed, 23.04.14 15:15, Eelco Dolstra ([email protected]) wrote:
> >
> > > Hi all,
> > >
> > > I've noticed that the command "systemd-notify --ready" does not work
> > > reliably to signal that a service is ready. It works sometimes, but
> > > most of the time you get a message like:
> > >
> > >   systemd[1]: Cannot find unit for notify message of PID 3137.
> > >
> > > in the journal, and the service stays in the "activating" state.
> > >
> > > The reason is that systemd-notify sends its message asynchronously and
> > > exits immediately. So by the time systemd processes the message,
> > > systemd-notify has probably already exited, and so systemd cannot get
> > > its cgroup. (Note that this affects other systemd-notify messages as
> > > well, but for --ready it's particularly bad because it causes services
> > > to "hang" in the "activating" state.)
> > >
> > > Any suggestions what to do about this? I can see a few solutions:
> >
> > There is ongoing work to fix the kernel to add SCM_CGROUPS to messages
> > for us. With that in place we have a race-free way to get this data
> > for incoming messages. I have some hopes that this will soonishly enter
> > the kernel, but then again, this story has been cooking for the past 5
> > years to no success...
>
> What about simply waiting in the background for 10s? An ugly workaround,
> but should fix the issue until we have something better.

What about this idea instead: instead of sending the datagram with our
own PID in the ucred field, we could simply try to override it with our
parent's PID. This will not fix 100% of the cases, but I am quite sure it
will fix most: the parent process is usually the one that stays around,
and if you want to send READY=1 then you are likely to stay around for
longer, so the parent PID should be good enough.
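
For illustration, the override could look roughly like the sketch below.
It is only a sketch under assumptions, not the actual systemd-notify code:
the helper names (build_creds, notify_as, notify_as_parent) are made up,
Python is used for brevity, and the socket is assumed to be an AF_UNIX
datagram socket already connected to the notification socket. Note that
the kernel accepts a PID other than the sender's own in SCM_CREDENTIALS
only from a sender with CAP_SYS_ADMIN, so the sketch falls back to its own
PID when the override is refused.

```python
import os
import socket
import struct

# struct ucred on Linux: pid_t pid; uid_t uid; gid_t gid
UCRED_FMT = "iII"


def build_creds(pid):
    """Build sendmsg() ancillary data carrying a struct ucred for `pid`."""
    ucred = struct.pack(UCRED_FMT, pid, os.getuid(), os.getgid())
    return [(socket.SOL_SOCKET, socket.SCM_CREDENTIALS, ucred)]


def notify_as(sock, state, pid):
    """Send `state` (e.g. "READY=1") with `pid` in the credentials."""
    sock.sendmsg([state.encode()], build_creds(pid))


def notify_as_parent(sock, state):
    """Try the parent's PID first; fall back to our own PID (hypothetical helper)."""
    try:
        notify_as(sock, state, os.getppid())
    except (PermissionError, ProcessLookupError):
        # No CAP_SYS_ADMIN, or the parent PID is not visible to us:
        # send with our own PID, which the kernel always permits.
        notify_as(sock, state, os.getpid())
```

The receiver only sees the credentials if it has enabled SO_PASSCRED on
its socket. With the fallback, an unprivileged caller behaves exactly as
before, so the override only helps where the sender is privileged enough
for the kernel to accept the parent's PID.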

I am pretty sure we should make this change, regardless of whether it
fixes all or only part of the cases, simply because I think it is the
right thing to do. After all, we also send MAINPID= with the parent PID,
instead of our own, for a reason...

Does this make sense?

Lennart

-- 
Lennart Poettering, Red Hat
_______________________________________________
systemd-devel mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/systemd-devel
