Re: initialization vs supervision
* Laurent Bercot ska-supervis...@skarnet.org [20140727 00:53]:
> On 26/07/2014 20:47, Joan Picanyol i Puig wrote:
> > What tricky responsibilities are you thinking of for /sbin/init that
> > would make it Linux specific?
>
> s6-svscan wants a read-write directory to run in, and another to run its
> logger in. I definitely want to support read-only root filesystems, and
> it's too early to mount disks, so a tmpfs has to be created - this is
> system-dependent. If you have a way to do that in a system-agnostic way,
> I'm *very* interested. :)

No magic wand here, I just see it as scripts all the way down...
Creating and mounting a tmpfs and bringing up the network look pretty
much the same to me: initialization tasks orchestrated by scripts
invoking userland binaries. At this point in booting, if something
fails, it's watchdog time...

Regarding a read-only root fs, isn't it just a matter of restarting
svscan's logger pointing to stable storage once it is up and running?

qvb
--
pica
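[A minimal sketch of the system-dependent step being discussed, assuming
Linux and made-up paths (/run, /run/service, a myapp-log subdirectory);
this is a hypothetical stage-1 fragment, not s6's documented boot
procedure.]

```shell
#!/bin/sh
# Hypothetical Linux-specific stage 1: before any disks are mounted,
# carve out a writable tmpfs so s6-svscan and its logger have somewhere
# read-write to run, even with a read-only root filesystem.
mount -t tmpfs -o mode=0755,nosuid,nodev tmpfs /run || exit 1

# Populate the scan directory with the early service directories,
# including a catch-all logger that can initially write to the tmpfs.
mkdir -p /run/service/myapp-log

# Hand control to the supervision tree; once disks are mounted, the
# logger can be restarted pointing at stable storage, as suggested above.
exec s6-svscan /run/service
```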
Re: Holidays Brainstorming: instanced supervision
* Laurent Bercot [20151226 12:33]:
> In the past few years, there have been some bits and pieces of
> discussion about "instanced services", i.e. some kind of supervised
> service that would be able to create different instances of the
> process at will. But it never got very detailed.
>
> I'd like to ask you: if you feel that instanced services are useful,
> what functionality would you like to see ? Concretely, practically,
> what do you want to be able to do ?
>
> Please stay away from mechanism and implementation details; I'm just
> trying to get a very high-level, conceptual feel for it at the moment -
> "what", as opposed to "how".

FWIW, the "what" I've needed when deploying "multiple instances of a
service" has always been different configuration for the same binaries.
The "how" has usually been via env/ symlinks (storing in env/ anything
from variables for envdir to application-specific configuration files).

qvb
--
pica
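[A sketch of the "same binaries, different configuration" approach
described above: one envdir per instance, consumed by an otherwise
identical run script. The names (myapp, PORT) and port values are made
up for illustration; this is not an official s6 layout.]

```shell
#!/bin/sh
# Two hypothetical instances of the same daemon, differing only in
# their per-instance envdir contents.
mkdir -p myapp-a/env myapp-b/env
echo 8080 > myapp-a/env/PORT
echo 8081 > myapp-b/env/PORT

# Each instance's run script would then be the same, along the lines of:
#   #!/command/execlineb -P
#   s6-envdir env
#   myapp
# so only the env/ directory (or a symlink to a shared one) varies.
cat myapp-a/env/PORT myapp-b/env/PORT
```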
Re: runit kill runsv
* Laurent Bercot [20160623 14:20]:
> On 23/06/2016 03:46, Thomas Lau wrote:
> > LOL, well I am trying to do a drill test and see how resilient runit
> > could be, this is one of the minor downfalls.
>
> Current supervisors have no way of knowing that they died and their
> child is still running.

However, couldn't they know whether their child did not cease to run
because of a signal they sent?

[...]

> - Any attempt to kill the old instance of the daemon in order to
> properly start a new supervised instance is a policy decision, which
> belongs to the admin; the supervisor program can't make that decision
> automatically.

No, but neither can the admin enforce this policy automatically and
portably using current supervisors. Other than the "dedicated user/login
class/cgroup" scheme proposed by Jan (which can be considered best
practice anyway), it'd be nice if they exposed this somehow (hand-waving
SMOP ahead: duplicate the pid field in ./status and remove the working
copy only when receiving a down signal).

Anyway, I've been trusting supervision software more than whatever needs
to be supervised since, like, last century, and I really like it this
way ;)

tks
--
pica
Re: runit kill runsv
[sorry for replying late, catching up]

* Laurent Bercot <ska-supervis...@skarnet.org> [20160627 18:05]:
> On 27/06/2016 14:02, Joan Picanyol i Puig wrote:
> > However, couldn't they know whether their child did not cease to run
> > because of a signal they sent?
>
> I'm not sure about runsv, but s6-supervise is a state machine, and the
> service state only goes from UP to FINISH when the supervisor receives
> a SIGCHLD. The state does not change at all after the supervisor sent a
> signal: it sent a signal, yeah, so what - it's entirely up to the
> daemon what to do with that signal.

I understand: supervisors only exec() processes and propagate signals;
they have no say in, nor can they predict, what those signals' effect
will be.

> There's an exception for SIGSTOP because stopped daemons won't die
> before you SIGCONT them, but that's it; even sending SIGKILL won't make
> s6-supervise change states. Of course, if you send SIGKILL, you're
> going to receive a SIGCHLD very soon, and *that* will trigger a state
> change.

Given that SIGKILL shares with SIGSTOP the property of being uncatchable
(so the supervisor can assume a forthcoming SIGCHLD), doesn't that
signal (pun intended) that the exception should be extended?

> > No, but neither can the admin enforce this policy automatically and
> > portably using current supervisors. Other than the "dedicated
> > user/login class/cgroup" scheme proposed by Jan (which can be
> > considered best practice anyway), it'd be nice if they exposed this
> > somehow (hand-waving SMOP ahead: duplicate the pid field in ./status
> > and remove the working copy only when receiving a down signal).
>
> No need to duplicate the pid field: if s6-supervise dies before the
> service goes down, the pid field in supervise/status is left unchanged,
> so it still contains the correct pid. I suspect runsv works the same.

Ah, OK, it didn't occur to me that pid 0 in supervise/status could be
used to mean "never ran or got SIGCHLD".

> I guess a partial mitigation strategy could be "if supervise/status
> exists and its pid field is nonzero when the supervisor starts, warn
> that an instance of the daemon may still be running and print its pid".
> Do you think it would be worth the effort?

As well as the warning (which would make troubleshooting easier and
might well have avoided this thread), a robust automation-enabling UI
(in s6-svstat / s6-svok) would round out this feature and make it yet
more useful.

keep up the good work
--
pica
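[A hand-wavy external sketch of the check being discussed, done from the
admin's side rather than inside the supervisor: for each service
directory under a scan directory, if no supervisor is alive but a
supervise/status file remains, warn that a previously supervised daemon
may still be running. The second argument, defaulting to s6-svok, exists
only so the liveness check can be swapped out; it is not an s6
interface.]

```shell
#!/bin/sh
# check_leftovers SCANDIR [SVOK_COMMAND]
# Warn about service directories that have a leftover supervise/status
# file but no live supervisor answering on them.
check_leftovers() {
  scandir=$1
  svok=${2:-s6-svok}
  for dir in "$scandir"/*; do
    [ -d "$dir" ] || continue
    if ! "$svok" "$dir" 2>/dev/null && [ -f "$dir/supervise/status" ]; then
      echo "warning: $dir: stale supervise/status;" \
           "its daemon may still be running" >&2
    fi
  done
}
```

Run before (re)starting a supervision tree, e.g.
`check_leftovers /run/service`.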
Re: Customise shutdown signal at the s6-rc level?
* Casper Ti. Vector [20170502 12:48]:
> On Tue, May 02, 2017 at 08:51:19AM +, Laurent Bercot wrote:
> > If I were to work on a more official, better integrated solution, I
> > would do it at the s6-supervise level. I would not implement custom
> > control scripts, for the reasons indicated in the above link, but it
> > would probably be possible to implement a safer solution, such as
> > reading a file containing the name of the signal to send when
> > s6-svc -d is called.
>
> I see. Now I also think that using a `shutdown-signal' file seems to be
> the least intrusive way. Considering the hangup problem, I think the
> format of the file can be described as something like
>
>   signal_1 timeout_1 signal_2 timeout_2 ... signal_n [timeout_n]
>
> where the last timeout, when present, indicates that SIGKILL shall be
> sent if that timeout elapses; the default is obviously
>
>   SIGTERM

Doesn't

  svc -wD -T1000 servicedir || svc -k servicedir

do what you want for the "hangup problem"?
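[A rough sketch, not s6 code, of what interpreting such a
"signal_1 timeout_1 ... signal_n [timeout_n]" line could look like: send
each signal in turn, wait up to its timeout (seconds here, for
simplicity) for the process to die, and fall back to SIGKILL after the
final timeout. The function name and semantics are assumptions for
illustration.]

```shell
#!/bin/sh
# escalate PID SIGNAL [TIMEOUT [SIGNAL [TIMEOUT ...]]]
escalate() {
  pid=$1; shift
  while [ $# -gt 0 ]; do
    sig=$1; shift
    kill -s "$sig" "$pid" 2>/dev/null || return 0  # already gone
    t=${1:-0}
    [ $# -gt 0 ] && shift
    # poll once per second until the timeout for this signal elapses
    while [ "$t" -gt 0 ]; do
      kill -0 "$pid" 2>/dev/null || return 0
      sleep 1
      t=$((t - 1))
    done
  done
  # last resort once the final timeout has elapsed
  kill -s KILL "$pid" 2>/dev/null
}
```

E.g. `escalate "$pid" TERM 5 HUP 5` would try SIGTERM, then SIGHUP, then
SIGKILL, five seconds apart.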
Re: A better method than daisy-chaining logging files?
* Dewayne Geraghty [20190618 09:38]:
> # ktrace -f /tmp/s-log.txt -p 83417
> ktrace: /tmp/s-log.txt: Function not implemented
>
> It's a preproduction box, everything optimised and stripped (no debug
> symbols).

Apparently you've stripped "options KTRACE" from your kernel config.
Boot GENERIC just for this test.

> I've worked with nullfs since 2004, probably a little delicate then,
> but I've used it extensively on customer sites and it's proven to be
> OK. :) The nullfs component is where the files are piped through, and
> not the end-point destination, which is ufs2 on an SSD.

Oh, OK, probably safe then.

regards
--
pica
Re: A better method than daisy-chaining logging files?
* Laurent Bercot [20190618 08:22]:
> > FYI: The fifo queue permissions, which the jail sees
> > pr---w 1 mylogger www 0B May 31 13:27 apache24-error|
>
> Ah, so the www group is the one that writes to the fifo. Got it.
>
> Then you don't need mylogger to belong to the www group (and it's
> probably better for privilege separation that it doesn't), but you
> apparently need the logdir to belong to the primary group of the
> mylogger user. There is no reason for the logdir to belong to the www
> group.
>
> The error you got still strikes me as weird, and shouldn't happen
> unless you have strange permissions for the logdir itself, or FreeBSD
> is doing something wonky with gid checking.

He is nullfs-mounting some of these directories; wonkiness might happen.

> For my peace of mind, I'd still like to see the permissions on your
> logdir, and a ktrace of the error.

* Dewayne Geraghty [20190618 09:16]:
> On the logger, the files, as requested, are:
>
> # ls -lrth /var/log/httpd | grep error ; ls -lrth /var/log/httpd/error
> drwx-- 2 mylogger www 512B Jun 18 15:06 error/
> total 44
> -rw-r--r-- 1 mylogger www   0B Jun 18 15:06 state
> -rw-r--r-- 1 mylogger www   0B Jun 18 15:06 lock
> -rw-r--r-- 1 mylogger www  41K Jun 18 16:04 current
[...]
> -rw-r--r-- 1 mylogger www   0B Jun 18 15:06 lock
> -rwxr--r-- 1 mylogger www 2.7K Jun 18 16:59 @40005d088c11012cc9f4.s*
> -rw-r--r-- 1 mylogger www   0B Jun 18 17:03 state
> -rw-r--r-- 1 mylogger www   0B Jun 18 17:03 current
> -rwxr--r-- 1 mylogger www  64B Jun 18 17:03 @40005d088cd6113d5a5c.s*
[...]
> # s6-svc -a /run/scan/apache24-error-log
> # lh /var/log/httpd | grep error ; lh /var/log/httpd/error
> drwx-- 2 mylogger www 512B Jun 18 17:05 error/
> total 4
> -rw-r--r-- 1 mylogger www   0B Jun 18 17:04 lock
> -rw-r--r-- 1 mylogger www   0B Jun 18 17:05 state
> -rwxr--r-- 1 mylogger www 304B Jun 18 17:05 processed*
> -rw-r--r-- 1 mylogger www   0B Jun 18 17:05 current

Add -a to your ls flags, to show the directory's own permissions for
completeness.

> with the resulting
>
> s6-log: warning: unable to finish processed .s to logdir
> /var/log/httpd/error: Operation not permitted
>
> This is on a box that lacks development tools, so tracing will take
> some time to sort out; sorry. :/

Just add

  ktrace -id -f /var/tmp/s6-log.trace

before your s6-log invocation and send the output of

  kdump -f /var/tmp/s6-log.trace

afterwards.

qvb
--
pica