Hi Casper

Thanks for the heads-up. This is certainly an interesting project, but it only makes sense for me to start playing with it once it has matured to the point where there is a minimum of documentation (-h/--help or something like that) and, ideally, some sort of revision control too. I may be (barely) able to debug such low-level C code if I notice it misbehaving, but reverse-engineering what it is supposed to do is beyond my abilities.

Cheers
Ben

On 22.06.23 at 19:16, Casper Ti. Vector wrote:
On Thu, Oct 21, 2021 at 02:01:29AM +0800, Casper Ti. Vector wrote:
As has been said by Laurent, in the presence of a supervision system
with reliable logging and proper rotation, what `procServ' mainly does
can be done better by something like `socat' which wraps something like
`recordio', which in turn wraps the actual service process (EPICS IOC).
The devil is in the details: most importantly, when the service is to
be stopped, the ideal situation is that the actual service process gets
killed, leading to the graceful exit of `recordio' and then `socat'.

It turns out that socat does not do I/O fan-in/fan-out with multiple
clients; it also assumes the `exec:'-ed subprocess is constantly present
(i.e. it does not handle IOC restarting).  So I have written a dedicated
program, ipctee (see below for a link to the source code), that does this.
I have also written a program, iotrap, that after receiving a
terminating signal, first closes the stdin of its child in the hope
that the latter exits cleanly, and after a tunable delay forwards the
signal.  This way IOCs are allowed to really run their clean-up code,
instead of just being killed instantly by the signal.
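
The iotrap behaviour described above can be sketched roughly as follows.
This is a minimal illustration in Python, not the actual C program; the
function name `graceful_stop` and its parameters are hypothetical, and
it assumes the wrapper owns the write end of the child's stdin pipe.

```python
import signal
import subprocess
import sys


def graceful_stop(child, sig=signal.SIGTERM, delay=2.0):
    """Close the child's stdin, wait up to `delay` seconds, then send `sig`.

    Returns the child's exit status; on POSIX this is the negative signal
    number if the child had to be killed by the forwarded signal.
    """
    if child.stdin is not None:
        child.stdin.close()        # EOF on stdin: a polite request to exit
    try:
        return child.wait(timeout=delay)
    except subprocess.TimeoutExpired:
        child.send_signal(sig)     # the child ignored EOF: forward the signal
        return child.wait()
```

A real wrapper would call something like this from its own SIGTERM
handler, with the delay taken from a command-line option.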

So the two wrapping programs need to propagate the killing signal, and
then exit after waiting for the subprocess; since `procServ' defaults
to killing the subprocess with SIGKILL, `recordio' also needs to
translate the signal if this is to be emulated.  `socat' does this
correctly when the `sighup'/`sigint'/`sigquit' options are given for
`exec' addresses, but its manual page says nothing about SIGTERM.
`recordio' does not
seem to propagate (let alone translate) the signal; additionally, its
output format (which is after all mainly used for debugging) feels too
low-level to me, and perhaps needs to be adjusted.

Closer inspection of recordio revealed that it was designed in a smarter
way: after forking, the parent exec()s into the intended program, and
the child is what actually does the work of I/O forwarding.  This way
recordio (the child) does not need to forward signals.  Based on it,
I have written a program, recordln, that performs more line-oriented
recording: line fragments (without the line terminator) that go through
the same fd consecutively are joined before being copied to stderr.
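
One possible reading of the line-joining rule above, sketched in Python.
This is an assumption about the behaviour, not code taken from recordln:
the name `join_fragments` is hypothetical, and flushing a partial line
when a different fd interrupts it is a guess at the intended semantics.

```python
def join_fragments(chunks):
    """Yield (fd, line) records from an iterable of (fd, bytes) chunks.

    Consecutive fragments on the same fd are joined until a newline is
    seen; a partial line is flushed early if another fd interrupts it.
    """
    pending_fd = None
    pending = b""
    for fd, data in chunks:
        if pending and fd != pending_fd:
            yield pending_fd, pending      # interrupted: flush the partial line
            pending = b""
        pending_fd = fd
        *complete, rest = (pending + data).split(b"\n")
        for line in complete:
            yield fd, line                 # complete lines, terminator stripped
        pending = rest                     # unterminated tail, kept for later
    if pending:
        yield pending_fd, pending          # flush whatever is left at EOF
```

For example, the chunks `(1, b"hel")` and `(1, b"lo\n")` would be logged
as the single record `(1, b"hello")`.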

At my facility, we use CentOS 7 and unsupervised
procServ (triple shame for a systemd opponent, s6 enthusiast and
minimalist :(), because we have not yet been bitten by log rotation
problems.  It also takes quite a lot of code to implement the
dynamic management of user supervision trees for IOCs, in addition
to the adjustments needed for `recordio'.  To make the situation even
worse, we are also using procServControl; anyway, I still hope we can
get rid of procServ entirely someday.

Source code for the programs above is available (licence: CC0) at
<https://cpaste.org/?fa30831511a456b7=#ECwUd1YaVQBLUokynQbRYZq5wvBvXXeXo3bQoeL2rL4L>
These programs can be tested with (in three different terminals):
$ ipctee /tmp/in.sock /tmp/out.sock
$ socat unix-connect:/tmp/in.sock exec:'recordln iotrap /bin/sh',sigint,sigquit
$ socat unix-connect:/tmp/out.sock -
Please feel free to tell me in case you find any defect in the code.
The dynamic management of IOC servicedirs is being developed, and will
be tested internally here before a paper gets submitted somewhere.


--
I would rather have questions that cannot be answered, than answers that
cannot be questioned.  -- Richard Feynman

