Date:        Tue, 7 Dec 2021 18:38:32 +0100
From:        Edgar Fuß <e...@math.uni-bonn.de>
Message-ID:  <20211207173832.gr29...@trav.math.uni-bonn.de>
  | and the obvious solution (apart from patching collectd) is to trap - PIPE
  | before calling envstat.

That would not have worked: non-interactive shells are forbidden from allowing the script to change the status of a signal which is ignored when the shell starts. So sayeth POSIX (see its description of the trap special builtin in XCU 2.14):

    Signals that were ignored on entry to a non-interactive shell cannot be
    trapped or reset, although no error need be reported when attempting to
    do so. An interactive shell may reset or catch [...]

(There is a short demonstration of this at the end of this message.)

It turns out that our shell, entirely by accident, almost has a mechanism that can be used for your needs - not to reset ignored traps, but to discover the pids of the processes in a pipeline. Almost, and that could be fixed.

The "jobid" builtin command prints (by default) the pids of processes in the current job. A pipeline is a job, the processes are created left to right, and entries are made in the jobs table for each process as it is created ... so in the final process of a pipeline, the jobid command will print the process ids of the processes that are running the earlier elements of the pipe (not itself: the shell has forked before its own pid is added to the parent shell's jobs table, which the subshell copied via the fork()).

This info needs to be collected quickly, as the jobs table is cleared in a subshell whenever a new job (think process, including command substitutions) is created. It is not cleared as part of creating the subshell itself, to allow things like pids=$(jobid) to be executed meaningfully - in that more normal usage, the "current job" will be the last background process group created (a small sketch of that usage follows the example output below) ... pipelines are something kind of special.

Unfortunately, when you're in a subshell already (which a process in a pipeline effectively is) and you try to make another one, the jobs table is always cleared there. This is so that

    ( x=$(jobid) ... )

only gets the jobs from in that ( ) subshell, not ones from its parent, which it cannot do anything much with. The effect is that while the jobid command, run in a pipeline, does output the process ids of the processes to the left of the current element of the pipeline, there's no good way to get that information to anywhere useful, without perhaps sticking it in a file and reading it back - but that is ugly.

I could do something like adding a -v option (-v varname) to allow the output from jobid to be stored in a variable, if that seems like a good idea (if someone wants this, please send a PR requesting it, and in the PR explain what it is wanted for, so we have something to reference when someone in the future says "WTF?").

For a lark, I did a very hackish, very limited, non-error-checking (and badly implemented) version of this in about 5 minutes (not suitable for anything except trashing), and used it in the following script:

    {
        echo hello
        sleep 5
        echo foo
        sleep 5
        echo bar
        sleep 5
    } | {
        jobid -v xxx
        echo xxx=$xxx
        while read line
        do
            case "${line}" in
            *foo*)  kill -9 $xxx;;
            esac
            echo "$line"
        done
    }

That worked: the shell running the left side of the pipe (which would be replaced by the process in the usual case that it is a simple command there, like, perhaps, envstat) was killed after it sent the "foo" line. The output looked like:

    ./sh /tmp/Sct
    xxx=23777
    hello
    foo
    [1]   Killed      echo hello; sleep 5; echo foo; sleep 5; echo b... |
          Done        jobid -v xxx; echo xxx=${xxx}; while read line...
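For contrast with the pipeline case, that more normal pids=$(jobid) usage - where the current job is simply the last background job started - would look roughly like the following. This is only a minimal sketch; the sleep and the variable name are purely illustrative:

    sleep 100 &         # the background job becomes the "current job"
    pids=$(jobid)       # the command substitution subshell copied the
                        # jobs table via fork(), so jobid prints the
                        # pid(s) of that sleep job
    kill $pids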
Note that the code to the right of the | did the kill when it saw "foo", then echoed foo (as it had previously done with "hello"), then went back to the "while read line" again. That read failed when the left side got the kill signal and vanished, the whole pipeline then exited, and the shell (for some reason - I don't think it should in a non-interactive shell, I will look into that) reported the exit status of the pipeline via the jobs command (that it did was helpful for this example).

Aside from not having the -v xxx arg to jobid (which is needed to make this as simple as it is) you can try this for yourself now. The shell in NetBSD HEAD and 9_STABLE (and 9.2 I believe; I am less sure about earlier versions of -9, and this is probably not going to work in -8 or earlier) should have all this working as intended (just without a way to make it simple to use for this purpose). If you just put "jobid" very early in your pipeline script's right side you should see, on stdout (or wherever you redirect its output), the pid(s) of each process running to the left of the pipe.

Note that there are no race conditions here: the output is reliable, and the process already exists, and so can be killed. If that process is going to run some other command, there is no guarantee that that has happened yet (ie: if we had

    envstat | { jobid -v P; kill -9 $P ; }

it is quite certain that $P would refer to the process id for the left-side process of the pipe, but when it is killed, when that happens as quickly as here, it might still be the shell - envstat might not have replaced the shell yet; that all depends upon how the kernel process scheduler manages things).

Once again, all of this is an accident, not planned. Some shells start the pipeline in the other direction (right to left), and if we went that way this would not work in a useful manner at all (there are some benefits to starting the processes in the other order). It is also purely NetBSD. FreeBSD have a jobid command, simpler than ours, but I think the same for this usage (no -v of course) - but it doesn't work for this purpose; their jobs table clearing strategy is different from ours, I believe.

Lastly:

mo...@rodents-montreal.org said:
  | I'm not sure what I think of the idea that a pipe's reader can do things to
  | the pipe's writer without the writer's cooperation.

I agree with that - but this isn't quite that: the info isn't available (unless deliberately, by design of the script writer) to an arbitrary process on the right side of a pipe, only to the script which is invoking it, and that I don't think is nearly such a problem.

kre

Oh! While re-reading, and correcting, all the above, I had an idea which might just be workable now. And it worked. This one you should be able to use now, no shell mods required (again, perhaps only in a fairly recent version of sh; there were bugs in this area until not all that long ago). Here is the modified script ... spot the variation:

    {
        echo hello
        sleep 5
        echo foo
        sleep 5
        echo bar
        sleep 5
    } | { jobid; cat; } | {
        read xxx
        echo xxx=$xxx
        while read line
        do
            case "${line}" in
            *foo*)  kill -9 $xxx;;
            esac
            echo "$line"
        done
    }

If there were more elements to the left of the | into "jobid; cat", the "read xxx" could be "read pid1 pid2 pid3 ..." as needed. Note that we're certain that the final pipeline element gets the jobid output before anything from further left in the pipeline, as the "cat" which forwards that output through the pipe doesn't start until after jobid has sent its line.
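Applied to the envstat case this thread started with, that trick might look something like the sketch below. This is untested and the details are assumptions: the -i 10 is just an example of running envstat in its repeating mode, and the script is assumed to be writing back to collectd on its stdout with SIGPIPE ignored, so a write to a vanished collectd shows up as a failed printf rather than a fatal signal:

    envstat -i 10 | { jobid; cat; } | {
        read epid       # pid of the envstat process, printed by jobid
        while read line
        do
            # forward the line; if the write fails (collectd has gone
            # away and SIGPIPE is ignored), kill envstat and give up
            printf '%s\n' "$line" || {
                kill "$epid"
                exit 1
            }
        done
    }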
The script worked for me (the output looked identical, apart from the pid value, to the previous run with the hacked sh); it should work for anyone with a modern NetBSD. Try it, Edgar.
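As for the "trap - PIPE" idea quoted at the top, the POSIX restriction can be seen directly. A small, untested demonstration, assuming a shell that follows that rule; the outer subshell stands in for a parent (such as collectd) that ignores SIGPIPE before starting a non-interactive script:

    (
        trap '' PIPE        # the parent ignores SIGPIPE ...
        sh -c '
            # ... so, per the POSIX text quoted above, this
            # non-interactive child shell cannot reset or catch it:
            # on a conforming shell neither trap command below has
            # any effect, and no error need be reported.
            trap - PIPE
            trap "echo caught PIPE" PIPE
            kill -s PIPE $$     # SIGPIPE is still ignored, so
            echo survived       # the shell survives and
                                # "caught PIPE" never appears
        '
    )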