Re: [1003.1(2016)/Issue7+TC2 0001138]: Add strsignal(), sig2str() and str2sig() to the standard.

Robert Elz via austin-group-l at The Open Group Wed, 09 Sep 2020 13:10:09 -0700


austin-group-l@opengroup.org said:
  | Some applications might find it useful that 0 is translated to "EXIT".


Apart from the implementation of trap -l (which isn't even a standard
option) name a single one?   That "EXIT" is translated to 0 makes it
fractionally easier to implement "trap" I guess, as it avoids the shell
needing to do something like

        if (strcmp(arg, "EXIT") == 0)
                sig = 0;
        else if (str2sig(arg, &sig) == -1)
                error();

Do you really believe that saving those two extra lines of code (in shells
with a trap command which implement an exit trap, which I don't think includes
the csh family) is worth specifying this as a requirement in the standard?

And that given that the alternative is to have all of the signal sending
programs (there are more of them than shells - there must be, as each
shell with its builtin kill (even csh has that) counts as one of them) have
to do

        if (strcasecmp(arg, "EXIT") == 0)
                error();
        if (str2sig(arg, &sig) == -1)
                error();

Note that one can't simply call str2sig() and then test "if (sig == 0)"
on the result, as "0" is valid for arg, though "EXIT" is not.
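
To make that concrete, here is a minimal sketch of the kill side.  Since
str2sig() is not something one can assume exists, demo_str2sig() below is
a hypothetical stand-in with a tiny table that mimics the proposed
behaviour (including "EXIT" giving 0); parse_kill_signal() is likewise
just an illustrative name, not anything from a real kill implementation.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>

/* Hypothetical stand-in for the proposed str2sig(): a tiny table, so the
 * sketch does not depend on any real implementation. */
int demo_str2sig(const char *str, int *pnum)
{
    static const struct { const char *name; int num; } tab[] = {
        { "EXIT", 0 },                      /* the contested entry */
        { "HUP", SIGHUP }, { "INT", SIGINT },
        { "KILL", SIGKILL }, { "TERM", SIGTERM },
    };
    const size_t ntab = sizeof tab / sizeof tab[0];
    char *end;
    long n;

    assert(str != NULL && pnum != NULL);
    for (size_t i = 0; i < ntab; i++) {
        if (strcmp(str, tab[i].name) == 0) {
            *pnum = tab[i].num;
            return 0;
        }
    }
    /* Decimal form: accepted only if it is a number the table knows,
     * which includes "0". */
    if (str[0] < '0' || str[0] > '9')
        return -1;
    n = strtol(str, &end, 10);
    if (*end != '\0')
        return -1;
    for (size_t i = 0; i < ntab; i++) {
        if (n == tab[i].num) {
            *pnum = tab[i].num;
            return 0;
        }
    }
    return -1;
}

/* What a kill-like utility has to do: reject the name "EXIT" up front,
 * because testing the result for 0 would also reject the valid "0". */
int parse_kill_signal(const char *arg, int *sig)
{
    if (strcasecmp(arg, "EXIT") == 0)
        return -1;
    return demo_str2sig(arg, sig);
}
```

With this, parse_kill_signal("0", &sig) succeeds with sig set to 0, while
parse_kill_signal("EXIT", &sig) fails, which a test of sig after the call
could not distinguish.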

This is all perverse, "EXIT" is not a signal, there is no SIGEXIT defined
anywhere I know of (certainly not in posix) - there is no justification at
all for a function defined as working on signals also happening to process
a sh specific trap arg.   Or not unless the function is part of the shell,
in which case it has no business being here at all.

Beyond that, even the reference implementation of these (the Solaris, now
Oracle, one) doesn't promise in its doc to support or generate EXIT.
Its implementation apparently does, but it isn't documented as doing so,
even there, so they could remove that functionality at any time - applications
shouldn't be relying upon undocumented features after all.

Are we really planning on requiring functionality that the reference
implementation doesn't specify exists?

austin-group-l@opengroup.org said:
  | Other applications can simply avoid passing 0 to sig2str(). 

That's not generally an issue, as other applications (assuming there
are any other than kill -l) generally would only pass a signal number
(typically of a received signal) to sig2str, and 0 is not going to be
one of those.

Incidentally, what POSIX says of kill -l (with an arg, there's no question
that without it, EXIT is not included in the output list of signals, and
no shell I know of does include it .. though for some bizarre reason, dash
includes "0" in its list) is:

    If an exit_status operand is given and it is a value of the '?'
    shell special parameter (see Section 2.5.2 (on page 2274) and wait)
    corresponding to a process that was terminated by a signal, the
    signal_name corresponding to the signal that terminated the process
    shall be written.

If a process has been terminated by a signal, then $? will not be 0,
hence '0' cannot be the arg to kill -l when used in this way.  It continues:

    If an exit_status operand is given and it is the unsigned decimal
    integer value of a signal number, the signal_name (the symbolic constant
    name without the SIG prefix defined in the Base Definitions volume of
    POSIX.1-202x) corresponding to that signal shall be written.

Since 0 is not the integer value of any signal number (those are required
to be between 1 and NSIG_MAX-1 in the new version), 0 cannot be the arg
to "kill -l" used this way either.  It continues:

    Otherwise, the results are unspecified.

So if 0 is passed to kill -l ("kill -l 0") there is no specified result,
we are not required to output EXIT (which would be a bizarre requirement)
so the one real existing client of sig2str() doesn't actually need the
EXIT result.
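
The three quoted rules can be sketched as a small decision function.
The name kill_l_operand() and the 128 + signal number encoding of $? are
assumptions made for illustration (POSIX only requires such a status to
be greater than 128, and some shells encode it differently); NSIG is used
as the upper bound where the system provides it, with an arbitrary
fallback otherwise.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>

#ifndef NSIG
#define NSIG 65   /* assumption: fallback bound when the system has no NSIG */
#endif

/* Map the numeric operand of "kill -l" to a signal number.
 * Returns 0 and stores the number in *signo, or returns -1 for the
 * cases POSIX leaves unspecified (treated as an error in this sketch). */
int kill_l_operand(int value, int *signo)
{
    assert(signo != NULL);

    /* Rule 1: a "$?" value from a signal-terminated process.
     * Assumes the common 128 + signal number encoding. */
    if (value > 128 && value - 128 < NSIG) {
        *signo = value - 128;
        return 0;
    }
    /* Rule 2: the decimal value of a signal number, 1 .. NSIG-1. */
    if (value >= 1 && value < NSIG) {
        *signo = value;
        return 0;
    }
    /* Rule 3: anything else, including 0, is unspecified. */
    return -1;
}
```

Nothing in the first two rules can ever be reached with 0, so whatever
"kill -l 0" prints comes from the third.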

Nor does anything else.

But that is kind of beside the point, the driving function of this interface
is str2sig() not sig2str() - sig2str() just happens to sort earlier.  As above,
to be used in the trap command, without extra work, str2sig() would need to
process EXIT.   But as above, providing that function is not the right way to
do things, str2sig() should stop being required to parse EXIT and return 0,
that is a shell specific function (as part of the trap command) - it is not
in any way system dependent, so can trivially be handled by the shell
implementation, which (in several cases) also needs to handle "ERR" "RETURN"
"DEBUG" or some subset of those in addition to "EXIT" - doing so is not
a real burden.
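
A minimal sketch of the shell side, assuming a str2sig() that knows only
real signals.  Everything named here is hypothetical: demo_signal_lookup()
stands in for such a str2sig() with a three entry table, and enum
trap_kind and parse_trap_condition() are invented for the example.

```c
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Shell-private trap conditions; only TRAP_SIGNAL refers to a signal. */
enum trap_kind { TRAP_SIGNAL, TRAP_EXIT, TRAP_ERR, TRAP_RETURN, TRAP_DEBUG };

/* Hypothetical stand-in for a str2sig() that knows only real signals. */
int demo_signal_lookup(const char *str, int *pnum)
{
    static const struct { const char *name; int num; } tab[] = {
        { "HUP", SIGHUP }, { "INT", SIGINT }, { "TERM", SIGTERM },
    };

    for (size_t i = 0; i < sizeof tab / sizeof tab[0]; i++) {
        if (strcmp(str, tab[i].name) == 0) {
            *pnum = tab[i].num;
            return 0;
        }
    }
    return -1;
}

/* Classify one "trap" operand: shell conditions first, then signals. */
int parse_trap_condition(const char *arg, enum trap_kind *kind, int *sig)
{
    static const struct { const char *name; enum trap_kind kind; } shell[] = {
        { "EXIT", TRAP_EXIT }, { "0", TRAP_EXIT },   /* 0 is EXIT for trap */
        { "ERR", TRAP_ERR }, { "RETURN", TRAP_RETURN },
        { "DEBUG", TRAP_DEBUG },
    };

    assert(arg != NULL && kind != NULL && sig != NULL);
    for (size_t i = 0; i < sizeof shell / sizeof shell[0]; i++) {
        if (strcmp(arg, shell[i].name) == 0) {
            *kind = shell[i].kind;
            *sig = 0;            /* not a signal number, just a filler */
            return 0;
        }
    }
    if (demo_signal_lookup(arg, sig) == -1)
        return -1;
    *kind = TRAP_SIGNAL;
    return 0;
}
```

The table of shell conditions is entirely system independent, which is
why it can live in the shell without help from the C library.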

sig2str() translates 0->EXIT merely because it is the inverse function of
str2sig() not because there is a single sane reason for it to do so, other
than that.   Remove "EXIT" as a defined input to str2sig() and it can
seamlessly go away from sig2str() with no harm caused at all.   That's what
should happen.

Note that implementations can extend the interface, and recognise whatever
additional (non-required) strings they like, so if an implementation wants
its str2sig() to process EXIT and return 0 as the signal number, that's OK.
It can also process "DEBUG" "ERR" "RETURN" and anything else desired, and
return suitable (fake) "signal number"s for those as well.   But applications
(or portable applications) should not be depending upon any of that.

[Aside: when given "kill -l 0", all ash derived shells (dash, freebsd, netbsd),
and yash, generate an error, zsh, mksh, and pdksh simply say "0" (but 0 is
not special for that, they do the same thing for any small integer (0<=n<128)
which is not a valid signal - they just output the input integer), bash,
bosh, and ksh93 say "EXIT" (ksh93 is like the other ksh's and zsh, and
for other invalid signal numbers, just outputs the number, but bash and bosh
generate an error for other invalid signal numbers - doing that, while not
generating an error for the just as invalid 0 seems inconsistent to me ...
but since this is unspecified behaviour, none of those are wrong, but the
breakdown is interesting.]

austin-group-l@opengroup.org said:
  |  Log files are intended for a technical audience,

Such a technical audience can handle "signal 8" just as easily
as FPE - probably more easily, as if they don't already know what
signal 8 is, when they look in the relevant header file, they'll
almost certainly see not just "#define SIGFPE 8" but also a comment
that explains what SIGFPE is, which is an improvement.

austin-group-l@opengroup.org said:
  | who are more likely to want just the signal name, not the textual
  | description that strsignal() 

Is there evidence to support that assertion?   I believe I qualify as
"technical audience" (I certainly read log files from time to time - so
do non-technical people incidentally) and I know I'd much rather see
"floating point exception" than "FPE" or "SIGFPE".
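
For what it's worth, a log line can carry both the number and the
description using only strsignal().  This is a sketch, not taken from any
real logger: format_signal_log() is a made-up name, and the text that
strsignal() returns is implementation and locale dependent.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>

/* Format a log line with the signal number and its description.
 * Returns what snprintf() returns: the length the full line needs. */
int format_signal_log(char *buf, size_t len, int signo)
{
    const char *desc;

    assert(buf != NULL && len > 0);
    desc = strsignal(signo);
    return snprintf(buf, len, "terminated by signal %d (%s)",
                    signo, desc != NULL ? desc : "unknown");
}
```

Called with SIGFPE, this produces a line beginning "terminated by signal"
followed by the number and whatever description the system supplies.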


austin-group-l@opengroup.org said:
  | Using just the signal name also makes log files more compact. 

Now you're really stretching for justifications - if it rarely occurs,
no-one cares about that at all.  If it occurs often, then it is the same
string (exactly) every time, and so is an ideal compression target, and
so for anyone who cares about the size of log files, it really makes no
difference.


austin-group-l@opengroup.org said:
  | Just because you can't imagine it being useful to specify them, doesn't mean
  | that nobody will find it useful in future that they were specified. 

More stretching.   Can you give a single plausible use of these things
(the way that the real time signals are named from sig2str()) to justify
this.  We don't need this specified, and certainly not at that level of
detail.

austin-group-l@opengroup.org said:
  | Given that there are no differing implementations at the moment,

If this interface does get added to the standard, I would assume that
NetBSD would implement it (and in the man page say something like "These
interfaces are for POSIX conformance only, and should not be used by
applications") and that implementation would simply call the existing
NetBSD functions to do the work (no duplication of effort) and those
return the names we have for the real time signals, not RTMIN/RTMAX
generated strings.   So there would be one for sure.


austin-group-l@opengroup.org said:
  |  I see no reason not to specify it.

The reason is that the specification is not required, and when not
required, implementations should be allowed to differ.   If there was
some rational reason for conformity here, you'd give it, since you
haven't, I'm assuming that there isn't one (just the FUD "someday there
might be").


austin-group-l@opengroup.org said:
  | The tendency over recent years (since the Austin Group was established and
  | the development process was opened up) is for people to report things that
  | are not specified and we end up adding text for them. 

Sure, for things that ought to be specified (even in the cases where for
various reasons it cannot be) that's reasonable.   But where there's no
reason for everything to be the same, allow implementation differences.

What's more, there are plenty of other cases where things that could easily
be specified aren't.   To borrow from the example I used earlier - POSIX
could actually specify that the value of SIGFPE is 8.   But it doesn't (and
shouldn't).   Are you seriously telling me that if I filed a bug report
saying that the value of SIGFPE is not specified, and I think it should be,
that you would add text defining it?   And all the other signals (or at least
the ancient ones, which everyone agrees on).   (I seem to remember that a
few of them are defined, but just the ones that tend to be used by number
by people and scripts, like that SIGHUP is 1, as "kill -1 1" is so common).

Similarly, whenever a struct is specified, it is (properly) done as "shall
contain the following members" - nothing specifies what order they are to
appear in, or that no other members are allowed to exist - implementations
are allowed to differ, as they should be.   This is as true for new POSIX
invented structs (like posix_dent from another issue) as for older existing
ones where pre-existing implementations might be argued as restricting the
amount that can be specified.  That's not the case with new structs, the
group could specify them very precisely - but doesn't, correctly, because
there is no need for that, and implementations should be allowed to differ.
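
For application code, the practical consequence of "shall contain the
following members" is that a struct gets filled in by member name, never
by position.  A minimal sketch using struct timespec (the helper name
timeout_from_ms() is made up for illustration):

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* Build a struct timespec from a millisecond count.  Designated
 * initializers name each member, so neither the member order nor any
 * extra implementation members (which get zero-initialized) matter. */
struct timespec timeout_from_ms(long ms)
{
    assert(ms >= 0);
    struct timespec ts = {
        .tv_sec  = ms / 1000,
        .tv_nsec = (ms % 1000) * 1000000L,
    };
    return ts;
}
```

A positional initializer such as { 1, 500000000L } would compile too, but
it silently depends on a member order that nothing promises.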

The same is true here.


austin-group-l@opengroup.org said:
  | I think it's better to include those details from the start rather than
  | risking ending up in a situation where they are reported as missing but it's
  | now too late to add them because implementations have diverged.

If the implementations diverge, then there never was a reason for them
to be specified (if one particular format was needed for something, that's
what the implementations would provide).  It is where every implementation
does things the same way that there might (just might, it can also occur
just because of copying) be a reason to actually specify that method, and
then it is possible.


austin-group-l@opengroup.org said:
  | Huh?  It's specified in the C standard, as any C programmer should know.

Did I miss something, or is it specified somewhere that only C programmers
are allowed to read the POSIX standard?

This is a trivial issue - there's no need to solve it, as where it is being
used shouldn't be there at all (see above) - but if that isn't accepted, and
this unnecessary over-specification is retained, fixing the wording to make
things clear and precise would be simple.  If it is retained, just do it.


austin-group-l@opengroup.org said:
  | > Why not?   It is much easier, if it happens to be needed, to do
  | >   kill -s rt03 pid
  | > than
  | >   kill -s rtmin+3 pid

  | Why is that easier? Because there are 3 fewer characters to type?

That's enough of a reason for me (though it turns out to be 4 less,
the actual name is rt3 - as you pointed out earlier, no-one usually
uses these things, I forgot its name and didn't check before sending
that message - I did subsequently).

Of course, the rtmin+3 form is even worse, as the + is shifted, so for that
one I have to press 2 keys.  But as you'd probably expect that I should
actually be using RTMIN+3 there's a whole bunch more shift key use there,
and most of the time I'd probably end up entering RTMIN+# just because of
not letting the (almost permanently held down) shift key up soon enough.

I'll retain our names thanks.

austin-group-l@opengroup.org said:
  | We're not using 1200 baud terminals any more. 

What does that have to do with anything?   I know I don't type at
nearly 1200 baud, not even 300 baud, though I can (possibly, and
most likely not consistently) beat 110 baud I think.   Maybe.

Can you type at 1200 baud?   120 chars/sec?   Really?   Good typists
(not me) tend to get to about 60 wpm, the record is apparently 212 wpm.
That's about 1000 ch/min or 16 ch/sec (astounding to me that anyone could
even imagine that kind of rate) - but even that's not even close to 120 ch/sec
(which would be something around 1400 wpm).

The rate at which output appears is completely irrelevant here, and
that's where increasing comms speeds make a difference.  Not typing, or
not after we went past 110 baud.

But these names were added to NetBSD long before my time, it wasn't me
who did it, I don't know the original justification.   All I know is
that they exist, and are what "kill -l" from sh has reported for a very
long time, and so is not going to change.


austin-group-l@opengroup.org said:
  | I imagine applications would only use str2sig() when given a signal name
  | (or "EXIT") as some kind of input, such as a command line argument.

Agreed.   Good.   In that case, that input can be whatever the system
requires.   For the regular signal names, consistency is fine - but as
you said earlier, and I agree, no-one uses the RT signals this way (your
query on why do they need names) - that same argument applies here to
there not needing to be any standard for what sig2str() maps the RT
signals into, agreed?

austin-group-l@opengroup.org said:
  | [Perhaps...] the text should say something like:
  |     If <i>signum</i> is a valid, supported signal number (that is, one
  |     for which <i>kill</i>() does not return -1 with <i>errno</i> set to
  |     [EINVAL]) 

that would solve the problem, but seems like a kind of convoluted (or
perhaps, bloated) way to do it. kill(2) (from POSIX XSH) says of EINVAL:

  [EINVAL]  The value of the sig argument is an invalid or unsupported
            signal number.

So, by referencing kill() returning EINVAL in errno, all you're effectively
doing is saying that "sig is [not] an invalid or unsupported signal number".

So you're effectively saying the exact same thing twice - once positively
"If <i>signum</i> is a valid, supported signal number" and then, by reference,
negatively, effectively after following the reference: "that is, one which
is not an invalid or unsupported signal number".  Just delete the
parenthetical part - the result would say the same thing, just as precisely.


austin-group-l@opengroup.org said:
  | That is exactly what I was trying to do.  The table has no table number, so
  | the only way to refer to it is by some descriptive means.

That's fine, but the one obvious way to descriptively refer to a list in a
table would be to use the word "table" somewhere in that description, wouldn't
it?  It turns out that there is just one table in <signal.h> (in XBD) so
actually all it would take would be to change "for which a symbolic constant
is defined in the <signal.h>" into "for which a symbolic constant is defined
in the table in <signal.h>", but for future proofing, and just to make it
easier for readers, it would be better as "for which a symbolic constant is
defined in the table of signals in <signal.h>", and to that can be added a
page number, as is done plenty of other places in the standard, producing
something like "for which a symbolic constant is defined in the table of
signals on page 323 in <signal.h>" (or something equivalent).   Since we're
looking there already, note that <signal.h> from XBD does just that...

        the General Terminal Interface (see Section 11.1.9, on page 185),

Whatever the standard form for the reference is would be fine of course.


austin-group-l@opengroup.org said:
  | No, it isn't.  It is more likely that there is already no consistency
  | between those other applications.

I think you just made my point that this interface does not need
standardising.   What you described would not be happening if applications
used a standard function like sig2str() and str2sig().  If they did,
they'd all produce the same output, and accept the same input (whatever
that happened to be), and there would be consistency amongst those
applications.   Desiring that would be the one reason for these interfaces
to be in the standard.   Now that you have accepted that there is no such
consistency, you're also accepting that applications do not use, and
hence do not need, these interfaces.

Hence there is no good reason to include them in the standard at all.

If on the other hand, you do want to promote consistency amongst applications
on a system, then your point here fails - it then becomes more important for
an application to be consistent with others on the system than with other
copies of itself on other systems.


While I am here, to avoid more bugnotes, and reduce the number of e-mail
messages, I'll include a reply to bugnote:4980 here (the oddly attributed
quotes below are really from joerg.schill...@fokus.fraunhofer.de).

nore...@msnkbrown.net said:
  | str2sig() and sig2str() are widely used since approx. 1995

Widely used by what exactly?   List 20 or 30 apps that use them (which
wouldn't really count as "widely" but would be a start).


nore...@msnkbrown.net said:
   | Regarding what is in the standard and what is not.....

I'm not going to go through that line by line, most of it is
irrelevant to anything here (particularly yet another diatribe
relating to waitid()) and like it or not, the definitions of the
strl*() functions do have problems - there is no "best" for the
string functions, which is appropriate depends upon the actual
needs of the application - certainly any wholesale "convert
everything to strl*()" is brain dead.  The strl*() functions have
valid uses, so do the strn*() functions, and even good old strcpy()
(and friends).   And I think there are a whole bunch more less
popular variations.


nore...@msnkbrown.net said:
  | The interesting aspect of this meta discussion is that in the 1980s, it
  | usually did take 1-2 years for a new useful idea to appear in other UNIX
  | versions as well.

Since these functions apparently appeared in the 1980's, and yet
did not appear in other (or at least, most other) UNIX versions (certainly
not within 1-2 years) can we assume that you agree that they were not
useful ideas?  Useful utility functions for the shell most probably,
but not much more than that.

nore...@msnkbrown.net said:
  | Not that we have a standard, this takes longer. 

Longer to get into the standard, yes, that was always going to be
the case, as new interfaces should be demonstrated as useful by
being copied into other versions (as you indicated would happen, and
did for many other things) first, and then standardised after that.
Necessarily that takes longer.  It also doesn't help this that new
versions of the standard are (for good reason) rare - so there can
be significant delays encountered, even if something useful is added,
before it actually appears.

But if you're suggesting that the existence of the standard makes it
slower for useful ideas to be copied, then I can't understand why that
would be so, the relationship seems unlikely to me.   Eg: to use your
own example, the strl*() functions have been copied almost everywhere,
and that all happened quite quickly - that they haven't been added to
the standard (and even that they have problems) didn't impact that,
they do have uses, so were widely implemented, fairly quickly.

It seems to me as if you're saying "I think xxx that is in system yyy
is useful, but it hasn't been copied to other systems - that must be
because it isn't in the standard".   Much more likely the explanation
is that it isn't really all that useful after all.

Now it is true that if such an interface is pushed through the standards
process, and included in the standard, then even if not useful,
implementations are likely to implement it - not because it is needed,
but just to conform to the standard ("what's one more useless piece of
junk that no-one uses going to harm us?")   But deliberately allowing that
to happen - forcing functions into the standard with the sole objective
of forcing implementations to implement something they normally wouldn't
bother with is abhorrent, and should never be condoned, by anyone.

Unfortunately, I suspect that is what is happening here.

nore...@msnkbrown.net said:
  | I would be happy, if problems like this one could disappear. 

So would I.   But does this mean that you're willing to withdraw
bugid:1138, and by so doing, make this exact problem disappear?
Not other ones like this one as well, unfortunately, but one is
better than none.

That would be welcome news.

kre
