Re: utilities and write errors

Geoff Clare via austin-group-l at The Open Group Thu, 01 Jul 2021 03:46:16 -0700

Robert Elz wrote, on 29 Jun 2021:
>
>     Date:        Tue, 29 Jun 2021 09:49:40 +0100
>     From:        "Geoff Clare via austin-group-l at The Open Group" <[email protected]>
>     Message-ID:  <20210629084940.GA8391@localhost>
> 
>   | You are wrong when you say it "printed it".  It tried to print it but
>   | failed to do so.
> 
> But how do you actually know?


Because it is a precondition of this discussion.  That is, what we are
debating is the required behaviour when pwd does not write to
standard output (because an error occurred).

> In the full filesystem case you might,
> but how do you really tell the difference between
> 
>       { sleep 1; pwd; } | :
> and
>       pwd | sleep 1
> 
> In neither case does the output get through the pipe.  If the sleep
> wasn't there, and we just had
> 
>       pwd | :
> 
> then sometimes we might get one behaviour, and other times the other.
> 
> I assume you're not suggesting that it is the responsibility of the
> application (pwd in this case) to verify that data sent to a pipe gets
> received by some process at the other end?
> 
> And if not that, then what's the point - sometimes the output is lost,
> and there's no error message, or non-zero exit status.   Other times you
> apparently demand that there is.   Not very consistent is it?

It's completely consistent as far as pwd is concerned. If pwd
successfully writes the directory (to file descriptor 1) it exits with
status 0; if it fails to write the directory it exits with non-zero
status and reports the error.

The point is that pwd must not ignore write errors (when the actual
write to the file descriptor is done). We are just using EPIPE as an
easy way to (portably) set up an error condition.

>   | > You might prefer that it "fflush(stdout); if (ferror(stdout)) ..." but
>   | > there's nothing explicit in the standard that says that it has to do that.
>   |
>   | There's nothing in the standard that requires pwd to be written in C,
>   | so any argument based on C functions has no merit.
> 
> There was no argument based upon C functions - the use of them was
> just a shorthand to make a point.
> 
> pwd does write to stdout - it does (the equivalent of, in whatever
> language it is written) printf("%s\n", directory);
> 
> Having done that it exits.   It is finished.

The standard says nothing about internal buffering; it just requires
pwd to write the directory to file descriptor 1.  It also states that
exit status 0 means "successful completion".  Together, these
requirements mean that a conforming pwd must not exit with status 0
if it did not write the directory to fd 1.

If an implementor chooses to buffer the output, then it is their
responsibility to check that the buffer is successfully flushed to
fd 1 before exiting with status 0.

> exit() as part of its processing flushes the stdio buffers.  Including stdout.
> That's when the actual write from the process happens - but it is far too
> late then for pwd to do anything about it, exit() never returns, no pwd
> supplied code ever runs again (unless it set an atexit() - but it has no
> need for that, and doesn't).   I see nothing in the standard allowing exit()
> to terminate the process with any exit status other than the one handed to
> it, do you?   What's more, it would be extremely unlikely for exit() to be
> writing messages to stderr, which as you mentioned in a previous message is
> a requirement if the exit code is not 0 (for most utilities).

As above, this is all irrelevant to what the standard requires.

As far as implementation detail goes, obviously if pwd uses stdio
buffering then in order to conform to the standard it must explicitly
fflush(stdout) and check there was no write error before exiting. 
I see from later in the thread that mksh has now been patched to do
exactly that. (Thanks Thorsten.)
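
For illustration only, here is a minimal sketch of that kind of check,
assuming an implementation written in C on top of stdio.  The name
print_dir_checked is hypothetical; it is not taken from mksh or from
any real pwd.

```c
#include <stdio.h>

/* Hypothetical helper: write dir plus a newline to `out`, then force the
 * stdio buffer out to the underlying file descriptor.
 * Returns 0 only if everything was written; -1 on any write error. */
static int print_dir_checked(FILE *out, const char *dir)
{
    if (fprintf(out, "%s\n", dir) < 0)
        return -1;              /* error detected while formatting/buffering */
    if (fflush(out) == EOF)
        return -1;              /* error detected when the buffer hit the fd */
    return ferror(out) ? -1 : 0; /* catch an error flag set at any earlier point */
}
```

A caller along these lines would write a diagnostic to standard error
and exit with non-zero status when the helper returns -1, and exit
with status 0 only when it returns 0.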

> Which returns to my point quoted above - I see nothing in the standard
> that requires applications to manually flush file descriptors, and check
> for errors, before they exit.  If there is something somewhere which
> requires this, please point it out, and I'll file a defect report to
> have that fixed.

The phrase "flush file descriptors" is meaningless.  A stdio buffer
is flushed *to* a file descriptor.

All of the requirements in the standard about things being written
to "standard output" are requirements that something is written to
file descriptor 1.  Use of stdio (or equivalent) buffering is an
irrelevant internal implementation detail.

Another point is that there is no distinction in the standard between
write errors and other errors.  Your argument that utilities should
be allowed to ignore write errors is also an argument in favour of
pwd being allowed to ignore getcwd() errors, or chmod to ignore
chmod() errors, etc.  Obviously that would be highly undesirable.

> 
>   | The standard requires that pwd writes to stdout.
> 
> The question is just what that means in terms of the interfaces available.
> pwd called printf, printf reported no error.   Stdout has been written to
> as far as I'm concerned when that has happened.

Sorry, my fault for using "stdout" as shorthand for "standard output".
The standard requires that pwd writes to file descriptor 1.

>   | It applies to every utility for which writing to stdout is something
>   | the utility needs to do in order to be considered to have completed
>   | successfully.
> 
> That's never going to fly, it is no surprise that Solaris didn't bother
> to implement that (even if you can show where the standard actually
> requires it).
> 
>   | No they wouldn't.  The only reason to set SIGPIPE to be ignored is
>   | because you want EPIPE error conditions to be diagnosed instead of
>   | causing the process to terminate. So fixing these bugs is exactly what
>   | users want.
> 
> Two problems with that - first, why should there be a distinction between
> what happens (as visible to the outside world) when SIGPIPE is ignored and
> when it isn't?

Because that distinction is the entire reason that SIGPIPE was
introduced in the first place.  It was a way to avoid "something | head"
writing an unwanted error message if "something" wrote more than 10 lines.
And this was rightly made the default because that's usually what users
want.  However, users who want to see the error message can achieve that
by setting SIGPIPE to be ignored.
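
The difference can be sketched in C as follows.  This is only an
illustration under stated assumptions: write_without_reader and its
return-value encoding are hypothetical, and the child process stands
in for any utility writing to a pipe whose reader has gone away.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: a child writes to a pipe that has no reader.
 * Returns the child's exit status (0 = write succeeded, 1 = write failed
 * with EPIPE, 2 = some other failure), 128 + signo if a signal killed
 * the child, or -1 if fork() or waitpid() failed. */
static int write_without_reader(int ignore_sigpipe)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        int fds[2];
        signal(SIGPIPE, ignore_sigpipe ? SIG_IGN : SIG_DFL);
        if (pipe(fds) != 0)
            _exit(2);
        close(fds[0]);                      /* no reader remains */
        if (write(fds[1], "/tmp\n", 5) < 0)
            _exit(errno == EPIPE ? 1 : 2);  /* the writer saw the error */
        _exit(0);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (WIFSIGNALED(status))
        return 128 + WTERMSIG(status);      /* killed before it could react */
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With SIGPIPE ignored the child gets EPIPE back from write() and can
diagnose it, so the function returns 1.  With the default action the
child is terminated by the signal and the function returns
128 + SIGPIPE.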

> And second, don't go assuming "the only reason" - you have no idea why SIGPIPE
> might have been ignored.

Okay, perhaps I should have said "main reason".  (Although I suspect
"only" is right, and I would be interested to hear of any other
legitimate reasons anyone can come up with.)

>   | >   | You are talking about pre-POSIX tradition.  The rules in POSIX.2 should have
>   | >   | put an end to that ancient dodgy behaviour when systems were updated to
>   | >   | conform to POSIX.
>   | > 
>   | > Do you have some evidence that they did?
>   |
>   | I said "should have".  Obviously that didn't happen, at least for
>   | some systems.
> 
> Do you have some evidence it happened for any?   For all (relevant)
> utilities, not just pwd of course.

The GNU implementations (including bash builtins) of the POSIX utilities
do it right.  Of course, I don't know whether they were already
well-behaved in this regard before they were updated to conform to
POSIX.2-1992.

> This is an example of why standards bodies (which in general are representative
> of almost nothing) MUST NOT ever attempt to legislate behaviour.
> 
> The objective is to document the behaviour that has been agreed (as 
> demonstrated by the implementations) so that later implementations know
> what is required to offer the same service, and users know what to expect
> when they use such a service (whatever it is, not just POSIX).

This is true for modern, mature, POSIX standards, but it was most
definitely not true for the original POSIX.1 and POSIX.2 standards.
POSIX.2-1992 in particular brought in a whole raft of requirements
that meant all implementations needed to make substantial changes
in order to conform. Today we take the benefits that this brought for
granted.  Proper handling of write errors should have been one of
those benefits.

> If you are certain that your way is the right way, the onus is upon you to
> convince the implementations of the merit of your argument, and have them
> change the way they work.

Now that mksh has been changed, it would appear you are the only hold-out
(of those who have participated in the discussion here).

Hopefully you will now see sense and agree that equivalent changes should
be made in NetBSD.

-- 
Geoff Clare <[email protected]>
The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England
