> In my opinion, it is incorrect for a shell utility (such as sed or any
> of the textutils) to respond to EPIPE for stdout (and, arguably, any
> other output file) by
> 1) printing a message on stderr
> 2) exiting with non-0 status

It would be quite a nuisance to do anything else, so the question is,
is there sufficient reason to bother?  What harm results if the
program does this?

This is tricky to answer because a bunch of issues have gotten jumbled
up.  Please bear with me.

Let's call the behavior I proposed above (on EPIPE, print nothing on
stderr and exit with status 0) the "non-error EPIPE" proposal.  I'll
expand below on four points:

1) Eggert's reply sent me paging through various kernel sources and
   docs and cleared up the bug in my thinking (I had a misconception
   about when SIGPIPE could be delivered).  Solving the immediate bug
   that some of my users have reported does _not_ require the
   non-error EPIPE behavior: my bad.

2) Another proposal, call it the "explicit SIGPIPE SIG_DFL" proposal,
   says that the various utils should explicitly set SIGPIPE to
   SIG_DFL on start-up if they are written in a style that assumes
   SIGPIPE will be received and cause an immediate, non-0 exit.  This
   would be a comparatively easy change to make, and it would mean
   that the utils are more defensively programmed, at least from the
   traditional unix perspective.

3) The timing of signal delivery in POSIX is weakly specified.  Unless
   one makes certain additional assumptions about signal delivery
   timing, beyond what the standard says, the non-error EPIPE
   proposal is important.

4) SIGPIPE and EPIPE handling in traditional unix is a poorly designed
   corner case that interacts badly with traditional shell design.
   The non-error EPIPE proposal (and equivalent variations on it) is
   the only robust way I see to handle programming with pipes, and
   thus should still be considered for adoption as a convention for
   GNU systems.


* The originally reported bug and its implications

The bug that gave rise to the thread was that on some systems, shell
scripts containing idioms such as:

    if test -z "`sed .... | head -1`" ; then

were sometimes (not even consistently) generating (obviously
undesirable) stderr output from sed of the form:

    sed: Couldn't close {standard output}: Broken pipe

which is the generic response of sed to any error in ck_fclose.  The
glitchiness of the bug reinforced my misconception about signal
timing.  Similar bug reports have been generated regarding other utils
besides sed.

But, apparently, either the shell executing this code is sometimes
execing sed with SIGPIPE ignored (which is not the right thing to do)
-- or the users reporting this are using a buggy kernel.  So, I was
wrong -- these bug reports don't provide a reason to implement the
non-error EPIPE proposal.


* The explicit SIGPIPE SIG_DFL proposal

The various utils could arguably be improved if they explicitly
restore SIGPIPE to SIG_DFL when they start.  The argument is that
programs execing them shouldn't assume that they handle EPIPE
gracefully, and thus have no legitimate reason for execing them with
SIGPIPE ignored; the utils, for their part, are written with the
assumption that SIGPIPE has SIG_DFL behavior but have at best a weak
legitimate reason to assume it has that disposition when they start.
Thus, explicitly restoring SIGPIPE to SIG_DFL is just defensive
programming (it would have prevented the originally reported bugs,
even if they are caused by a buggy shell).


* The formal guarantees of signal delivery are weak

POSIX appears to be deliberately non-specific about the timing of
signals (IEEE 1003.1-1990 B.3.3.1.2 "Signal Generation and Delivery"),
though the idea of an "imprecise SIGPIPE" (one that's delivered
_after_ later system calls) was not anticipated and presumably not
intended.  Rather, everyone (except me, until recently :-) assumes
that all pending signals generated by a system call will be delivered
no later than during the return from that system call.
If that's not right -- if delivery of a non-blocked, non-ignored
SIGPIPE _can_ somehow wind up being delayed past the point at which a
subsequent system call has effect -- then I'd have to go back to at
least part (1) of the non-error EPIPE proposal.  (Which, I agree,
would be inconvenient, but it would be the only way, under that
circumstance, to not have glitchy shell scripts.)

I've only looked into BSD variants and Linux with regard to signal
timing.  Both are notably strict about delivering all pending signals
on entry to and exit from system calls.  So this is probably a purely
hypothetical concern.


* Traditional SIGPIPE and EPIPE handling is inherently non-robust

Part (2) of the non-error EPIPE proposal (about exit status) is nearly
orthogonal to everything else.  It's mostly a moot point in normal
shell scripts (since the exit status of non-final programs in
pipelines is, unfortunately, nearly irrelevant) -- I was thinking
about more modern shells in which the exit statuses of non-final
pipeline programs might be collected and significant.  There is an
interesting (I think) design problem still lurking that supports my
original opinion about (2) (and therefore, really, (1) as well).

Exit status of non-final programs in pipelines is a weak spot of
shell programming.  Using `sh' variants, those statuses are ignored,
presumably in anticipation that a SIGPIPE death is a more common case
than some kind of "real" error that you'd want, for example, to trip a
`set -e' exit from the shell.  When you need an absolutely robust
script, with status checking for all the programs in what is logically
a pipeline, you don't (in traditional sh) have the option of using a
simple pipeline -- you have to resort to some ugly idioms (and thus
the traditional design discourages robust programming by precluding
simple programming in robust applications).
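
As an illustration of the kind of idiom meant here, the following is a
minimal sketch, not taken from any particular script.  It replaces the
pipe with a temporary file so that each program's exit status can be
checked in plain sh.  The function name first_sorted_line is
hypothetical, and mktemp is assumed to be available (it is common but
not required by POSIX).

```shell
# Hypothetical helper: the checked equivalent of
#     sort "$1" | head -n 1
# written so that a failure of either program is noticed.
first_sorted_line() {
    # Assumption: mktemp exists on this system.
    tmp=`mktemp` || return 1

    # Stage 1: run sort to completion and check its status.
    if ! sort "$1" > "$tmp" ; then
        rm -f "$tmp"
        return 1
    fi

    # Stage 2: run head on the saved output and keep its status.
    head -n 1 "$tmp"
    status=$?

    rm -f "$tmp"
    return $status
}
```

The cost is plain: sort can no longer stop early once head has seen
enough, and a one-line pipeline has turned into a dozen lines.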
It isn't hard to imagine a more modern shell design in which the exit
statuses of all programs in a pipeline are collected and significant,
at least optionally.  That would make it possible to detect and handle
errors from all of the programs in a simple pipeline, not just the
last one.  Using the notation "|+" for this, one might write:

    set -e
    sort "$somefile" |+ head -1 > ,first-one

expecting the script to fail if "$somefile" cannot be opened or the
`sort' fails for some other reason.

But what if `sort' dies of a SIGPIPE?  Should that cause the script to
fail?  Well, if the non-error EPIPE proposal were in effect, then a
useful answer is "yes, it should cause the script to fail", but if
the non-error EPIPE proposal is not in effect, then there is no robust
answer.

Consider that `sort' may get SIGPIPE or EPIPE for two very distinct
reasons:

(1) it might get that error because the consumer for its stdout has
    stopped reading.  This is an ordinary and expected condition and
    means only that `sort' is free to exit at will.

(2) it might get that error because it has forked a subprocess to
    help with the sorting and that subprocess has died prematurely;
    this is a real error and should cause the script to fail.

So, I think it's useful for utilities to distinguish between
anticipated non-error EPIPE returns (or SIGPIPE deliveries) and
unanticipated EPIPE/SIGPIPE events -- and to reflect the difference
between the two in their exit status and stderr output.  This would be
a step in the direction of making GNU better than traditional unix.

I admit -- it's a pretty esoteric point.  It might be a more pressing
one if the idea of a shell that collects all exit statuses in a
pipeline were further developed and seen to be useful (or not).
Personally, I think that's a good direction to go in -- at least to
put some design thinking into.
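
To make the distinction concrete, here is a rough sketch of the
decision a status-collecting shell (or a script standing in for one)
would face for each non-final stage.  It assumes the common convention
that a child killed by signal N is reported as status 128+N, and that
SIGPIPE is signal 13, as on Linux and the BSDs; neither is guaranteed
everywhere.  The name classify_stage_status is made up for this
example.

```shell
# Hypothetical helper: classify the exit status of one pipeline stage.
# Assumptions: death by signal N is reported as 128+N, and SIGPIPE is 13.
classify_stage_status() {
    case "$1" in
        0)   echo ok ;;
        141) echo sigpipe ;;   # 128 + 13: the stage died of SIGPIPE
        *)   echo error ;;
    esac
}
```

A status of 141 by itself cannot say whether the stage died for reason
(1) or reason (2).  Only the utility knows which pipe broke, which is
why the distinction would have to be made inside the utility and
reflected in its exit status.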
The pipeline/software tools style of programming has a lot of nice
performance and ease-of-use characteristics that modern hw and kernels
just keep making more and more applicable.  SCSH gives a good
existence proof that sh is very far from the last word in shell
programming.  While we're in there, we could think about other
(related) improvements such as replacing the stream-RDB tools with new
versions that have more robust field syntaxes and more consistent
options and defaults.

-t

_______________________________________________
Bug-textutils mailing list
[EMAIL PROTECTED]
http://mail.gnu.org/mailman/listinfo/bug-textutils