On 22/10/15 21:01, Jim Meyering wrote: > On Thu, Oct 22, 2015 at 12:22 PM, Assaf Gordon <[email protected]> wrote: >> Hello, >> >> In a long-running pipeline, we've encountered the following error message >> with 'csplit': >> csplit: write error for ‘[FILENAME]` >> >> It is very likely stemming from out-of-disk-space situation, but because >> there was no errno printed, it's hard to know for sure, making >> troubleshooting a bit frustrating. >> >> The error is printed in 'close_output_file()', in response to checking >> 'ferror()'. >> >> I wonder if it would be beneficial to add the errno information to the error >> message. >> >> >From a cursory look it seems 'close_output_file()' is (almost?) always >> >called after >> 'dump_rest_of_file()' or 'save_line_to_file()' which themselves use >> 'fwrite()'. >> >> So it could perhaps be assumed that if if 'ferror()' indicates an error on >> the output stream, >> then the last operation was fwrite? >> The downside is that if my assumption doesn't hold, it could print a >> misleading errno information. >> >> The following patch is a suggestion (not fully tested): >> >> === >> diff --git a/src/csplit.c b/src/csplit.c >> index d966df5..d2a0228 100644 >> --- a/src/csplit.c >> +++ b/src/csplit.c >> @@ -1004,7 +1004,7 @@ close_output_file (void) >> { >> if (ferror (output_stream)) >> { >> - error (0, 0, _("write error for %s"), quote (output_filename)); >> + error (0, errno, _("write error for %s"), quote >> (output_filename)); >> output_stream = NULL; >> cleanup_fatal (); >> } >> === > > Hi Assaf, > > Thanks for investigating. > > That's a fundamental limitation of streams. > When detecting that error via ferror, the errno value is not > guaranteed to be useful. From what I recall of the fwrite > specification, even immediately after a failed fwrite, errno is not > guaranteed to be useful, but in practice, it always is, > so coreutils programs do rely on that.
Right, and whether to check all writes is a a trade off between maintainability and precise errors. Checking every output function in addition to checking ferror on close (as is done in close_stdout for example) will ensure _precise_ errors in all edge cases, while just relying on ferror will always diagnose errors, though in certain edge cases not with a precise error message. as you've seen. Details at: http://www.gnu.org/ghm/2011/paris/slides/jim-meyering-goodbye-world.pdf >> I'll be able to troubleshoot and provide more information in couple of days. >> A more elaborate solution is to save the errno after 'fwrite()' in a >> variable, >> and print that variable. I can send such a patch if that's a better idea. > > Yes, this sounds better. Yes the tradeoff discussed above is a bad one in csplit's case as it only has a single fwrite() call. We should check the output of that and call into cleanup_fatal() to exit early to avoid redundant processing when one can't write the current output file. This could be tested with something like: timeout 10 yes | csplit ... /dev/full thanks, Pádraig.
