On 23/10/15 00:40, Pádraig Brady wrote: > On 22/10/15 21:01, Jim Meyering wrote: >> On Thu, Oct 22, 2015 at 12:22 PM, Assaf Gordon <[email protected]> wrote: >>> Hello, >>> >>> In a long-running pipeline, we've encountered the following error message >>> with 'csplit': >>> csplit: write error for ‘[FILENAME]` >>> >>> It is very likely stemming from out-of-disk-space situation, but because >>> there was no errno printed, it's hard to know for sure, making >>> troubleshooting a bit frustrating. >>> >>> The error is printed in 'close_output_file()', in response to checking >>> 'ferror()'. >>> >>> I wonder if it would be beneficial to add the errno information to the >>> error message. >>> >>> >From a cursory look it seems 'close_output_file()' is (almost?) always >>> >called after >>> 'dump_rest_of_file()' or 'save_line_to_file()' which themselves use >>> 'fwrite()'. >>> >>> So it could perhaps be assumed that if if 'ferror()' indicates an error on >>> the output stream, >>> then the last operation was fwrite? >>> The downside is that if my assumption doesn't hold, it could print a >>> misleading errno information. >>> >>> The following patch is a suggestion (not fully tested): >>> >>> === >>> diff --git a/src/csplit.c b/src/csplit.c >>> index d966df5..d2a0228 100644 >>> --- a/src/csplit.c >>> +++ b/src/csplit.c >>> @@ -1004,7 +1004,7 @@ close_output_file (void) >>> { >>> if (ferror (output_stream)) >>> { >>> - error (0, 0, _("write error for %s"), quote (output_filename)); >>> + error (0, errno, _("write error for %s"), quote >>> (output_filename)); >>> output_stream = NULL; >>> cleanup_fatal (); >>> } >>> === >> >> Hi Assaf, >> >> Thanks for investigating. >> >> That's a fundamental limitation of streams. >> When detecting that error via ferror, the errno value is not >> guaranteed to be useful. From what I recall of the fwrite >> specification, even immediately after a failed fwrite, errno is not >> guaranteed to be useful, but in practice, it always is, >> so coreutils programs do rely on that. > > Right, and whether to check all writes is a > a trade off between maintainability and precise errors. > Checking every output function in addition to checking ferror on close > (as is done in close_stdout for example) will ensure _precise_ errors in > all edge cases, while just relying on ferror will always diagnose errors, > though in certain edge cases not with a precise error message. as you've seen. > Details at: > http://www.gnu.org/ghm/2011/paris/slides/jim-meyering-goodbye-world.pdf > > >>> I'll be able to troubleshoot and provide more information in couple of days. >>> A more elaborate solution is to save the errno after 'fwrite()' in a >>> variable, >>> and print that variable. I can send such a patch if that's a better idea. >> >> Yes, this sounds better. > > Yes the tradeoff discussed above is a bad one in csplit's case > as it only has a single fwrite() call. We should check the output > of that and call into cleanup_fatal() to exit early to avoid redundant > processing when one can't write the current output file. This could > be tested with something like: timeout 10 yes | csplit ... /dev/full
Actually it's a bit awkward to test. You might have to resort to tests using require_gcc_shared_ and wrap write() to return ENOSPC. cheers, Pádraig.
