Re: [PATCH] strbuf: clear errno before calling getdelim(3)
On Sun, Aug 13, 2017 at 07:51:34AM +0200, René Scharfe wrote: > Am 13.08.2017 um 06:32 schrieb Jeff King: > > On Sat, Aug 12, 2017 at 01:57:06PM +0200, René Scharfe wrote: > > > >> But we probably want to check for other errors. They look unlikely > >> enough that we may get away with something like this: > >> > >>- if (errno == ENOMEM) > >>- die("Out of memory, getdelim failed"); > >>+ if (errno || ferror(fp)) > >>+ die_errno(_("getdelim failed")); > >> > >> NB: The other errors are EINVAL (input pointers are NULL or the > >> stream is invalid) and EOVERFLOW (read more than fits into ssize_t) > >> according to POSIX and the Linux manpage. > > > > Can't we also get any of the errors that fgetc() would return. I.e., any > > normal read errors? We should return EOF on those, not die (and the > > caller can check ferror()). > > Yes, we can get those as well, and leaving error checking to the caller > is a flexible way to handle them. > > Many of the existing callers don't seem to be bother, though. Yes, but keep in mind that we need to match the non-getdelim() implementation here (which itself tries to behave like more or less like a dynamic fgets, including the eof/error handling). Because the callers don't know which one they're getting. -Peff
Re: [PATCH] strbuf: clear errno before calling getdelim(3)
Am 13.08.2017 um 06:32 schrieb Jeff King: > On Sat, Aug 12, 2017 at 01:57:06PM +0200, René Scharfe wrote: > >> But we probably want to check for other errors. They look unlikely >> enough that we may get away with something like this: >> >> - if (errno == ENOMEM) >> - die("Out of memory, getdelim failed"); >> + if (errno || ferror(fp)) >> + die_errno(_("getdelim failed")); >> >> NB: The other errors are EINVAL (input pointers are NULL or the >> stream is invalid) and EOVERFLOW (read more than fits into ssize_t) >> according to POSIX and the Linux manpage. > > Can't we also get any of the errors that fgetc() would return. I.e., any > normal read errors? We should return EOF on those, not die (and the > caller can check ferror()). Yes, we can get those as well, and leaving error checking to the caller is a flexible way to handle them. Many of the existing callers don't seem to be bother, though. René
Re: [PATCH] strbuf: clear errno before calling getdelim(3)
On Sat, Aug 12, 2017 at 01:57:06PM +0200, René Scharfe wrote: > But we probably want to check for other errors. They look unlikely > enough that we may get away with something like this: > > - if (errno == ENOMEM) > - die("Out of memory, getdelim failed"); > + if (errno || ferror(fp)) > + die_errno(_("getdelim failed")); > > NB: The other errors are EINVAL (input pointers are NULL or the > stream is invalid) and EOVERFLOW (read more than fits into ssize_t) > according to POSIX and the Linux manpage. Can't we also get any of the errors that fgetc() would return. I.e., any normal read errors? We should return EOF on those, not die (and the caller can check ferror()). -Peff
Re: [PATCH] strbuf: clear errno before calling getdelim(3)
Am 12.08.2017 um 13:57 schrieb René Scharfe: > Update: Just noticed that on the BSDs getdelim(3) doesn't set errno > to ENOMEM on allocation failure, but does set the error indicator. > That might be an oversight on their part, but that means we > certainly *need* to check ferror(). And an unchanged errno can > mean ENOMEM there. Ugh.. I take that back. The memory allocation function called by getdelim will of course set errno on failure. I got fooled by the manpages, which don't mention ENOMEM, e.g.: https://man.openbsd.org/getdelim. René
Re: [PATCH] strbuf: clear errno before calling getdelim(3)
Am 12.08.2017 um 12:02 schrieb Simon Ruderich: > On Fri, Aug 11, 2017 at 10:52:51AM +0200, René Scharfe wrote: >> Am 11.08.2017 um 09:50 schrieb Simon Ruderich: >>> On Thu, Aug 10, 2017 at 10:56:40PM +0200, René Scharfe wrote: getdelim(3) returns -1 at the end of the file and if it encounters an error, but sets errno only in the latter case. Set errno to zero before calling it to avoid misdiagnosing an out-of-memory condition due to a left-over value from some other function call. >>> >>> getdelim(3p) doesn't explicitly forbid changing the errno on EOF: >>> >>> If no characters were read, and the end-of-file indicator for >>> the stream is set, or if the stream is at end-of-file, the >>> end-of-file indicator for the stream shall be set and the >>> function shall return −1. If an error occurs, the error >>> indicator for the stream shall be set, and the function shall >>> return −1 and set errno to indicate the error. >> >> [snip] >> >>> I don't think that it matters in practice, but the "most" correct >>> way to handle this would be to check if feof(3) is true to check >>> for the non-errno case. >> >> Only if errors at EOF are guaranteed to be impossible. Imagine >> getdelim being able to read to the end and then failing to get memory >> for the final NUL. Other scenarios may be possible. > > Good point. Instead of feof(3), checking ferror(3) should handle > that properly. It's guaranteed to be set by getdelim for any > error. Only if we have a POSIXly correct implementation of getdelim(3). On Linux its manpage doesn't mention the error indicator, but says that the function started out as a GNU extension before becoming standardized, so I'm not sure we can depend on that everywhere. But we probably want to check for other errors. They look unlikely enough that we may get away with something like this: - if (errno == ENOMEM) - die("Out of memory, getdelim failed"); + if (errno || ferror(fp)) + die_errno(_("getdelim failed")); NB: The other errors are EINVAL (input pointers are NULL or the stream is invalid) and EOVERFLOW (read more than fits into ssize_t) according to POSIX and the Linux manpage. Update: Just noticed that on the BSDs getdelim(3) doesn't set errno to ENOMEM on allocation failure, but does set the error indicator. That might be an oversight on their part, but that means we certainly *need* to check ferror(). And an unchanged errno can mean ENOMEM there. Ugh.. > An edge case is if the error indicator was already set before > calling getdelim() and the errno was already set to ENOMEM. This > could be fixed by checking ferror() before calling getdelim, but > I'm not sure if that's necessary: Treating an error as end-of-file is not a good idea. An invalid stream should not be passed to strbuf_getwholeline() in the first place. We could call die() or BUG(), but I think we may leave that check to getdelim(3), unless we learn about an implementation that is unprepared to reject streams with the error indicator set. René
Re: [PATCH] strbuf: clear errno before calling getdelim(3)
On Fri, Aug 11, 2017 at 10:52:51AM +0200, René Scharfe wrote: > Am 11.08.2017 um 09:50 schrieb Simon Ruderich: >> On Thu, Aug 10, 2017 at 10:56:40PM +0200, René Scharfe wrote: >>> getdelim(3) returns -1 at the end of the file and if it encounters an >>> error, but sets errno only in the latter case. Set errno to zero before >>> calling it to avoid misdiagnosing an out-of-memory condition due to a >>> left-over value from some other function call. >> >> getdelim(3p) doesn't explicitly forbid changing the errno on EOF: >> >> If no characters were read, and the end-of-file indicator for >> the stream is set, or if the stream is at end-of-file, the >> end-of-file indicator for the stream shall be set and the >> function shall return −1. If an error occurs, the error >> indicator for the stream shall be set, and the function shall >> return −1 and set errno to indicate the error. > > [snip] > >> I don't think that it matters in practice, but the "most" correct >> way to handle this would be to check if feof(3) is true to check >> for the non-errno case. > > Only if errors at EOF are guaranteed to be impossible. Imagine > getdelim being able to read to the end and then failing to get memory > for the final NUL. Other scenarios may be possible. Good point. Instead of feof(3), checking ferror(3) should handle that properly. It's guaranteed to be set by getdelim for any error. For example like this (as replacement for the original patch): diff --git i/strbuf.c w/strbuf.c index 89d22e3b0..831be21ce 100644 --- i/strbuf.c +++ w/strbuf.c @@ -495,7 +495,7 @@ int strbuf_getwholeline(struct strbuf *sb, FILE *fp, int term) * memory and retry, but that's unlikely to help for a malloc small * enough to hold a single line of input, anyway. */ - if (errno == ENOMEM) + if (ferror(fp) && errno == ENOMEM) die("Out of memory, getdelim failed"); /* An edge case is if the error indicator was already set before calling getdelim() and the errno was already set to ENOMEM. This could be fixed by checking ferror() before calling getdelim, but I'm not sure if that's necessary: diff --git i/strbuf.c w/strbuf.c index 89d22e3b0..4d394bb91 100644 --- i/strbuf.c +++ w/strbuf.c @@ -468,7 +468,7 @@ int strbuf_getwholeline(struct strbuf *sb, FILE *fp, int term) { ssize_t r; - if (feof(fp)) + if (feof(fp) || ferror(fp)) return EOF; strbuf_reset(sb); Regards Simon -- + Privatsphäre ist notwendig + Ich verwende GnuPG http://gnupg.org + Öffentlicher Schlüssel: 0x92FEFDB7E44C32F9
Re: [PATCH] strbuf: clear errno before calling getdelim(3)
Am 11.08.2017 um 09:50 schrieb Simon Ruderich: > On Thu, Aug 10, 2017 at 10:56:40PM +0200, René Scharfe wrote: >> getdelim(3) returns -1 at the end of the file and if it encounters an >> error, but sets errno only in the latter case. Set errno to zero before >> calling it to avoid misdiagnosing an out-of-memory condition due to a >> left-over value from some other function call. > > getdelim(3p) doesn't explicitly forbid changing the errno on EOF: > > If no characters were read, and the end-of-file indicator for > the stream is set, or if the stream is at end-of-file, the > end-of-file indicator for the stream shall be set and the > function shall return −1. If an error occurs, the error > indicator for the stream shall be set, and the function shall > return −1 and set errno to indicate the error. > > So a valid implementation could still set errno on EOF and also > on another error (where it's required to set errno). True, especially the part that other errors are possible. But we can't rely on errno being set on EOF because leaving it unchanged is allowed as well in that case. > I don't think that it matters in practice, but the "most" correct > way to handle this would be to check if feof(3) is true to check > for the non-errno case. Only if errors at EOF are guaranteed to be impossible. Imagine getdelim being able to read to the end and then failing to get memory for the final NUL. Other scenarios may be possible. René
Re: [PATCH] strbuf: clear errno before calling getdelim(3)
On Thu, Aug 10, 2017 at 10:56:40PM +0200, René Scharfe wrote: > getdelim(3) returns -1 at the end of the file and if it encounters an > error, but sets errno only in the latter case. Set errno to zero before > calling it to avoid misdiagnosing an out-of-memory condition due to a > left-over value from some other function call. getdelim(3p) doesn't explicitly forbid changing the errno on EOF: If no characters were read, and the end-of-file indicator for the stream is set, or if the stream is at end-of-file, the end-of-file indicator for the stream shall be set and the function shall return −1. If an error occurs, the error indicator for the stream shall be set, and the function shall return −1 and set errno to indicate the error. So a valid implementation could still set errno on EOF and also on another error (where it's required to set errno). I don't think that it matters in practice, but the "most" correct way to handle this would be to check if feof(3) is true to check for the non-errno case. Regards Simon -- + privacy is necessary + using gnupg http://gnupg.org + public key id: 0x92FEFDB7E44C32F9
Re: [PATCH] strbuf: clear errno before calling getdelim(3)
René Scharfewrites: > getdelim(3) returns -1 at the end of the file and if it encounters an > error, but sets errno only in the latter case. Set errno to zero before > calling it to avoid misdiagnosing an out-of-memory condition due to a > left-over value from some other function call. > > Reported-by: Yaroslav Halchenko > Suggested-by: Junio C Hamano > Signed-off-by: Rene Scharfe Heh. I mumble something vague then people more capable than me jump in to take it to the conclusion, and still I get the credit. I wish all the debugging sessions were this easy ;-) > --- > Do we need to save and restore the original value of errno? I doubt it, > but didn't think deeply about it, yet. We probably don't need to---a caller who knows it got an error before calling this function and wants to use errno after doing so should be stashing it away; after all, this function will clobber errno when any of the library calls it makes fails and this is on the I/O codepath, so anything can go wrong. Thanks. > > strbuf.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/strbuf.c b/strbuf.c > index 89d22e3b09..323c49ceb3 100644 > --- a/strbuf.c > +++ b/strbuf.c > @@ -476,6 +476,7 @@ int strbuf_getwholeline(struct strbuf *sb, FILE *fp, int > term) > /* Translate slopbuf to NULL, as we cannot call realloc on it */ > if (!sb->alloc) > sb->buf = NULL; > + errno = 0; > r = getdelim(>buf, >alloc, term, fp); > > if (r > 0) {
Re: [PATCH] strbuf: clear errno before calling getdelim(3)
On Thu, 10 Aug 2017, Jeff King wrote: > On Thu, Aug 10, 2017 at 10:56:40PM +0200, René Scharfe wrote: > > getdelim(3) returns -1 at the end of the file and if it encounters an > > error, but sets errno only in the latter case. Set errno to zero before > > calling it to avoid misdiagnosing an out-of-memory condition due to a > > left-over value from some other function call. > Looks good to me. > > Do we need to save and restore the original value of errno? I doubt it, > > but didn't think deeply about it, yet. > I'd say no. Anybody depending on strbuf_getwholeline() is clearly > already wrong in the error case. And in general I think we assume that > syscalls can clear errno on success if they choose to (this isn't a > syscall, but obviously it is calling some). Shouldn't ideally errno being reset to 0 upon check of the syscall successful run right after that syscall? (I also see some spots within git code where it sets errno to ENOMEM) -- Yaroslav O. Halchenko Center for Open Neuroscience http://centerforopenneuroscience.org Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755 Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419 WWW: http://www.linkedin.com/in/yarik
Re: [PATCH] strbuf: clear errno before calling getdelim(3)
On Thu, Aug 10, 2017 at 10:56:40PM +0200, René Scharfe wrote: > getdelim(3) returns -1 at the end of the file and if it encounters an > error, but sets errno only in the latter case. Set errno to zero before > calling it to avoid misdiagnosing an out-of-memory condition due to a > left-over value from some other function call. Looks good to me. > Do we need to save and restore the original value of errno? I doubt it, > but didn't think deeply about it, yet. I'd say no. Anybody depending on strbuf_getwholeline() is clearly already wrong in the error case. And in general I think we assume that syscalls can clear errno on success if they choose to (this isn't a syscall, but obviously it is calling some). -Peff
[PATCH] strbuf: clear errno before calling getdelim(3)
getdelim(3) returns -1 at the end of the file and if it encounters an error, but sets errno only in the latter case. Set errno to zero before calling it to avoid misdiagnosing an out-of-memory condition due to a left-over value from some other function call. Reported-by: Yaroslav HalchenkoSuggested-by: Junio C Hamano Signed-off-by: Rene Scharfe --- Do we need to save and restore the original value of errno? I doubt it, but didn't think deeply about it, yet. strbuf.c | 1 + 1 file changed, 1 insertion(+) diff --git a/strbuf.c b/strbuf.c index 89d22e3b09..323c49ceb3 100644 --- a/strbuf.c +++ b/strbuf.c @@ -476,6 +476,7 @@ int strbuf_getwholeline(struct strbuf *sb, FILE *fp, int term) /* Translate slopbuf to NULL, as we cannot call realloc on it */ if (!sb->alloc) sb->buf = NULL; + errno = 0; r = getdelim(>buf, >alloc, term, fp); if (r > 0) { -- 2.14.0