Re: [PATCH] strbuf: clear errno before calling getdelim(3)

2017-08-12 Thread Jeff King
On Sun, Aug 13, 2017 at 07:51:34AM +0200, René Scharfe wrote:

> Am 13.08.2017 um 06:32 schrieb Jeff King:
> > On Sat, Aug 12, 2017 at 01:57:06PM +0200, René Scharfe wrote:
> > 
> >> But we probably want to check for other errors.  They look unlikely
> >> enough that we may get away with something like this:
> >>
> >>-   if (errno == ENOMEM)
> >>-   die("Out of memory, getdelim failed");
> >>+   if (errno || ferror(fp))
> >>+   die_errno(_("getdelim failed"));
> >>
> >> NB: The other errors are EINVAL (input pointers are NULL or the
> >> stream is invalid) and EOVERFLOW (read more than fits into ssize_t)
> >> according to POSIX and the Linux manpage.
> > 
> > Can't we also get any of the errors that fgetc() would return. I.e., any
> > normal read errors? We should return EOF on those, not die (and the
> > caller can check ferror()).
> 
> Yes, we can get those as well, and leaving error checking to the caller
> is a flexible way to handle them.  
> 
> Many of the existing callers don't seem to be bother, though.

Yes, but keep in mind that we need to match the non-getdelim()
implementation here (which itself tries to behave like more or less like
a dynamic fgets, including the eof/error handling). Because the callers
don't know which one they're getting.

-Peff


Re: [PATCH] strbuf: clear errno before calling getdelim(3)

2017-08-12 Thread René Scharfe
Am 13.08.2017 um 06:32 schrieb Jeff King:
> On Sat, Aug 12, 2017 at 01:57:06PM +0200, René Scharfe wrote:
> 
>> But we probably want to check for other errors.  They look unlikely
>> enough that we may get away with something like this:
>>
>>  -   if (errno == ENOMEM)
>>  -   die("Out of memory, getdelim failed");
>>  +   if (errno || ferror(fp))
>>  +   die_errno(_("getdelim failed"));
>>
>> NB: The other errors are EINVAL (input pointers are NULL or the
>> stream is invalid) and EOVERFLOW (read more than fits into ssize_t)
>> according to POSIX and the Linux manpage.
> 
> Can't we also get any of the errors that fgetc() would return. I.e., any
> normal read errors? We should return EOF on those, not die (and the
> caller can check ferror()).

Yes, we can get those as well, and leaving error checking to the caller
is a flexible way to handle them.  

Many of the existing callers don't seem to be bother, though.

René


Re: [PATCH] strbuf: clear errno before calling getdelim(3)

2017-08-12 Thread Jeff King
On Sat, Aug 12, 2017 at 01:57:06PM +0200, René Scharfe wrote:

> But we probably want to check for other errors.  They look unlikely
> enough that we may get away with something like this:
> 
>   -   if (errno == ENOMEM)
>   -   die("Out of memory, getdelim failed");
>   +   if (errno || ferror(fp))
>   +   die_errno(_("getdelim failed"));
> 
> NB: The other errors are EINVAL (input pointers are NULL or the
> stream is invalid) and EOVERFLOW (read more than fits into ssize_t)
> according to POSIX and the Linux manpage.

Can't we also get any of the errors that fgetc() would return. I.e., any
normal read errors? We should return EOF on those, not die (and the
caller can check ferror()).

-Peff


Re: [PATCH] strbuf: clear errno before calling getdelim(3)

2017-08-12 Thread René Scharfe
Am 12.08.2017 um 13:57 schrieb René Scharfe:
> Update: Just noticed that on the BSDs getdelim(3) doesn't set errno
> to ENOMEM on allocation failure, but does set the error indicator.
> That might be an oversight on their part, but that means we
> certainly *need* to check ferror().  And an unchanged errno can
> mean ENOMEM there.  Ugh..

I take that back.  The memory allocation function called by getdelim
will of course set errno on failure.  I got fooled by the manpages,
which don't mention ENOMEM, e.g.: https://man.openbsd.org/getdelim.

René


Re: [PATCH] strbuf: clear errno before calling getdelim(3)

2017-08-12 Thread René Scharfe
Am 12.08.2017 um 12:02 schrieb Simon Ruderich:
> On Fri, Aug 11, 2017 at 10:52:51AM +0200, René Scharfe wrote:
>> Am 11.08.2017 um 09:50 schrieb Simon Ruderich:
>>> On Thu, Aug 10, 2017 at 10:56:40PM +0200, René Scharfe wrote:
 getdelim(3) returns -1 at the end of the file and if it encounters an
 error, but sets errno only in the latter case.  Set errno to zero before
 calling it to avoid misdiagnosing an out-of-memory condition due to a
 left-over value from some other function call.
>>>
>>> getdelim(3p) doesn't explicitly forbid changing the errno on EOF:
>>>
>>>   If no characters were read, and the end-of-file indicator for
>>>   the stream is set, or if the stream is at end-of-file, the
>>>   end-of-file indicator for the stream shall be set and the
>>>   function shall return −1. If an error occurs, the error
>>>   indicator for the stream shall be set, and the function shall
>>>   return −1 and set errno to indicate the error.
>>
>> [snip]
>>
>>> I don't think that it matters in practice, but the "most" correct
>>> way to handle this would be to check if feof(3) is true to check
>>> for the non-errno case.
>>
>> Only if errors at EOF are guaranteed to be impossible.  Imagine
>> getdelim being able to read to the end and then failing to get memory
>> for the final NUL.  Other scenarios may be possible.
> 
> Good point. Instead of feof(3), checking ferror(3) should handle
> that properly. It's guaranteed to be set by getdelim for any
> error.

Only if we have a POSIXly correct implementation of getdelim(3). On
Linux its manpage doesn't mention the error indicator, but says that
the function started out as a GNU extension before becoming
standardized, so I'm not sure we can depend on that everywhere.

But we probably want to check for other errors.  They look unlikely
enough that we may get away with something like this:

-   if (errno == ENOMEM)
-   die("Out of memory, getdelim failed");
+   if (errno || ferror(fp))
+   die_errno(_("getdelim failed"));

NB: The other errors are EINVAL (input pointers are NULL or the
stream is invalid) and EOVERFLOW (read more than fits into ssize_t)
according to POSIX and the Linux manpage.

Update: Just noticed that on the BSDs getdelim(3) doesn't set errno
to ENOMEM on allocation failure, but does set the error indicator.
That might be an oversight on their part, but that means we
certainly *need* to check ferror().  And an unchanged errno can
mean ENOMEM there.  Ugh..

> An edge case is if the error indicator was already set before
> calling getdelim() and the errno was already set to ENOMEM. This
> could be fixed by checking ferror() before calling getdelim, but
> I'm not sure if that's necessary:

Treating an error as end-of-file is not a good idea.  An invalid
stream should not be passed to strbuf_getwholeline() in the first
place.  We could call die() or BUG(), but I think we may leave
that check to getdelim(3), unless we learn about an implementation
that is unprepared to reject streams with the error indicator set.

René


Re: [PATCH] strbuf: clear errno before calling getdelim(3)

2017-08-12 Thread Simon Ruderich
On Fri, Aug 11, 2017 at 10:52:51AM +0200, René Scharfe wrote:
> Am 11.08.2017 um 09:50 schrieb Simon Ruderich:
>> On Thu, Aug 10, 2017 at 10:56:40PM +0200, René Scharfe wrote:
>>> getdelim(3) returns -1 at the end of the file and if it encounters an
>>> error, but sets errno only in the latter case.  Set errno to zero before
>>> calling it to avoid misdiagnosing an out-of-memory condition due to a
>>> left-over value from some other function call.
>>
>> getdelim(3p) doesn't explicitly forbid changing the errno on EOF:
>>
>>  If no characters were read, and the end-of-file indicator for
>>  the stream is set, or if the stream is at end-of-file, the
>>  end-of-file indicator for the stream shall be set and the
>>  function shall return −1. If an error occurs, the error
>>  indicator for the stream shall be set, and the function shall
>>  return −1 and set errno to indicate the error.
>
> [snip]
>
>> I don't think that it matters in practice, but the "most" correct
>> way to handle this would be to check if feof(3) is true to check
>> for the non-errno case.
>
> Only if errors at EOF are guaranteed to be impossible.  Imagine
> getdelim being able to read to the end and then failing to get memory
> for the final NUL.  Other scenarios may be possible.

Good point. Instead of feof(3), checking ferror(3) should handle
that properly. It's guaranteed to be set by getdelim for any
error.

For example like this (as replacement for the original patch):

diff --git i/strbuf.c w/strbuf.c
index 89d22e3b0..831be21ce 100644
--- i/strbuf.c
+++ w/strbuf.c
@@ -495,7 +495,7 @@ int strbuf_getwholeline(struct strbuf *sb, FILE *fp, int 
term)
 * memory and retry, but that's unlikely to help for a malloc small
 * enough to hold a single line of input, anyway.
 */
-   if (errno == ENOMEM)
+   if (ferror(fp) && errno == ENOMEM)
die("Out of memory, getdelim failed");
 
/*

An edge case is if the error indicator was already set before
calling getdelim() and the errno was already set to ENOMEM. This
could be fixed by checking ferror() before calling getdelim, but
I'm not sure if that's necessary:

diff --git i/strbuf.c w/strbuf.c
index 89d22e3b0..4d394bb91 100644
--- i/strbuf.c
+++ w/strbuf.c
@@ -468,7 +468,7 @@ int strbuf_getwholeline(struct strbuf *sb, FILE *fp, int 
term)
 {
ssize_t r;
 
-   if (feof(fp))
+   if (feof(fp) || ferror(fp))
return EOF;
 
strbuf_reset(sb);

Regards
Simon
-- 
+ Privatsphäre ist notwendig
+ Ich verwende GnuPG http://gnupg.org
+ Öffentlicher Schlüssel: 0x92FEFDB7E44C32F9


Re: [PATCH] strbuf: clear errno before calling getdelim(3)

2017-08-11 Thread René Scharfe
Am 11.08.2017 um 09:50 schrieb Simon Ruderich:
> On Thu, Aug 10, 2017 at 10:56:40PM +0200, René Scharfe wrote:
>> getdelim(3) returns -1 at the end of the file and if it encounters an
>> error, but sets errno only in the latter case.  Set errno to zero before
>> calling it to avoid misdiagnosing an out-of-memory condition due to a
>> left-over value from some other function call.
> 
> getdelim(3p) doesn't explicitly forbid changing the errno on EOF:
> 
>  If no characters were read, and the end-of-file indicator for
>  the stream is set, or if the stream is at end-of-file, the
>  end-of-file indicator for the stream shall be set and the
>  function shall return −1. If an error occurs, the error
>  indicator for the stream shall be set, and the function shall
>  return −1 and set errno to indicate the error.
> 
> So a valid implementation could still set errno on EOF and also
> on another error (where it's required to set errno).

True, especially the part that other errors are possible.  But we can't
rely on errno being set on EOF because leaving it unchanged is allowed
as well in that
 case.

> I don't think that it matters in practice, but the "most" correct
> way to handle this would be to check if feof(3) is true to check
> for the non-errno case.

Only if errors at EOF are guaranteed to be impossible.  Imagine
getdelim being able to read to the end and then failing to get memory
for the final NUL.  Other scenarios may be possible.

René


Re: [PATCH] strbuf: clear errno before calling getdelim(3)

2017-08-11 Thread Simon Ruderich
On Thu, Aug 10, 2017 at 10:56:40PM +0200, René Scharfe wrote:
> getdelim(3) returns -1 at the end of the file and if it encounters an
> error, but sets errno only in the latter case.  Set errno to zero before
> calling it to avoid misdiagnosing an out-of-memory condition due to a
> left-over value from some other function call.

getdelim(3p) doesn't explicitly forbid changing the errno on EOF:

If no characters were read, and the end-of-file indicator for
the stream is set, or if the stream is at end-of-file, the
end-of-file indicator for the stream shall be set and the
function shall return −1. If an error occurs, the error
indicator for the stream shall be set, and the function shall
return −1 and set errno to indicate the error.

So a valid implementation could still set errno on EOF and also
on another error (where it's required to set errno).

I don't think that it matters in practice, but the "most" correct
way to handle this would be to check if feof(3) is true to check
for the non-errno case.

Regards
Simon
-- 
+ privacy is necessary
+ using gnupg http://gnupg.org
+ public key id: 0x92FEFDB7E44C32F9


Re: [PATCH] strbuf: clear errno before calling getdelim(3)

2017-08-10 Thread Junio C Hamano
René Scharfe  writes:

> getdelim(3) returns -1 at the end of the file and if it encounters an
> error, but sets errno only in the latter case.  Set errno to zero before
> calling it to avoid misdiagnosing an out-of-memory condition due to a
> left-over value from some other function call.
>
> Reported-by: Yaroslav Halchenko 
> Suggested-by: Junio C Hamano 
> Signed-off-by: Rene Scharfe 

Heh.  I mumble something vague then people more capable than me jump
in to take it to the conclusion, and still I get the credit.  I wish
all the debugging sessions were this easy ;-)

> ---
> Do we need to save and restore the original value of errno?  I doubt it,
> but didn't think deeply about it, yet.

We probably don't need to---a caller who knows it got an error
before calling this function and wants to use errno after doing so
should be stashing it away; after all, this function will clobber
errno when any of the library calls it makes fails and this is on
the I/O codepath, so anything can go wrong.

Thanks.

>
>  strbuf.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/strbuf.c b/strbuf.c
> index 89d22e3b09..323c49ceb3 100644
> --- a/strbuf.c
> +++ b/strbuf.c
> @@ -476,6 +476,7 @@ int strbuf_getwholeline(struct strbuf *sb, FILE *fp, int 
> term)
>   /* Translate slopbuf to NULL, as we cannot call realloc on it */
>   if (!sb->alloc)
>   sb->buf = NULL;
> + errno = 0;
>   r = getdelim(>buf, >alloc, term, fp);
>  
>   if (r > 0) {


Re: [PATCH] strbuf: clear errno before calling getdelim(3)

2017-08-10 Thread Yaroslav Halchenko

On Thu, 10 Aug 2017, Jeff King wrote:

> On Thu, Aug 10, 2017 at 10:56:40PM +0200, René Scharfe wrote:

> > getdelim(3) returns -1 at the end of the file and if it encounters an
> > error, but sets errno only in the latter case.  Set errno to zero before
> > calling it to avoid misdiagnosing an out-of-memory condition due to a
> > left-over value from some other function call.

> Looks good to me.

> > Do we need to save and restore the original value of errno?  I doubt it,
> > but didn't think deeply about it, yet.

> I'd say no. Anybody depending on strbuf_getwholeline() is clearly
> already wrong in the error case. And in general I think we assume that
> syscalls can clear errno on success if they choose to (this isn't a
> syscall, but obviously it is calling some).

Shouldn't ideally errno being reset to 0 upon check of the syscall
successful run right after that syscall?  (I also see some spots within
git code where it sets errno to ENOMEM)


-- 
Yaroslav O. Halchenko
Center for Open Neuroscience http://centerforopenneuroscience.org
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834   Fax: +1 (603) 646-1419
WWW:   http://www.linkedin.com/in/yarik


Re: [PATCH] strbuf: clear errno before calling getdelim(3)

2017-08-10 Thread Jeff King
On Thu, Aug 10, 2017 at 10:56:40PM +0200, René Scharfe wrote:

> getdelim(3) returns -1 at the end of the file and if it encounters an
> error, but sets errno only in the latter case.  Set errno to zero before
> calling it to avoid misdiagnosing an out-of-memory condition due to a
> left-over value from some other function call.

Looks good to me.

> Do we need to save and restore the original value of errno?  I doubt it,
> but didn't think deeply about it, yet.

I'd say no. Anybody depending on strbuf_getwholeline() is clearly
already wrong in the error case. And in general I think we assume that
syscalls can clear errno on success if they choose to (this isn't a
syscall, but obviously it is calling some).

-Peff


[PATCH] strbuf: clear errno before calling getdelim(3)

2017-08-10 Thread René Scharfe
getdelim(3) returns -1 at the end of the file and if it encounters an
error, but sets errno only in the latter case.  Set errno to zero before
calling it to avoid misdiagnosing an out-of-memory condition due to a
left-over value from some other function call.

Reported-by: Yaroslav Halchenko 
Suggested-by: Junio C Hamano 
Signed-off-by: Rene Scharfe 
---
Do we need to save and restore the original value of errno?  I doubt it,
but didn't think deeply about it, yet.

 strbuf.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/strbuf.c b/strbuf.c
index 89d22e3b09..323c49ceb3 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -476,6 +476,7 @@ int strbuf_getwholeline(struct strbuf *sb, FILE *fp, int 
term)
/* Translate slopbuf to NULL, as we cannot call realloc on it */
if (!sb->alloc)
sb->buf = NULL;
+   errno = 0;
r = getdelim(>buf, >alloc, term, fp);
 
if (r > 0) {
-- 
2.14.0