Hi,

what happens in fgetln(3) when the buffer runs out and reading more
bytes fails before a newline character is found?  There are two
cases:

 1. The stdio read buffer is empty at the time of the call,
    and the read(2) in __refill() fails:
    fgetln(3) returns NULL and sets errno and the stdio error
    indicator - even if it would have been possible to read
    fewer bytes than __refill() attempted.  So in this case,
    fgetln(3) does not attempt to return a partial line preceding
    a read error.

 2. The stdio read buffer is not empty at the time of the call
    or the first __refill() succeeds, but then a subsequent __refill()
    fails:
    fgetln(3) returns a buffer containing the concatenation of all
    strings that were returned by the __refill() calls before the
    failing one, but it sets errno and the stdio error indicator
    anyway.  There is a comment in the source code saying:
        /* EOF or error: return partial line */

This is problematic in two ways.

 1. The manual discusses exactly two cases: "successful completion"
    and "otherwise", the latter with subcases "end-of-file" and "error".
    It says that in the "error" case, errno and the stdio error
    indicator will be set, and that in the "end-of-file" case,
    the stdio error indicator will not be set.  That doesn't strictly
    imply that in the "success" case, errno and/or the stdio error
    indicator will remain untouched, but it is some indication that
    that is intended.  In any case, a library function that returns
    sucessfully yet sets errno and the stdio error indicator anyway
    is not behaving very nicely.

 2. It does not make sense to me to return partial lines read before
    a subsequent read error in one case, but discard the partial
    line in another case, in particular since the choice of behaviour
    and the point where the string will be cut off depends on libc
    internal data that the user doesn't control, specifically, on
    the size of the internal read buffer (sometimes BUFSIZ, sometimes
    something else), the number of bytes already read previously,
    and the position of the read pointer in the read buffer.  The
    string will almost always be cut before the last byte that could
    have been read, but the place of the cut must seem more or less
    random to the user.
    If it would really be important to return partial lines before
    read errors, reading would have to be done byte-by-byte.  But
    it doesn't seem important to me and i certainly think we should
    not do that.

So, to better match the manual, behave in a saner way, and be
consistent with itself, i think fgetln(3) should always fail and
return NULL when a read error occurs, even if it happened to be
possible to read a partial line before the read error.

The new code admittedly looks slightly awkward, but i didn't manage
to do better because __refill() encodes the return value from read(2)
in a way making it painful to recover.  Saving, inspecting, and
restoring errno would be about on par in terms of ugliness, but
minimally less logical and more fragile: What we really want to
know is whether __refill() detected a stdio read error, not whether
just anything whatsoever went wrong somewhere.

I found this minibug when discussing fgetwln(3) issues with Andrey
Chernov <ache at FreeBSD dot org>.  Once we have decided how fgetln(3)
ought to behave, i may have to do adjustments to fgetwln(3) as well.

I am well aware that we discourage using fgetln(3) in new code and
recommend the standard getline(3) instead, and that we even tend
to replace fgetln(3) with getline(3) when working on existing
codebases.  But on the one hand, as long as we don't delete a
function - and this one seems far from deletion to me - it should
be maintained as well as we can.  Besides, fgetwln(3) is still quite
relevant because i'm not aware of a getline(3) style replacement
for fgetwln(3), and as long as fgetwln(3) is needed, fgetln(3)
should certainly be maintained as well.

OK?
  Ingo


Index: fgetln.c
===================================================================
RCS file: /cvs/src/lib/libc/stdio/fgetln.c,v
retrieving revision 1.14
diff -u -p -r1.14 fgetln.c
--- fgetln.c    31 Aug 2015 02:53:57 -0000      1.14
+++ fgetln.c    24 Aug 2016 14:27:56 -0000
@@ -67,6 +67,7 @@ fgetln(FILE *fp, size_t *lenp)
        char *ret;
        size_t len;
        size_t off;
+       short saverr;
 
        FLOCKFILE(fp);
        _SET_ORIENTATION(fp, -1);
@@ -115,8 +116,15 @@ fgetln(FILE *fp, size_t *lenp)
                (void)memcpy((void *)(fp->_lb._base + off), (void *)fp->_p,
                    len - off);
                off = len;
-               if (__srefill(fp))
-                       break;  /* EOF or error: return partial line */
+               saverr = fp->_flags & __SERR;
+               fp->_flags &= ~__SERR;
+               if (__srefill(fp)) {
+                       if (fp->_flags & __SERR)
+                               goto error;
+                       fp->_flags |= saverr;
+                       break;  /* EOF: return last line */
+               }
+               fp->_flags |= saverr;
                if ((p = memchr((void *)fp->_p, '\n', fp->_r)) == NULL)
                        continue;
 

Reply via email to