Possibly fgetln(3) CAVEATS could benefit from some of what you have
written here, although it is all in the manual already just in several
places.

Or at least a link to getline(3)?



On Sun, Jul 02, 2017 at 12:26:19AM +0200, Ingo Schwarze wrote:
> Hi,
> 
> Nicholas Marriott wrote on Sat, Jul 01, 2017 at 09:48:08PM +0100:
> 
> > Safer how?
> 
>  1. fgetln(3) returns a char array that is not NUL-terminated.
>     That is prone to buffer overrun bugs in general.
> 
>     getline(3) always returns proper NUL-terminated C strings.
> 
>  2. The array returned from fgetln(3) contains a trailing '\n' in
>     most cases, but if the stream ends without it, there won't be
>     one, which aggravates the danger of item 1: if a careless
>     programmer reads the buffer until '\n', they will get away with
>     it on all valid text input files, so the problem is likely to
>     go undetected during testing, but they may yet be vulnerable
>     to buffer overrun on crafted input lacking the trailing newline.
> 
>     With getline(3), lack of the trailing '\n' is much less of an
>     issue because people will likely look for the terminating '\0'
>     in the first place, not for the terminating '\n'.
> 
>  3. For this reason, the fgetln(3) manual page contains a lengthy
>     CAVEATS section recommending a complicated ten-line idiom to
>     NUL-terminate the buffer.  In general, if something requires
>     a ten-line idiom, it can hardly be called safe.  Adding the
>     missing two lines for error checking at the end, the complete
>     example is essentially 19 lines.
> 
>     EXAMPLES in fgetline(3) contains a code snipped doing about the
>     same, and even adding two more lines to strip '\n', it is only
>     10 lines long including full initialization and error checking.
>     So that's half the code, much more intuitive, and much simpler,
>     without any manual malloc(3) or memcpy(3) or manually assigning
>     '\0' to some position calculated from pointer arithmetics,
>     which we all know is prone to off-by-one.
> 
>  4. With fgetln(3), the returned pointer becomes invalid after the
>     next I/O operation on the stream.  So you say line = fgetln(..),
>     do something else that you consider unrelated because you never
>     use "line" in it, and the next time you access "line", boom.
>     That can be very surprising.  In general, library interfaces
>     passing out pointers to library-owned internal memory for the
>     user to mess with tend to be somewhat risky.
> 
>     By contrast, getline(3) copies the data to a dedicated buffer,
>     which won't become invalid on I/O.  So the next time you use
>     it, it is guaranteed to be still valid, even if that is much
>     later and the file in question has long been closed.  Of course,
>     if you pass the same buffer to getline(3) again, then after
>     that, it will contain the *next* line; but that is not surprising
>     at all, because that is exactly what you explicitly asked for.
> 
>  5. The fgetln(3) manual says that you are allowed to modify the
>     returned buffer, but item 4 still applies, which aggravates the
>     danger inherent in item 4.  Why would you modify the buffer,
>     unless you want to do something else after that and then, yet
>     later, look at your changes again?  But then you must be *very*
>     careful to not do I/O on that file descriptor while doing
>     the "something else", or you are in for a surprise.
> 
>     getline(3) returns an allocated buffer for the user to free(3).
>     That's an easy to understand operation mode with clearly
>     assigned responsibilies, and it is so obvious that the user
>     can alter the buffer - after all, it's their own buffer -
>     that the manual doesn't even need to mention it.
> 
> Your code follows the CAVEATS section in the fgetln(3) manual quite
> closely, and i would have been quite surprised if you had screwed
> such a thing up.
> 
> But with most other people, i would have felt somewhat worried
> about their code if they had asked me "safer how?" *after* using
> fgetln(3)...  >:-o
> 
> > I don't mind too much, but you will also need to initialize size
> > to 0 (it is currently uninitialized) before calling getline().
> 
> Ouch.  No interface can be so simple that there is no way to
> misuse it dangerously, even if it sets fewer traps for the
> unwary than fgetln(3).
> 
> >> Index: magic-load.c
> >> ===================================================================
> >> RCS file: /cvs/src/usr.bin/file/magic-load.c,v
> >> retrieving revision 1.25
> >> diff -u -p -r1.25 magic-load.c
> >> --- magic-load.c   1 Jul 2017 14:34:29 -0000       1.25
> >> +++ magic-load.c   1 Jul 2017 18:44:42 -0000
> >> @@ -1071,6 +1071,7 @@ magic_load(FILE *f, const char *path, in
> >>    struct magic_line       *ml = NULL, *parent, *parent0;
> >>    char                    *line, *tmp;
> >>    size_t                   size;
> >> +  ssize_t                  slen;
> >>    u_int                    at, level, n, i;
> >>  
> >>    m = xcalloc(1, sizeof *m);
> >> @@ -1084,15 +1085,11 @@ magic_load(FILE *f, const char *path, in
> >>  
> >>    at = 0;
> >>    tmp = NULL;
> >> -  while ((line = fgetln(f, &size))) {
> 
> Hmmm, the current code does not test ferror(3) after falling out
> of the "while" loop - is it intentional to ignore I/O errors?
> In that case, an explicit comment "ignore I/O errors" after
> the end of the while loop might make sense.
> 
> >> -          if (line[size - 1] == '\n')
> >> -                  line[size - 1] = '\0';
> >> -          else {
> >> -                  tmp = xmalloc(size + 1);
> >> -                  memcpy(tmp, line, size);
> >> -                  tmp[size] = '\0';
> >> -                  line = tmp;
> >> -          }
> >> +  while ((slen = getline(&tmp, &size, f)) != -1) {
> >> +          line = tmp;
> 
> I don't see a need for two char * variables here, that's an artifact
> of the typical fgetln(3) complications.  Just
> 
>       char    *line = NULL;
>       size_t   linesize = 0;
>       ssize_t  slen;
> 
>       while ((slen = getline(&line, &linesize, f)) != -1) {
>               if (line[slen - 1] == '\n')
>                       /* etc. */
>       }
>       free(line);
>       if (ferror(f))
>               err(1, "%s", path);
> 
> ought to suffice.
> 
> Yours,
>   Ingo
> 
> >> +          if (line[slen - 1] == '\n')
> >> +                  line[slen - 1] = '\0';
> >> +
> >>            at++;
> >>  
> >>            while (isspace((u_char)*line))

Reply via email to