Hi Collin,

Collin Funk wrote:
> Some context for the attached patch, since this discussion was ~2
> months ago. I made changes to getdelim/getline in glibc to null
> terminate the buffer upon reading EOF [1]. At the time, this felt more
> compliant with POSIX's description and protected callers from using
> uninitialized memory when the given file stream was empty.
> 
> Eric Blake alerted us that this broke a program which got the last line
> of 'du' like this:
> 
>         while (getline (&line, &len, fp) != -1)
>           ;
>         /* Process LINE.  */
> 
> The old glibc behavior would store the last line in LINE after the
> loop. The new behavior would store a NUL in the first byte after the
> loop.
> 
> I did some testing and these platforms would have the last line in LINE
> after the loop [3]:
> 
>     1. Fedora 43 (glibc 2.42)
>     2. AIX 7.1.1 and AIX 7.3.3
>     3. Alpine 3.19.8 (musl 1.2.4_git20230717)
>     4. OpenBSD 7.7
> 
> These platforms would have a NUL in the first byte after the loop:
> 
>     1. MacOS 12.6
>     2. Solaris 11
>     3. FreeBSD 14.3
> 
> POSIX made this behavior a bit more clearly undefined (defined as
> undefined?), so that both implementations are obviously conformant
> [4]. Specifically, by adding this line:
> 
>     If the return value is -1, the contents of *lineptr are
>     indeterminate.
> 
> In glibc, we decided to null terminate the buffer only after
> getdelim/getline does the initial allocation. That is, it protects you
> from undefined behavior when FP is an empty stream here:
> 
>     char *line = NULL;
>     size_t len = 0;
>     ssize_t result = getline (&line, &len, fp);
>     /* LINE is safe to use.  */
> 
> It does not protect you from undefined behavior when FP is an empty
> stream here:
> 
>     char *line = malloc (1);
>     size_t len = 1;
>     ssize_t result = getline (&line, &len, fp);
>     /* LINE is uninitialized if FP is empty.  */
> 
> Most of the time getdelim is used like in the first example, so
> protecting the caller is nice there. It also does not break applications
> which rely on the existing glibc behavior, i.e., being able to use the
> last line of the file after getdelim returns -1.
>
> [1] https://inbox.sourceware.org/libc-alpha/c66b228db6ee9c6ab54f3580c2d68ac39707aa59.1759979441.git.collin.fu...@gmail.com/
> [2] https://lists.gnu.org/archive/html/bug-gnulib/2025-10/msg00096.html
> [3] https://inbox.sourceware.org/libc-alpha/[email protected]/
> [4] https://www.austingroupbugs.net/bug_view_page.php?bug_id=1953

Thanks for this summary of the (probably long) mail thread.

> I've attached a (hopefully) mostly complete patch. It has been tested on
> glibc 2.42 and glibc 2.43, and works as expected for both. That is, the
> functions are replaced on glibc 2.42 and not on glibc 2.43.

I disagree with the direction of this patch. Your previous commits from
2025-10-10 were based on the assumption that glibc 2.42 has a bug.
But now that POSIX has relaxed the requirements [4], it's not a bug any
more: all three behaviours are POSIX compliant. It's merely a portability
problem.
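
For programs that want the last line of a stream, a portable approach is
to stop relying on the buffer contents after getline returns -1. Here is a
minimal sketch of that idea; the helper name read_last_line is
hypothetical, and copying every line is chosen for simplicity, not speed:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper, not part of glibc or Gnulib.
   Returns a malloc'd copy of the last line of FP, or NULL if FP
   yields no line or if memory allocation fails.  */
char *
read_last_line (FILE *fp)
{
  char *line = NULL;
  size_t len = 0;
  char *last = NULL;
  ssize_t n;

  assert (fp != NULL);
  while ((n = getline (&line, &len, fp)) != -1)
    {
      /* Save the line while its contents are still well defined:
         N bytes plus the terminating NUL.  */
      char *copy = realloc (last, (size_t) n + 1);
      if (copy == NULL)
        {
          free (last);
          last = NULL;
          break;
        }
      memcpy (copy, line, (size_t) n + 1);
      last = copy;
    }
  /* After -1, the contents of LINE are indeterminate; only free it.  */
  free (line);
  return last;
}
```

Since the copy is made inside the loop, the result does not depend on
which of the behaviours the platform's getline has.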

Since, additionally, this portability problem is not frequently encountered,
and glibc 2.43 fixes only half of the "bug", I would suggest that
  - the modules 'getdelim', 'getline' guarantee only what POSIX guarantees,
  - there is no module 'getdelim-gnu'.

So, getdelim.m4 should be changed so that all three implementations are
accepted:
  - the one from glibc 2.42, musl, OpenBSD, AIX,
  - the one from macOS, FreeBSD, Solaris,
  - the one from glibc 2.43,
and none of these system functions gets overridden.
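
For illustration, a caller that relies only on what POSIX guarantees
checks the return value before looking at the buffer. A minimal sketch,
with a hypothetical helper name read_first_line, that should behave the
same with all three implementations:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper, not part of glibc or Gnulib.
   Returns the first line of FP (the caller frees it), or NULL on
   end of file or error.  */
char *
read_first_line (FILE *fp)
{
  char *line = NULL;
  size_t len = 0;

  assert (fp != NULL);
  if (getline (&line, &len, fp) == -1)
    {
      /* The buffer may or may not have been allocated and may or may
         not be NUL terminated.  Do not read it; just free it.  */
      free (line);
      return NULL;
    }
  return line;
}
```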

And the documentation should be changed like this: change getdelim.texi,
moving

  This function does not NUL terminate the buffer when the first
  character read is EOF on some platforms:
  @c https://sourceware.org/PR28038
  glibc 2.42.

to the section "Portability problems not fixed by Gnulib". And getline.texi
likewise.

Bruno



