Hi Collin,

Collin Funk wrote:
> Some context for the attached patch, since this discussion was ~2
> months ago. I made changes to getdelim/getline in glibc to null
> terminate the buffer upon reading EOF [1]. At the time, this felt more
> compliant with POSIX's description and protected callers from using
> uninitialized memory when the given file stream was empty.
>
> Eric Blake alerted us that this broke a program which got the last line
> of 'du' like this:
>
>     while (getline (&line, &len, fp) != -1)
>       ;
>     /* Process LINE.  */
>
> The old glibc behavior would store the last line in LINE after the
> loop. The new behavior would store a NUL in the first byte after the
> loop.
>
> I did some testing and these platforms would have the last line in LINE
> after the loop [3]:
>
>     1. Fedora 43 (glibc 2.42)
>     2. AIX 7.1.1 and AIX 7.3.3
>     3. Alpine 3.19.8 (musl 1.2.4_git20230717)
>     4. OpenBSD 7.7
>
> These platforms would have a NUL in the first byte after the loop:
>
>     1. MacOS 12.6
>     2. Solaris 11
>     3. FreeBSD 14.3
>
> POSIX made this behavior a bit more clearly undefined (defined as
> undefined?), so that both implementations are obviously conformant
> [4]. Specifically, by adding this line:
>
>     If the return value is -1, the contents of *lineptr are
>     indeterminate.
>
> In glibc, we decided to null terminate the buffer only after
> getdelim/getline does the initial allocation. That is, it protects you
> from undefined behavior when FP is an empty stream here:
>
>     char *line = NULL;
>     size_t len = 0;
>     ssize_t result = getline (&line, &len, fp);
>     /* LINE is safe to use.  */
>
> It does not protect you from undefined behavior when FP is an empty
> stream here:
>
>     char *line = malloc (1);
>     size_t len = 1;
>     ssize_t result = getline (&line, &len, fp);
>     /* LINE is uninitialized if FP is empty.  */
>
> Most of the time getdelim is used like in the first example, so
> protecting the caller is nice there. It also does not break applications
> which rely on the existing glibc behavior, i.e., being able to use the
> last line of the file after getdelim returns -1.
>
> [1] https://inbox.sourceware.org/libc-alpha/c66b228db6ee9c6ab54f3580c2d68ac39707aa59.1759979441.git.collin.fu...@gmail.com/
> [2] https://lists.gnu.org/archive/html/bug-gnulib/2025-10/msg00096.html
> [3] https://inbox.sourceware.org/libc-alpha/[email protected]/
> [4] https://www.austingroupbugs.net/bug_view_page.php?bug_id=1953

Thanks for this summary of the (probably long) mail thread.

> I've attached a (hopefully) mostly complete patch. It has been tested on
> glibc 2.42 and glibc 2.43, and works as expected for both. That is, the
> functions are replaced on glibc 2.42 and not on glibc 2.43.

I disagree with the direction of this patch.

Your previous commits from 2025-10-10 were based on the assumption that
glibc 2.42 has a bug. But now that POSIX has relaxed the requirements [4],
it's not a bug any more: all three behaviours are POSIX compliant. It's
merely a portability problem.

Since, additionally, this portability problem is not frequently encountered,
and glibc 2.43 fixes only half of the "bug", I would suggest that
  - the modules 'getdelim', 'getline' guarantee only what POSIX guarantees,
  - there is no module 'getdelim-gnu'.

So, getdelim.m4 should be changed so that all three implementations are
accepted:
  - the one from glibc 2.42, musl, OpenBSD, AIX,
  - the one from macOS, FreeBSD, Solaris,
  - the one from glibc 2.43,
and none of these system functions gets overridden.

And the documentation should be changed like this: change getdelim.texi,
moving

  This function does not NUL terminate the buffer when the first character
  read is EOF on some platforms:
  @c https://sourceware.org/PR28038
  glibc 2.42.

to the section "Portability problems not fixed by Gnulib".
And getline.texi likewise.

Bruno
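
For illustration only, here is a minimal sketch of how a caller could
fetch the last line of a stream while relying only on what POSIX
guarantees, i.e. without looking at the contents of *lineptr once
getline has returned -1. The helper name read_last_line is hypothetical
and is not part of Gnulib or of the patch under discussion. The sketch
assumes that copying each line is acceptable and that the buffer may
still be passed to free after a -1 return.

```c
#ifndef _POSIX_C_SOURCE
# define _POSIX_C_SOURCE 200809L   /* for getline and ssize_t */
#endif
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: return a malloc'd copy of the last line of FP,
   or NULL if FP yields no line or memory allocation fails.
   The caller frees the result.  */
static char *
read_last_line (FILE *fp)
{
  char *line = NULL;      /* buffer owned by getline */
  size_t len = 0;
  char *last = NULL;      /* our own copy of the most recent line */
  size_t last_size = 0;
  ssize_t n;

  while ((n = getline (&line, &len, fp)) != -1)
    {
      size_t needed = (size_t) n + 1;   /* line bytes plus the NUL */
      if (last_size < needed)
        {
          char *tmp = realloc (last, needed);
          if (tmp == NULL)
            {
              free (last);
              free (line);
              return NULL;
            }
          last = tmp;
          last_size = needed;
        }
      /* Copy while the contents of LINE are still well defined.  */
      memcpy (last, line, needed);
    }

  /* After -1 the contents of LINE are indeterminate, so it is only
     freed here and never read.  */
  free (line);
  return last;
}
```

A caller would pass an open stream, process the returned string if it
is not NULL, and free it afterwards. The extra copy per line is the
price for not depending on any of the three behaviours listed above.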
