On Mon, Oct 27, 2025 at 07:49:42PM +0100, Bruno Haible wrote:
> Hi Eric,
> 
> > I can see why an argument
> > can be made that -1 is not always an error return, and that when it is
> > not an error return that we should guarantee that the buf is
> > NUL-terminated.
> 
> Your wording "an argument can be made" is not strong enough, IMO.
> The sentences in
> <https://pubs.opengroup.org/onlinepubs/9799919799/functions/getdelim.html>
>   "The characters read, including any delimiter, shall be stored in the
>    object, and a terminating NUL added when the delimiter or end-of-file
>    is encountered."
> and
>   "If the end-of-file indicator for the stream is set, or if no characters
>    were read and the stream is at end-of-file, the end-of-file indicator
>    for the stream shall be set and the function shall return -1."
> *require*, unambiguously, that getdelim() operates like Collin implemented
> it now.

I still feel that the POSIX wording is ambiguous enough that it is
probably worth getting a clarification from the Austin Group.

> 
> > Pre-patch, it
> > was possible (although questionably portable, because the POSIX
> > wording is unclear) to call getline() in a loop until it returns -1,
> > and then have the contents of the final line of the file (assuming the
> > file was non-empty) in the buffer with no extra effort.  This works
> > great for grabbing the summary line of du(1), for example.
> 
> We generally agree that relying on implementation details that contradict
> the written specification is a bug, right? So, it has been a bug in nbdkit
> since at least 2008.

Nbdkit isn't that old.  What you are arguing, however, is that the
usage pattern nbdkit employs to access the last line is not portable.
That pattern worked in glibc before 2008, when POSIX tried to
standardize the glibc behavior.  By your reading, nbdkit has been
buggy ever since it adopted the pattern, merely because POSIX
specified something different than glibc actually implemented.  And
that's why we should probably get a final consensus from the Austin
Group on whether the bug is in glibc or in POSIX.
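
For reference, here is a minimal sketch of how a caller could grab the
last line without depending on what getline() leaves in the buffer
when it returns -1.  The helper name last_line() is hypothetical and
this is not the actual nbdkit code; it just alternates between two
buffers so that the failing call never touches the most recent line.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: return a malloc'd copy of the last line of FP
   (including any trailing newline), or NULL if nothing was read.
   The caller frees the result.  */
static char *
last_line (FILE *fp)
{
  char *line = NULL, *last = NULL;
  size_t cap = 0, last_cap = 0;

  assert (fp != NULL);
  while (getline (&line, &cap, fp) != -1)
    {
      /* Swap the two buffers: 'last' now owns the line just read, and
         the next getline() call reuses the older buffer (or NULL).  */
      char *tmp = last;
      size_t tmp_cap = last_cap;
      last = line;
      last_cap = cap;
      line = tmp;
      cap = tmp_cap;
    }

  /* 'line' is whatever the failing call was handed; its contents are
     the part under dispute, so discard it unread.  */
  free (line);
  return last;
}
```

This returns the same result under either reading of the spec.  A
caller that needs to tell EOF from a read error can still check
ferror() on the stream afterwards.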


> 
> > Is there any way to refine this patch in gnulib and glibc so as to not
> > break the behavior on non-empty files?  After all, if the file is
> > non-empty, then by the time we encounter EOF, we are guaranteed that
> > the buffer IS NUL-terminated from the previous loop.  It is only when
> > the file is empty that there was no previous line, and therefore no
> > guarantee of a NUL terminator in the buffer.  Would it work to change
> > the behavior to add a NUL terminator at the time the buffer is first
> > allocated before reading from the file, and then have EOF leave the
> > buffer unchanged, instead of truncating the buffer with \0 at buf[0]
> > on EOF?
> 
> Such a change would be a bad hack, because
>   - While it is common to call getline() in a loop, it is not a requirement.
>     Programs can very well mix getline() calls with other input operations
>     on the same stream.
>   - The specification of getdelim() and getline() is independent of the
>     file position. Whether the file position has been advanced by input
>     calls outside of the current process or by input operations in the
>     current process or not advanced at all, MUST NOT matter for getdelim()
>     and getline().
>     In other words, the change that you are proposing would make the
>     implementation *disagree* with the POSIX spec.
> 
> In Gnulib, our policy regarding backward-incompatible changes is to
> announce them in the NEWS file (something that Collin is welcome to do),
> and then let the callers update their code.
> 
> In glibc, the policy regarding backward-incompatible changes is to use
> symbol versioning (if the change is relevant enough). That is, provide
> two versions of getdelim and two versions of getline, in such a way that
>   - programs compiled against an older glibc get the old behaviour
>     even if running with a newer glibc,
>   - programs compiled against the newer glibc get the new behaviour.
> 
> I propose to follow these two policies here, rather than to introduce a
> hack that makes the implementation disagree with POSIX.

The fact that glibc has an observable behavioral difference in current
Fedora Rawhide means that glibc either needs a versioned symbol, or
needs to change behavior to avoid a regression - but which behavior to
take depends on whether the POSIX folks agree on which behavior is
intended by the standard.
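
To make the observable difference concrete, here is a rough probe,
assuming the only change is whether buf[0] gets overwritten with NUL
when getline() hits EOF without reading anything.  The function name
and the sentinel string are made up for illustration; the probe
reports which behavior the running libc has and takes no position on
which one is correct.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical probe.  Returns 1 if getline() at immediate EOF stored
   a NUL at buf[0], 0 if buf[0] is still non-NUL afterwards, and -1 if
   the probe could not be set up.  */
static int
getline_truncates_at_eof (void)
{
  static const char sentinel[] = "sentinel";
  size_t cap = 128;
  char *buf = malloc (cap);
  FILE *fp = tmpfile ();        /* empty stream, so the first read hits EOF */
  int result = -1;

  assert (cap >= sizeof sentinel);
  if (buf != NULL && fp != NULL)
    {
      /* Pre-fill the buffer so an untouched buffer is recognizable.  */
      memcpy (buf, sentinel, sizeof sentinel);
      if (getline (&buf, &cap, fp) == -1)
        result = (buf[0] == '\0');
    }

  if (fp != NULL)
    fclose (fp);
  free (buf);
  return result;
}
```

A result of 1 would correspond to the post-patch behavior discussed in
this thread, and 0 to the older behavior.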

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org

