Hi Eric,

Eric Blake <[email protected]> writes:

> In this particular case, there is at least one project that observably
> behaves differently due to the glibc change, and where the workaround
> in that project was to add a strndup() after each getline() call, for
> double the malloc() pressure:
>
> https://gitlab.com/nbdkit/nbdkit/-/commit/01b8e557ce129b
>
> In short, while your patch changes the behavior on an empty file to
> guarantee that the buffer is NUL-terminated even though it is empty
> (on the grounds that the -1 return in THAT scenario is not an error,
> per se, because errno is not set), it ALSO has the side effect of
> changing the buffer on a non-empty file after the last line is already
> in the buffer and EOF then encountered to end the loop. Pre-patch, it
> was possible (although questionably portable, because the POSIX
> wording is unclear) to call getline() in a loop until it returns -1,
> and then have the contents of the final line of the file (assuming the
> file was non-empty) in the buffer with no extra effort. This works
> great for grabbing the summary line of du(1), for example.
> Post-patch, glibc now ALWAYS writes buf[0] to \0 on EOF, even if the
> file was non-empty, which breaks that convenience means of grabbing
> the last line of du.

Oops... Silly mistake on my part. Thanks for pointing it out.

> Is there any way to refine this patch in gnulib and glibc so as to not
> break the behavior on non-empty files? After all, if the file is
> non-empty, then by the time we encounter EOF, we are guaranteed that
> the buffer IS NUL-terminated from the previous loop. It is only when
> the file is empty that there was no previous line, and therefore no
> guarantee of a NUL terminator in the buffer. Would it work to change
> the behavior to add a NUL terminator at the time the buffer is first
> allocated before reading from the file, and then have EOF leave the
> buffer unchanged, instead of truncating the buffer with \0 at buf[0]
> on EOF?

We can call ftello upon reading EOF as the first character in getdelim
and NUL-terminate the buffer only if the offset is zero. I'll have a
look at that later today and add some multiline test cases.

Collin
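
For anyone following along, here is a rough sketch of the kind of
workaround described in the quoted text: copy each line with strndup()
right after getline() returns it, so the last line survives no matter
what getline() does to its buffer at EOF. This is only an illustration
under my own assumptions, not the actual nbdkit code; the helper name
read_last_line is made up for this example.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for getline() and strndup() */
#endif
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper (not from nbdkit, gnulib, or glibc): return a
   malloc'd copy of the last line read from FP, or NULL if the stream
   yields no lines or an allocation fails.  The caller frees the
   result.  */
char *
read_last_line (FILE *fp)
{
  char *buf = NULL;     /* getline() allocates and grows this */
  size_t cap = 0;
  char *last = NULL;    /* private copy of the most recent line */
  ssize_t len;

  while ((len = getline (&buf, &cap, fp)) != -1)
    {
      /* Copy the line now, so we never depend on what getline()
         leaves in BUF once it reports EOF.  */
      char *copy = strndup (buf, (size_t) len);
      free (last);
      last = copy;
      if (last == NULL)
        break;          /* out of memory */
    }

  free (buf);
  return last;
}
```

The cost is the extra allocation per line that Eric mentions, which is
why keeping the buffer intact at EOF for non-empty files is the nicer
outcome.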
