On 7/14/25 10:01, JeanHeyd Meneide wrote:
On Mon, Jul 14, 2025 Alejandro Colomar <un...@alejandro-colomar.es> wrote:

Hi Chris, Jakub,

I was talking with Elliott Hughes (Bionic maintainer) and Rob Landley
(toybox), and Elliott reminded me of something:

Why did the committee standardize typeof() at all without standardizing
({})?  They almost always come together.
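For readers who have not seen the two used together, here is a minimal sketch of why they tend to travel as a pair. It assumes a GNU C compiler (gcc or clang); the macro name MAX_ONCE is made up for illustration, and __typeof__ is used so it also builds in pre-C23 modes.

```c
/* ({ ... }) is the GNU statement expression extension: the block's last
   expression becomes its value. typeof lets the temporaries take the
   argument types, so each argument is evaluated exactly once. */
#define MAX_ONCE(a, b) ({     \
    __typeof__(a) a_ = (a);   \
    __typeof__(b) b_ = (b);   \
    a_ > b_ ? a_ : b_;        \
})
```

Without the statement expression there is nowhere to declare a_ and b_ inside an expression, which is why typeof alone only covers half of this idiom.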

      Because the Committee moves slow and people don't advocate for
things that are important, so things have to come one feature at a
time when someone finally picks it up. For the record, one of the
earliest mentions of typeof as a potential candidate for
standardization was in the earliest-available C Rationale document; it
says it needed more bake time despite people's excitement.

      Then nothing happened for 30 years.

      Also, Statement Expressions were met with great enthusiasm in
2007 to be standardized after a paper by Nick Stoughton surveyed all
of the existing extensions at the time.

      Then nothing happened for 19 years.

Toybox's move from claiming c99 to claiming c11 (https://github.com/landley/toybox/commit/3625a260065b and thus https://github.com/landley/toybox/commit/0c566f6f9a05) was in 2022. At the time, I thought __has_include() was moving from "compiler extension" to "standardized", but Elliott corrected me, which was when I asked about ? : ala:

http://lists.landley.net/pipermail/toybox-landley.net/2025-July/030769.html

And here we are.

My last engagement with the C committee was probably back around 2023 (I'm guessing circa https://landley.net/notes-2023.html#05-02-2023) when I wanted to implement "read" in toybox's shell but line reads using readline() and friends used a FILE * that did automatic readahead into the FILE * buffer, and there was no way to ask that FILE * how much readahead data was in said buffer so I could fread() it back _out_ and pass it on to the child process, which has to operate on the underlying filehandle because that's how processes work:

Ala:
$ echo $'one\ntwo\nthree\nfour\nfive' | \
  { read i; echo GOT=$i; head -n 2; }
GOT=one
two
three
$

That's the desired behavior, but if I don't rescue data out of the FILE buffer the child process may not see any input because the fread() ate it all as readahead and never gave it back.

For a seekable file I can fseeko(ftello(fp)) but the above is a pipe, which I can't rewind, so I need to pass on the data that was read. I couldn't find a portable way to ask a FILE * "how much can I fread() without pulling more data from the filehandle". And I mean it's frustrating, they BRACKET THAT with a bunch of accessor functions, there's __fpending() for output bytes:

https://linux.die.net/man/3/__fpending

But nothing for INPUT BYTES read ahead in an INPUT STREAM. (And Elliott got sad at my read_line() function that read input a byte at a time to avoid ever overshooting, because it was slow. Hence trying to make larger read sizes work.)
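The slow-but-safe approach mentioned above can be sketched roughly like this. It is not toybox's actual read_line(); the name read_line_bytewise and its return convention are assumptions for illustration, and EINTR handling is left out.

```c
#include <stddef.h>
#include <unistd.h>

/* Read one line from fd into buf (at most len-1 bytes plus a NUL), issuing
   one read() per byte so nothing past the newline is consumed from the
   descriptor. Returns bytes stored, 0 at EOF, or -1 on error. */
ssize_t read_line_bytewise(int fd, char *buf, size_t len)
{
  size_t n = 0;

  if (!len) return -1;
  while (n + 1 < len) {
    char c;
    ssize_t got = read(fd, &c, 1);

    if (got < 0) return -1;
    if (!got) break;
    buf[n++] = c;
    if (c == '\n') break;
  }
  buf[n] = 0;

  return (ssize_t)n;
}
```

Because it never reads ahead, whatever follows the first newline is still in the pipe for a child process to pick up, at the cost of one system call per byte.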

When I asked the posix committee they said that FILE * was opaque to them and to go ask the C standards committee, and the C standards person who replied to my query through the web form thingy said that filehandles weren't in ISO C at all. (At least posix had fdopen() and fileno() to translate between the two. The posix side was willing to reach out, but not to standardize or provide an accessor function to the contents of the other committee's struct. And the C committee wouldn't do it because doing I/O through anything OTHER than a FILE * was "nonstandard". Apparently child processes inheriting stdin/stdout/stderr from a parent are not their problem.)

So even though FILE * always had a variable storing the amount of remaining input data (it HAS to) the member had a different name on glibc and on musl and on bsd/mac, with no standard accessor functions, and neither side wanted to standardize this because each felt it was the other's responsibility.
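To make the "different name on every libc" point concrete, here is a rough sketch of what a per-libc accessor might look like. The helper name stream_readahead is hypothetical. The glibc branch pokes at struct members that are not a public API, the macOS branch assumes the BSD-style _r member, and the fallback assumes a <stdio_ext.h> that declares __freadahead() the way musl's does. It also ignores any ungetc() pushback.

```c
#include <stdio.h>

#if !defined(__GLIBC__) && !defined(__APPLE__)
#include <stdio_ext.h>  /* assumed to declare __freadahead(), as on musl */
#endif

/* Hypothetical helper: how many bytes can be read from fp without touching
   the underlying file descriptor. */
size_t stream_readahead(FILE *fp)
{
#if defined(__GLIBC__)
  /* glibc: distance between the current and end read pointers */
  return (size_t)(fp->_IO_read_end - fp->_IO_read_ptr);
#elif defined(__APPLE__)
  /* BSD-derived stdio: _r counts bytes left in the read buffer */
  return fp->_r > 0 ? (size_t)fp->_r : 0;
#else
  return __freadahead(fp);
#endif
}
```

With something like this, the leftover bytes could be drained with fread() and handed to the child, but every branch depends on implementation details that no standard guarantees.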

(Note that if you set O_DIRECT on the pipe on Linux, it delivers data in the same granularity it was produced (I.E. it doesn't merge buffers, each read() stops short at the boundaries of the corresponding write()) which MOSTLY fixes this for real world inputs, although the above test is still borked because echo is producing a single atomic write that read(bufsiz) presumably eats before scanning for \n so it would still consume future lines, but that's as close as I could get and called it good enough for now.)

      So, the short and long of it is that if someone doesn't do it, it
doesn't get done.

I moved toybox from saying C99 to saying C11 (in 2022) to work around a compiler bug in clang. At the time, I thought the __has_include() I was already using (on both gcc and clang, between that and the macros in the big ":|cc -dM -E -" dump I eliminated almost all compile-time probes) came with the new standard. I was also already making regular-ish use of things like:

  else TT.hdr.mode |= (char []){8,8,10,2,6,4,1,8}[tar.type-'0']<<12;

Which vim's syntax highlight only stopped turning red for last year. (Well, debian version upgrade.) Everything except the other way of declaring inline worked fine in -gnu99 or whatever it was I'd been building under before on gcc and clang; the move to claiming C11 was mostly "eh, I'm apparently already using it" and it was a dozen years old at that point. (And I'd finally given up on backwards compatibility regression testing against Ubuntu 2008 because it was missing kernel features I was using. Although the "centos forever" guys actually using the 10 year support horizon made sad faces at me shortly afterwards because I broke them.)
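For anyone unfamiliar with the construct in that tar line, it is a compound literal (C99) used as an anonymous lookup table. A standalone sketch follows, using the same table values; the function name and the range check are additions for illustration and are not in toybox.

```c
/* Map a tar header type character ('0'..'7') to the file-type bits of a
   Unix mode by indexing an unnamed array and shifting into place.
   Returns 0 for characters outside the table. */
unsigned tar_type_to_mode(char type)
{
  if (type < '0' || type > '7') return 0;

  return (unsigned)(char []){8,8,10,2,6,4,1,8}[type-'0']<<12;
}
```

The subscript binds to the compound literal before the shift happens, so entry 4 (type '5', a directory) becomes 040000 and entry 10 (type '2', a symlink) becomes 0120000.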

Mostly I just test which compilers have what and make do with the reality at hand. It's a lot easier when you write off Windows as "never to be supported", then you can rely on things like LP64. :)
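As a rough illustration of that kind of probe-free feature testing, a header check with __has_include() might look like the sketch below. The macro name HAVE_STDIO_EXT_H and the choice of header are just examples. __has_include() was a compiler extension in gcc and clang for C until C23 put it in the standard, so the outer #ifdef guards against compilers that lack it.

```c
/* Decide at preprocessing time whether <stdio_ext.h> exists, instead of
   running a configure-style compile probe. */
#ifdef __has_include
# if __has_include(<stdio_ext.h>)
#  include <stdio_ext.h>
#  define HAVE_STDIO_EXT_H 1
# endif
#endif

#ifndef HAVE_STDIO_EXT_H
# define HAVE_STDIO_EXT_H 0
#endif
```

The ":|cc -dM -E -" dump mentioned earlier serves the same purpose for predefined macros: it lists what the compiler already defines, so code can test those with plain #ifdef.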

If there was a standard that let me remove giant horrible things like:

https://github.com/landley/toybox/blob/master/lib/portability.h
https://github.com/landley/toybox/blob/master/lib/portability.c

I'd be more interested, but this seems unlikely.

(Honestly I'm probably mostly just writing C89 with compiler extensions. With LP64 I don't really need uint32_t and friends... I guess %p wasn't in c89? I claimed C99 because I had online copies of the C99 spec and _didn't_ have online copies of C89...)

Sincerely,
JeanHeyd

Rob
_______________________________________________
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net
