On 10/8/22 22:08, Ray Gardner wrote:
> Rob, let me know if it's inappropriate to discuss your programming-related
> blog posts in the toybox list.
Eh, it's not so much "inappropriate" as I don't... have the energy? See the point? I was expressing annoyance, not a call to action. I don't have influence over the standards committee or the compiler developers. Long ago I used to maintain a tinycc fork and was planning out qcc, but I haven't had "skin in the game" in that area for years.

I USED to be up to date on all this stuff. Way back in the dark ages I read Herbert Schildt's annotated C90 spec cover to cover and had it more or less memorized. (And yes, I'm aware the cost difference between the official ISO copy of the spec and Schildt's annotated version was said to reflect the value of Schildt's annotations.) Back in college I was trying to track down the "simple C compiler" from Dr. Dobb's Journal as a basis to write my own (for that platform-independent binary executable code project thing; it meant I'd have two kinds of function pointers, but DOS had "near" and "far" pointers already so... eh. Gave me a weird perspective when I was introduced to java a few years later.)

But that was a very long time ago, and my imperfect memory of that is something like 3 specs behind the times (2 of which I already care about), and predates the move to 64 bit (and thus the LP64 standard... heck, there were still plenty of 16 bit systems back then) and the rewrite of C compilers in C++, allowing the C++ developers to actively try to expand "undefined" behavior in C.

I'm unlikely to engage with gcc: any expectations I had for sane behavior from them went out the window somewhere between https://lwn.net/Articles/259157/#:~:text=Stallman and https://lwn.net/Articles/390016/ .
The PCC project seems to have rolled to a stop again because Apple poured money into LLVM, followed by Google, and it became the Designated Alternative; I note I pushed Qualcomm in that direction when I did a 6 month Hexagon bringup contract for them a decade ago, and 6 months later my old boss gave this talk: https://www.youtube.com/watch?v=nfyuFPc5Iow . I've never gotten any traction with the LLVM developers, no email I sent them has ever been replied to. (But then I've only ever sent like 3? And am not subscribed to their list. I'm still subscribed to the pcc and tinycc lists, but not that one.)

I only finally acknowledged that C99 wasn't enough because the (typecast){constant} thingy was useful, and because we needed to work around that noreturn bug. I mostly treat C changing the same way I treat driving rules changing. I don't want to have to _study_ for my license renewal, I'm just trying to get from point A to point B. The actual driving part is a cost center.

Undefined behavior is a cop-out, and there's always a reason for it if you dig deep enough. The mainframes and minicomputers it was glossing over differences in are all long gone; these days it's because optimizer writers introduce bugs and call it a feature. You can argue minutiae but I reject the category conceptually: "then define it already". Be consistent and have regression tests.

These days mostly I just test what the compilers I use accept (currently that means gcc and llvm across 16 architectures and 3 C libraries), and react when something breaks. Which is how scripting languages work anyway. If this doesn't provide enough coverage, it means I don't have enough testing. I'm slowly incorporating ASAN into my workflow since at least so far it's NOT a significant false positive generator, which is refreshing.
I'm neither enough of a language expert to dictate this stuff at anyone (experienced sure, up to date not really, authoritative definitely not), nor do I have the energy to traverse the political quagmire. If I cleared my plate enough to make a serious go at qcc, then I'd have to start caring again. But if nobody else seems to understand _why_ I considered that important, let alone grabbed a shovel, presumably it won't be missed.

> I asked the folks at StackOverflow about this, and I think the consensus
> (and my own understanding) is that the warning is bogus,

Bogus warnings are normal. GCC _still_ has the "may be used uninitialized but provably deterministically isn't" warning, and LLVM still needs -Wno-string-plus-int which gcc doesn't recognize. (Which is why I moved it to scripts/portability.sh .) Yes "string"+4 is a valid thing to do in C, yes LLVM warns about it anyway, because it thinks its users don't know what they're doing. That is why that warning exists.

Similarly, if (a=b) isn't wrong, but somebody decided "add extra parentheses to show you MEANT to use an assignment instead of a comparison", and me going "runtime testing finds the bugs, doo dah, doo dah..." doesn't change what the compiler authors decided to do. (Seriously, bash doesn't warn about "if ((x=4)); then echo hello; fi" being an assignment instead of a comparison, because it doesn't NEED to. Note that the double parentheses there aren't "warning suppression", it's how the shell does math. It's basically $(()) except the result becomes the return code instead of resolving to a decimal string.)

(A --lint mode that produced the "this could be a false positive" warnings wouldn't be so bad... but then you get people forcing that on in their builds and requiring it by policy as part of "fortify" and so on...)

> But I think some of your other assertions are wrong.

Entirely possible. I'm neither positioning myself as an expert nor trying to keep all that current.
> That's also undefined in C, as far back as 1974 (Ritchie).

And yet it doesn't produce a _warning_ for that. Presumably because it would produce all the false positives in the world, unless they special cased printf or something. And it works. It's possible there's a compiler out there it doesn't work on, but I haven't encountered one yet, going back through slowaris and aix to at least OS/2.

The bigger issue in _this_ area is usually that printing to stderr and stdout gets reordered relative to each other without extra fflush() calls. Which again: no warning about that, you find it in testing.

I don't want the compiler to babysit me. I need to test stuff. Upgrading compilers introduces regressions all the time, just like upgrading libc, upgrading kernels... Debian's pretty good about apt-get update not breaking much, but Red Hat was a _minefield_ and I lost a gentoo system that stopped being able to build ANY packages after an upgrade (thus being unable to install anything new after that due to its build-from-source design). And distro major version upgrades derail build environments all the time, that's why AOSP specified specific Ubuntu and Debian versions to build on all those years.

By the way, the glibc people who are theoretically most scrupulous about this crap? https://landley.net/notes-2022.html#28-08-2022 They lock their projects to only build with specific versions of THEIR OWN COMPILER. Which they produce. It is very difficult to take ANY of this seriously when the people who push it out onto the world are doing that at home.

> I've read somewhere (can't find it now) that some standard committee members
> aren't happy with the way the anti-aliasing rules (C standard 6.5 par. 7) are
> stated.

The language's creator objected to stuff the standards committee kept adding: https://www.lysator.liu.se/c/dmr-on-noalias.html

Alas, he died in 2011. And mostly stayed out of committee politics long before then.
> But the order of evaluation issues were pretty well settled in C89 with the
> "sequence point" concept that mostly just clarified what Ritchie had stated
> for 15 years already.

"Isn't currently standardized" doesn't mean "will never be". ANSI C was very clear that char was NOT guaranteed to be an 8 bit byte. (Technically unix started on a pdp-7 with an 18 bit word size that was 3 6-bit bytes.) Eventually the systems that applied to all died. LP64 not being part of the C spec today is because Microsoft lobbied against it, everything else I'm aware of is LP64*. We no longer have 6 bit bytes, binary coded decimal, drum memory...

C is a tool at the level of a hammer and chisel. Defined behavior is good. Adding bells and whistles, not so much. But again: opinionated != gearing up to leverage political change in this area. I'm mostly happy when they don't make it _worse_. As David Graeber said in his famous 2013 essay, I have other fish to fry...

Rob

* Despite the name, it essentially incorporates ILP32 by reference: long and pointer are the same size so pointer fits in a long, the other 4 base integer types have explicitly defined sizes. That's it. Supporting 16 bit processors under that regime doesn't come up much anymore: atmel avr was 8 bit and so's 6502, beyond that I've mostly seen a jump up to 32 bit this century, ala avr32 or cortex-m. But technically a 16 bit processor could have 32 bit "int" the same way a 32 bit processor can have 64 bit "long long", it's an array of the type you can handle and libgcc.a calls a function to do math on it. Whether such a theoretical 16 bit system would have long be 2 bytes... again, really hasn't come up that I've noticed. The 8086 was a horrible hybrid with 20 bit pointers needing 2 registers to access memory, that's where near/far came from and it's part of that legacy we've left behind with 6 bit bytes...
_______________________________________________
Toybox mailing list
[email protected]
http://lists.landley.net/listinfo.cgi/toybox-landley.net
