On 8/27/22 20:11, David Seikel wrote:
> On 2022-08-27 04:06:58, Rob Landley wrote:
>> Honestly: people have already done this. A lot. What new niche is not being
>> served here? Just "not invented here"...?
>>
>> Ahem. Backing slowly away, as previously mentioned...
>
> Have you stumbled across https://github.com/jart/cosmopolitan yet?

Just enough to sigh deeply, note that you can sing "Dunning Kruger" to Frere Jacques, and move on.

I wasn't trying to make a complete list of C library attempts, I was complaining about making more of them without a clear reason for the new one other than "not invented here".

This really seems to me like a "coke, pepsi, everybody else" situation. People didn't move from cvs/subversion/etc to git/mercurial/etc until proponents of the replacement could say WHY the new one mattered (and it took Linus Torvalds himself imposing it by fiat upon the largest open source project to achieve critical mass, which still didn't make anyone care about "sparse"). A whole lot of people tried to unseat "pc, mac, everybody else" until phones came along and did the actual sea change via a textbook disruptive technology transition.

The only reason Linux libc didn't collapse FURTHER into "ascii, ebcdic, everybody else" or "qwerty, dvorak, everybody else" (the way the source control world is kinda trending with git outpacing mercurial; there's a nominal alternative but you're unlikely to see it in the wild) is because glibc sucks so badly (but has historical dominance), and musl is younger than bionic (first android release was something like 2007, oldest commit in musl repository was 2011) which wasn't _trying_ to be a general purpose libc but was at the heart of one billion annual unit volume so couldn't be ignored either. That's why I care about 3 here instead of 2.

(There's a reason Alpine Linux couldn't use bionic: only two hardware architectures still supported, a static hello world is something like 500k, we JUST got the "hello world segfaults in a chroot before calling main" problem fixed, still warns to stdout about usernames that don't start with the right mandatory prefix...)

The reason we're saddled with glibc is that libc5 (then the one-and-only Linux C library) was replaced by Ulrich Drepper's threaded glibc fork when Java DEMANDED threading (no bindings for poll/select, the ONLY nonblocking I/O option was to let a separate thread block), and the famous "212% growth" of 1998 (TRIPLING the Linux userbase in one year) was all the Java developers switching over to Linux when Netscape released its source code and declared Linux the third "tier 1" platform after Windows and Mac.

So they went from technically reasonable libc5 (small and young but not STUPID) to gnu/gnu/gnu/all-hail-stallman crap that the FSF literally tried to hijack away from the developer who'd forked it and added all that thread stuff for Linux:

https://web.archive.org/web/20010903135128/http://sources.redhat.com/ml/libc-announce/2001/msg00000.html#:~:text=Stallman

But glibc sucked SO BADLY that people were _always_ looking to replace it (hence uClibc and dietlibc making a go of it back around Y2K).

> "Cosmopolitan Libc makes C a build-once run-anywhere language,

I wonder if the people behind this have even heard of https://en.wikipedia.org/wiki/Executable_and_Linkable_Format#86open and https://en.wikipedia.org/wiki/Intel_Binary_Compatibility_Standard and so on...

ELF didn't replace a.out because they were reinventing the wheel, it replaced it because they understood what "being better" meant and could communicate it.

This is not an "if you build it they will come" situation. You're building it on the ruins of multiple previous attempts, some quite large and with the efforts persisting a long time.

> like Java,

Case in point. I was a full-time Java developer in 1998 and 1999, and taught "intro to Java" at Austin Community College a couple evening semesters. (AWT 1.1 was actually quite nice: swing with that model-view-controller nonsense was not.) I was there when everybody actually BELIEVED "write once, debug everywhere" was a thing.
Corel (the wordperfect people) rewrote their entire office suite in Java (really: http://www.edm2.com/index.php/Corel_Office_for_Java) and wanted to make a Java handheld (https://www.cnet.com/tech/tech-industry/corel-pda-to-run-on-java/).

"Like Java" is not an endorsement if Java couldn't make this work with multiple billions of dollars of funding and a decade of effort from thousands of engineers at hundreds of companies large and small. (IBM had java as its religion for years. I worked for two different Java startups. My bug report is the reason Sun added a "truncate" binding to Java 1.2.)

These days there are dozens of different bytecode languages out there. And I don't mean little ones: when you import a .py file and python writes a pyc file, that's the cached bytecode so it doesn't re-translate the library each time:

https://docs.fileformat.com/executable/pyc/

Java didn't go away (for approximately the same reason Cobol didn't go away), and Android's full of Java... but the "write once run anywhere" goal went away. A java app for android is _for_android_.

Also, I suspect the bytecode part's effectively gone away, because the "optimizing" step sounds a lot like IBM's old "Ahead of time" compiler:

https://www.ibm.com/docs/en/sdk-java-technology/7.1?topic=components-aot-compiler

So a native cached machine code version conceptually similar to the pyc above, but instead of saving the source to bytecode conversion it saves the bytecode to native code conversion. Wheee.

(I think AOT was part of Jikes back in 1998? Dave Shields was a coworker at IBM when I worked on OS/2 right after college. He then retired and worked on spitbol.)

> except it doesn't need an interpreter or virtual machine. Instead, it
> reconfigures stock GCC and Clang to output a POSIX-approved polyglot
> format that runs natively on Linux + Mac + Windows + FreeBSD + OpenBSD +
> NetBSD + BIOS with the best possible performance and the tiniest
> footprint imaginable."

"I dunno, I can imagine quite a bit." - Han "shot first" Solo

Assuming their "polyglot format" is bytecode rather than "a copy of every known target's machine code in parallel like C++ template instantiation but for processors" (I.E. more or less the Apple Universal Binary approach: https://en.wikipedia.org/wiki/Universal_binary which, OS aside, would mean whack-a-mole for avr32 and loongson and so on), and that they've never heard of nommu and consider it out of scope...

That leaves 32 bit vs 64 bit (even when you fake 64 bit on 32 bit you still have things like limited mmap() size), big endian vs little endian, multiple different ways of handling unaligned access (I've used platforms that don't fault but instead mask out the bottom two bits so it always _makes_ an aligned access no matter what the pointer says)...

Assuming they're restricting themselves to just x86-64 and arm64... How much testing is their netbsd target really likely to get?

I'm constantly dealing with version skew just with toybox, which intentionally has no external library dependencies (well, -ish), doesn't do gui/graphics/sound stuff, punts on most internationalization issues, is going over VERY well-trodden functional ground and can blame posix for most things. Just last week a newer version of glibc caused a long thread where they made a bad architectural call and two different patches to toybox were made to work around it (but luckily the glibc devs backed off and fixed it, which involved changing their minds TWICE, about the design decision AND about backporting bugfixes to existing releases, and I do give them credit for that).

> Does all manner of wierd things,

My context for "weird" includes the "odin" binary translator that converted Windows exe files into OS/2 exe files. On the filesystem, before you ran them. I did not make that up:

https://www.os2site.com/sw/emulators/odin/old/index.html

> including turning zip files into cross
> platform executables.

Last I checked a java jar file was literally just a zip file with a couple of required entries. I used to create them with info-zip instead of the jar tool.

I've mentioned a post-1.0 todo item of having the toybox executable export its gzip plumbing as a zlib-compatible shared library so you could symlink "libz.so" to toybox and it would work. This is an old idea from the busybox days, and we got a proof of concept to work (I vaguely recall you need to build the binary as -fPIE and then have a truly ugly linker script that gives it both sets of ELF tables). But it's way way way way way down on the todo list.

Having a zip file do that means you're gluing a zip file on to the end of a self-extracting archive header, which is an ancient trick (https://en.wikipedia.org/wiki/ZIP_(file_format)#:~:text=BOF%20or%20EOF) that even DOS did, and Phil Katz allowed zip to support it forever. The real zip metadata is at the _end_ of the file, which both makes it easy to append to a zip without rewriting it all, and makes even slightly truncated zip files useless.

However it goes, this binary is going to need a loader. binfmt_elf needs an ELF header. Windows needs a PE header. Mac needs a mach-o header. They are, as far as I know, mutually incompatible. Linux also has BINFMT_SCRIPT so #!/path/to/interpreter can load an arbitrary interpreter binary, and binfmt_misc lets you load arbitrary signature/command line combinations into the kernel at runtime, but this says "no interpreter" so needing /usr/bin/perl on the system to run your blob is apparently not what it's doing...?

> Their flagship application is redbean, a stand
> alone web server in an executable zip file.

The simplest demo for a headless box. I was impressed when somebody fit one into 512 words of microcontroller memory (including the tcp/ip stack!)
back in 1999:

https://web.archive.org/web/19991012022727/https://www-ccs.cs.umass.edu/shri/iPic.html

There's a web server built into python:

https://pythonbasics.org/webserver/
https://docs.python.org/3/library/http.server.html

Speaking of which, I should finish (add cgi, etc) the httpd I added to toybox, but haven't devoted a second weekend to it yet...

> You add your web site to the
> contents of the zip file, and it can run Lua based web pages.

elua's bytecode interpreter is indeed cool, tiny, simple, and dependency-free. That's why it gets used in so many game engines (neverwinter nights and world of warcraft and so on) and used in so much tiny battery powered hardware:

https://promwad.com/publications/article-electronicdesign-embedded-lua-microcontrollers
https://www.electronicdesign.com/technologies/dev-tools/article/21798490/running-embedded-lua-on-microcontrollers

Every time somebody sends me a fortran or lisp interpreter for toybox, I tell them that if they were offering a lua bytecode interpreter under 0BSD (sadly, mit still isn't _quite_ public domain equivalent), I might be interested.

Of course, one big DOWNSIDE of lua is it doesn't ship with a nontrivial default framework, so you have to use external packages like https://github.com/luaposix/luaposix to do anything USEFUL with it, which I'm guessing this thing doesn't use?

Sigh. It's hard to even DISCUSS something like that without context and vocabulary that don't seem to be common knowledge. There was going to be a whole section on "frameworks" in The Art of Unix Usability (the sequel book Eric and I were writing after The Art of Unix Programming, but alas we stopped being able to work together when Eric became a climate change denialist and started trying to explain to me why he thought the book "the bell curve" was right).

Java had two obvious frameworks: applet and application context.
An applet knew about the web browser it was running in (and the page it was loaded from, and the window it had to draw in or get mouse/key events from). An application didn't have any of that, but instead had basically libc bindings plus the AWT that could make one or more graphics windows. When running as an applet, you couldn't access the filesystem. When running as an application, you couldn't load more pages from under the URL you were loaded from.

Note that "node.js" is a similar repotting of the javascript language from a webpage framework (the document object model) to a libc-style framework allowing it to run as an application. Providing different bindings you can reach out and call: getcwd() or readdir() don't make sense in a browser because you're not "in a directory" (no local filesystem access), but an application _is_...

Frameworks nest: two of the historically BIGGEST frameworks in Linux are gnome and kde, which are built on top of gtk and qt (which are built on top of x11 or framebuffer or a couple different 3d plumbing variants or...). The thing that makes it a framework is you have to pick one or the other, and can't (sanely) use both.

Android and Linux use the same kernel, but programs written for them are operating within a different framework. Different libc is just a small part of this.

Linux libc is the standard framework programs bind to to talk to the OS, although you can write C code that does NOT bind to this framework (for example https://lists.j-core.org/pipermail/j-core/2021-January/000950.html or what https://www.muppetlabs.com/~breadbox/software/tiny/teensy.html was doing. And then https://www.muppetlabs.com/~breadbox/software/tiny/revisit.html and https://www.muppetlabs.com/~breadbox/software/tiny/somewhat.html examined going back IN to the standard libc framework. Linux's nolibc.h is an attempt to sort of provide an 80/20 minimal subset of the standard libc framework.)

There are various standards that document the libc framework (c99 and posix and man7.org and the LSB 4.1 and so on) in slightly incompatible ways, and multiple different from-scratch implementations of the libc framework. And there are various standard supplements to it like zlib and openssh that extend the libc framework into a larger framework, without conflicting with it (you don't stop using libc in order to use zlib, but usually need libc to do anything with zlib because it assumes malloc() and open() and such).

I don't think the people writing that necessarily know what a framework is, and I'm not interested in digging into it far enough to try to figure out where they think the edges of their project are.

Pointing me at a project that goes "let's abstract away the entire operating system until there isn't one anymore" is basically making the same "I'm going to abstract away the hardware the kernel is running on until there isn't any anymore" mistake the microkernel people made. Build up so many floors in the skyscraper that the foundations of sand cease to matter.

Why are they expected to do a better job than java did again? (I'm not saying a better job can't be done, but "expedition to find out why the previous expedition didn't return" does not inspire confidence.)

With many years of diligent effort from a very large team, the thing you pointed at might manage to only suck as badly as cygwin. If they're very lucky. I could be wrong and thus pleasantly surprised, but... you know, poke me if so?

Rob
_______________________________________________
Toybox mailing list
[email protected]
http://lists.landley.net/listinfo.cgi/toybox-landley.net
