On Sun, Jul 10, 2016 at 10:28 AM, Rob Landley <[email protected]> wrote:
> Awk's better than bc.

That's interesting... I had no idea bc was a language with functions
and loops!

https://github.com/jck/822kernel/blob/master/kernel/time/timeconst.bc

This is the problem with DSLs... shell, make, awk, and presumably bc all
started out as very specific languages, for different purposes. Over
time, they all grew a C-like imperative language. And nobody wants to
remember 3 or 4 different syntaxes for that:

    # shell
    f() { echo hi $1; }
    f "bob"

    # awk
    function f(name) { print "hi", name }
    f("bob")

    # make
    define f
    hi $(1)
    endef
    $(call f,"bob")

(And repeat this mess for every other construct in a language...)

It does seem that if you rule out Python/Perl, awk is the winner with
respect to code generation, based on the fact that a lot of C code uses
it (many shells, Android, FreeBSD, etc.). And I agree with the idea of
minimizing build dependencies.

However, I did a bunch of research and hacking on Kernighan's Awk. I
was trying to morph it into a "proper" modern language. For example,
you could imagine writing "ls" or "xargs" or even a shell in Awk, sort
of like the idea to write tools in Lua.

But then I ran into some big limitations: you can't return associative
arrays from a function, and you can't pass functions to, or return them
from, other functions. Awk looks very similar to JavaScript -- C syntax
with associative arrays -- but is semantically much more limited.

I lost interest in awk because of these limitations. awk is used, but
seems to be waning, and it's not really evolving. (But I haven't lost
interest in the shell.)

I did however automate and slightly rewrite Kernighan's EXTENSIVE test
suite here, which AFAICT is not in the other Git reconstructions:

https://www.cs.princeton.edu/~bwk/btl.mirror/
https://github.com/onetrueawk/awk

I think you mentioned you were looking for an awk test suite. Well,
there it is -- there are hundreds or thousands of test cases, including
for the regex language.

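To make the "can't return associative arrays" limitation above a bit more concrete, here is a minimal sketch of the usual workaround, assuming a POSIX-style awk on the PATH. Arrays are passed to functions by reference, so the caller hands in an array and the callee fills it; only scalars come back through "return". The names count_words, tally, counts, and seen are made up for illustration and are not from Kernighan's code or test suite.

```shell
# Sketch only: count_words and tally are hypothetical names.
count_words() {
  awk '
    # "counts" acts as an out-parameter (arrays are passed by reference).
    # The extra parameters i, n, w are locals, by the usual awk convention.
    function tally(line, counts,    i, n, w) {
      n = split(line, w, " ")          # split on runs of whitespace
      for (i = 1; i <= n; i++)
        counts[w[i]]++
      return n                         # only a scalar can be returned
    }
    { total += tally($0, seen) }       # "seen" is filled in by tally()
    END { printf "%d words, bob=%d\n", total, seen["bob"] }
  '
}

printf 'hi bob\nbye bob\n' | count_words
```

This works for filling an array, but there is no comparable trick for passing or returning functions, which is the other half of the limitation.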
I actually ran the test suite under LLVM sanitizers (ASAN/MSAN/etc.),
just as I did for toybox, and it revealed the expected C coding bugs, in
this code being maintained by one person for 30 years... (BTW you never
responded to my last message about that.)

I will publish the combined repo at some point, and if anyone has a
burning need I can accelerate that. I should make a blog post at some
point, demonstrating the sanitizers on old C code... though
unfortunately writing about code takes just as much time as coding
itself. And I didn't actually fix any of the memory problems that I
found, as I did for toybox, since I don't have any plans for that code
in the future.

The bottom line is that LLVM sanitizers are mandatory if you care about
bugs... nobody is careful enough (even Kernighan, with his astoundingly
thorough tests, much more thorough than toybox!). toybox wget, tar, and
crypto/compression libraries especially need this, because they process
untrusted input.

The other point about Kernighan's Awk is that if I were building
something like Aboriginal Linux, I would just use that for now, and put
toybox awk at low priority. Elliot showed me that the Android NDK
actually uses a copy of Kernighan's Awk and not the system awk for its
builds.

I get why you don't like GNU stuff. But Kernighan's Awk is like 7 files
of pure ANSI C, POSIX yacc, POSIX makefile, etc. that builds anywhere.
Kernighan also expanded the yacc to C, so you don't need yacc as a build
dependency... that is a little "unprincipled" but I think fine given
that awk is changing so slowly and will likely not need any maintenance.

Andy
_______________________________________________
Toybox mailing list
[email protected]
http://lists.landley.net/listinfo.cgi/toybox-landley.net
