On Thu, Aug 15, 2019 at 3:36 AM Rob Landley <[email protected]> wrote:
>
> On 8/14/19 11:58 PM, enh wrote:
> > On Mon, Aug 12, 2019 at 8:52 PM Rob Landley <[email protected]> wrote:
> >>
> >> On 8/12/19 3:33 PM, enh via Toybox wrote:
> >>> last week i turned on the toybox tests in presubmit for changes to
> >>> toybox in AOSP, and postsubmit otherwise. so we have a fair bit of
> >>> data now (sadly no statistics because non-gtest tests aren't that well
> >>> integrated yet).
> >>
> >> Yay! Thanks for improving the test suite.
> >>
> >>> bad news:
> >>> * i can't merge my own find(1) fix because the test i added is so
> >>> flaky i can't get it in. (but really, that's a feature, and should
> >>> count as good news.)
> >>
> >> Which find fix?
> >
> > the dangling symlinks one. you already merged the test fix, but i just
> > sent you another fix now i've run it on Android, not just my laptop
> > :-/
>
> I just looked and replied to that. See my concerns about internationalization in
> libc...
>
> >>> good news:
> >>> * very little flakiness so far. i've only seen a couple of tests (one
> >>> cpio, one tar) failing, and both make me suspicious that there's still
> >>> some bad behavior in dd somewhere.
> >>
> >> It's in pending for a reason. :)
> >
> > yeah, i learned my lesson though --- nothing in pending is used for
> > the build :-)
>
> I need to clean up dd. I'm reaching the end of my 6 months off (probably flying
> straight from ELC to a job site for work) and the big things I got done were tar
> (still need to add xattr support) and about the first 1/3 of toysh. (Trying to
> make toysh at least usable before the talk on the 22nd, but it doesn't expand
> environment variables yet, and for the past ~3 days I went down the rathole of
> rewriting the parsing logic to handle (()) because for ((;;)) is a syntax I
> while filling out the flow control logic. I hadn't noticed busybox ash doesn't
> support it before now...)
>
> Working as fast as I can.
> Hobby time is finite. :(
>
> (Did you know busybox hush is 11000 lines and busybox ash is 14000? I thought
> they were smaller. Hadn't looked at either in years, did a wc on both yesterday
> filling out my talk outline, and was surprised. I'm still trying to keep toysh
> under 3000 lines, but we'll see...)

mksh is 35kloc. tcsh is 60kloc. i remain skeptical of an "i can't
believe it's not bash" shell that's also small. iirc inferno had two
shells, a tiny one and a small one (docs for the small one:
http://www.vitanuova.com/inferno/papers/sh.html), but they weren't in
C, so their size differences are probably more due to that.

> >> BUT cursor up and hit enter, and you get infinite hello.
>
> Don't get me started on bash's command history not producing the command you
> typed in. I wrote this in yesterday's blog entry:
>
> Dear bash: if I end a line with a #comment but there's a continuation, and I
> then cursor up in the history? Don't glue the later lines after the #comment
> so they get eaten.
>
>   $ burble() #comment
>   > { hello;};burble
>   hello
>   $ burble() #command { hello;};burble
>
> The two are not equivalent. (You can do multiline history! Really!)
>
> (I've now got blog entries edited and posted up through the end of March! Given
> that "catch up on my blog" was my lowest tier patreon goal and it got met a
> while ago, I feel _really_ guilty about that. Working on it, but also making
> sure to regularly-ish write _new_ blog entries with enough detail I can turn
> them into something coherent later, so it's a moving target...)
>
> >> Oh, and ;; is a special token but AGAIN, in this special case:
> >>
> >>   $ for((;;));do echo hello;done
> >>
> >> It turns into two separate ; (presumably retroactively again).
>
> OR you could rewrite the parsing logic to treat (( like $( as a form of quoting
> context, and thus the whole thing is one big token so the ; and < recognizing
> logic doesn't trigger because it's not a separate token but instead in the
> middle of a larger word (which then gets broken down by math recognition logic
> and variable expansion logic, respectively).
>
> I have to add a hack to run_function() to recognize a command starting with ((
> and ending with )) but that's fine.
> (I need to add _another_ one to the
> for/select control flow logic anyway, because for ((;;)) has two semicolons in
> it and standalone ((math)) doesn't understand semicolons. The "parse this math
> expression string and give me back the value" function is the same, but each has
> to call it separately after parsing.)
>
> Except in MY run_function() I'm putting the ((math)) recognizer _after_ the local
> variable assignment logic, and bash put it before:
>
>   $ X=4 ((X<5)) && echo true
>   bash: syntax error near unexpected token `('
>
> It works if you put a semicolon after the 4 but there's no obvious reason you'd
> NEED to. :P (And WHY does bash use backquote and quote? Why is it trying to do
> Microsoft Smart Quotes in ASCII? Seriously, can someone convince the gnu/dammit
> guys to just stop please?)

it looked right (by accident) on a real VT100 :-/

> Oh, and any variable that's waiting for a closing parenthesis as its end quote
> has to count quote depth:
>
>   $ echo $( ( )
>   >
>
> It's a line continuation. But:
>
>   $ echo $( "(" )
>   bash: (: command not found
>
> isn't. So I spent a large chunk of yesterday teaching parse_word() how to handle
> all that, which is only TANGENTIALLY related to implementing for/select. (Which
> has its own fun. Did you know "for name do echo hello; done" only needs one
> semicolon? "for name in; do echo hello; done" needs a semicolon. The for/select
> syntax is actually:
>
>   for {((;;))|name [in...]} do
>
> So you can have:
>
>   $ for ((i=0;i<1;i++)) do echo $i;done
>   0
>
> (You can have a semicolon or newline before/after the do, but not after the for.
> The one before done is required.) Or you can have:
>
>   $ boingy() { for i do echo $i; done }; boingy uno dos desqview
>   uno
>   dos
>   desqview
>
> because a for with a variable name but no in list iterates over the positional
> parameters.
>
> Or you can have an in list, which can be on the next line after the for.
> (The variable name can't be: you either have a ((;;)) block with exactly two
> semicolons or you have a variable name which can't have an = in it...
>
> Ok, there's a whole "what constitutes a valid shell variable" thing I'm unhappy
> about. Did I already rant here about bash filtering out variable names it
> doesn't like at startup? Or was that just in my blog you haven't seen yet
> because the web's still months behind? Bash only allows a-zA-Z0-9 and _ and the
> first letter can't be a digit, but you can set _way_ more variables than that
> with env, up to and including a zero length variable name. (env =hello env |
> grep hello). And bash just drops 'em. That's insane, it means you can't have
> utf8 variable names! It's based on a misreading (I.E. conservative reading) of
> posix and I'm not putting up with it. BUT I can't have a variable called ((;;))
> accepted by for as an assignment target either, can I?
>
> Oh, and this:
>
>   ((1+2)) > frog
>
> works for some reason. It seems to be the same post-block redirection context as
>
>   if true; then for i in 1 2 3; do echo $i; done > frog; fi < walrus
>
> Except I'm unaware of any circumstances under which it can produce or consume
> output? Oh well, at least I can work out where to stick it...
>
> P.S. At some point I need to go reread the posix spec on what THEY think the
> shell is supposed to do, but honestly it would just be a distraction right now.
> I intend to implement what bash DOES based on reading the bash man page and help
> output, and then sticking bash in the comfy chair and poking it with the soft
> cushions until it confesses and writing down Too Many Tests (see attached, not
> REMOTELY done and I need to turn them all into machine runnable tests)...
>
> And then maybe compare it with what posix says _afterwards_, but they're behind
> the curve here. I blame Jorg Schilling.
>
> >> I am adding tests to the giant test file and grimly implementing.
> >> Alas since the
> >> syntax checking pass and the running pass are different (I'm sharing as much
> >> code as I can between them but they fundamentally do different things), I have a
> >> lot more syntax parsing than executing yet. :)
> >
> > i don't know how you have the stomach for the shell... just thinking
>
> Never underestimate the motivational power of spite:
> https://www.youtube.com/watch?v=2t-hyB8ibgk
>
> (Hmmm, maybe I should cover that in my talk on the 22nd?)
>
> > about it makes me miss
> > [rc](http://doc.cat-v.org/plan_9/4th_edition/papers/rc) which -- like
> > so much of Plan 9 -- sadly wasn't better _enough_ to make a
> > difference. utf8 being the obvious exception.
>
> And /proc, and kvm's virtfs is the 9p.linux protocol under the covers, and bash
> implements /dev/{tcp,udp}/address/port as a redirection target...
>
> Plan 9 was great, the problem is it was proprietary. They charged money for the
> binaries for pretty much its entire development history, and you couldn't even
> BUY source until 1992 (when the cheapest academic price was $350). They didn't
> open the source until 2002 (and then under the "lucent license" which couldn't
> share code with anything else), by which point that ship had sailed.
>
> It doesn't matter what your OS does: staying closed source binary only a decade
> after Linux came out was not interesting. IBM couldn't make that work with OS/2.
> And by the time Plan 9 DID open its source it was old news. It had been going
> for 20 years and people said if it was going to make a splash it would have done
> so before now, so they didn't look at it _then_. First it was closed source,
> then it was old. Being payware prevented anybody from having a copy, and if
> nobody has it and nobody can see there's nobody to tell anybody _else_ about it
> and its community couldn't grow... and then the lack of community was taken as
> evidence it was uninteresting.
> :(
>
> Plan 9 failed for the same reason Unix succeeded. Unix was open source because
> it couldn't NOT be: copyright wasn't extended to cover binaries until 1983, and
> Ken Thompson took a year off from Bell Labs to teach at his alma mater (the
> university of california at Berkeley) in 1975, including two semesters of a
> class on "this OS I wrote", spawning BSD maintained by the students he taught.
>
> > i had a look at escape characters at the weekend, thinking i'd clean
> > up some of the duplication, mainly so i could remove the
> > even-more-duplication in my hexedit patch that's the reason you
> > haven't seen that yet.
>
> I keep meaning to sit down and add a hd command, but I'm not sure where to put
> it. (In hexedit? In od? I could make it standalone, but hd is really "hexdump
> -C" and full hexdump is basically a different syntax for posix od, so that's
> probably the logical place to stick it...)
>
> > (but also so i could add \u without adding it
> > multiple times) ... and all i ended up with was a bunch of tests
> > showing how insane and different echo and printf are.
>
> Sadly, there's a reason they don't share code. (Well, modulo the unescape()
> function in lib/lib.c, but that's trivial.)
>
> > the differences
> > between printf's direct interpretation of escapes and its %b
> > interpretation particularly made my day.
>
> I made it #*%(&#( _work_. And added tests. And have a todo bug sitting here
> about %b getting something wrong, which is related to this uncommitted patch in
> my printf.c:
>
> @@ -44,11 +44,14 @@ static int handle_slash(char **esc_val, int posix)
>    // 0x12 hex escapes have 1-2 digits, \123 octal escapes have 1-3 digits.
>    if (eat(&ptr, 'x')) base = 16;
>    else {
> -    if (posix && *ptr=='0') ptr++;
> -    if (*ptr >= '0' && *ptr <= '7') base = 8;
> +    if (*ptr!='0') posix = 0;
> +    if (ptr[posix] >= '0' && ptr[posix] <= '7') {
> +      base = 8;
> +      ptr += posix;
> +    }
>    }
>    len = (char []){0,3,2}[base/8];
> -
> +dprintf(2, "len=%u base=%u\n", len, base);
>
> But unfortunately I don't remember what the actual _bug_ was. I thought I had a
> failing test in tests/printf.test for it, but git diff shows that file is clean?
> Maybe it was in an earlier tree? I'm pretty sure it's in the back mailing list
> posts, but they're not easily searchable. Maybe a back blog entry...

actually, that looks like the fix for one of the GNU vs toybox
differences i ran into. if you do actually want to be fully (what i
think is) bug compatible, between us i think we have the tests and the
fixes :-)

> > i'll come back to that at
> > some point when my stomach's recovered...
>
> Welcome to my world.
>
> Rob
_______________________________________________
Toybox mailing list
[email protected]
http://lists.landley.net/listinfo.cgi/toybox-landley.net
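
Editorial note on the printf discussion above: a minimal sketch of the two octal escape spellings that POSIX defines for printf, one for the format string and one for a %b argument. It is only an illustration of the distinction the handle_slash() patch is dealing with, not a reproduction of the bug mentioned in the thread (which the thread itself does not identify). How a given implementation treats other spellings inside %b is where implementations may diverge, and that is not shown here.

```shell
# Format-string escape: backslash followed by one to three octal digits.
printf '\101\n'          # prints A (octal 101 is 65)

# %b argument escape: backslash, a 0 prefix, then up to three octal digits.
printf '%b\n' '\0101'    # prints A
```

Both commands print the same byte, but the escape is spelled differently depending on whether printf reads it from the format or from a %b argument.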
