On Sat, Apr 9, 2016 at 2:52 PM, Rob Landley <[email protected]> wrote:
> I spent last week traveling to present at two conferences:
>
> http://flourishconf.com/2016/speakers.php?id=18
>
> http://openiotelc2016.sched.org/event/6D9v/building-a-cpu-from-scratch-jcore-design-walkthrough-rob-landley-jeff-dionne-se-instruments
>
> Hopefully the videos will be up before too long. The slides are
> http://landley.net/talks/flourish-2016.txt and
> http://j-core.org/ELC-2016.pdf

Welcome back, and thanks for the links! I'm always looking for things to
put on my queue of programming videos. I have to admit some ignorance
about open hardware and nommu, but it sounds interesting.

> And several days before that reinstalling my laptop (well, before
> reinstalling I tried to upgrade ubuntu 12.04 to 14.04 _without_ a
> reinstall, that video is at https://www.youtube.com/watch?v=26RTlPgg-tA ).

FWIW I think it's always wise to leave yourself a path to back out, rather
than doing destructive updates (which I gather was an issue). I have two
interchangeable OS partitions, and $HOME lives on a third data partition.
So if the new OS has issues I can always go back to the old one by
rebooting.

> So if you have a pending todo item that you think I should be working
> on, giving me a poke to remind me where I was wouldn't go amiss.

I think we have a few open threads:

(1) expr has memory safety issues on master. I posted a message
demonstrating this with the ASAN patches:

http://lists.landley.net/pipermail/toybox-landley.net/2016-March/004884.html

A lot of this has fallen out of my cache, but I think at this point it
makes sense to just follow the original strategy and be done with it:

http://lists.landley.net/pipermail/toybox-landley.net/2016-March/004827.html

This patch is easy to inspect as correct... (and passes ASAN tests, etc.)
The other ones require a lot of code reading and are not correct.

The common case at runtime I suspect is 0, 1, or MAYBE 2 mallocs, so it
basically isn't going to matter AFAICT ... if there is a fast way to grep
for expr on say Aboriginal packages that could help answer this.

(2) I sent patches for a couple of buffer overflows found by
AddressSanitizer. These are quite obvious and should be easy to review.

http://lists.landley.net/pipermail/toybox-landley.net/2016-March/004852.html
http://lists.landley.net/pipermail/toybox-landley.net/2016-March/004853.html

Given that the test coverage is fairly low (e.g.
there was an IRC-reported bug in 'paste', but no tests for paste), I think
that if we made a wide pass and added very basic "hello world" tests for
every command, we'd find dozens of real bugs.

The nice thing about ASAN, etc. is that it increases the value of any
tests you write. For every code path you hit, you get assertions for
free. I'm sure that building Aboriginal with toybox_asan, etc. would shake
out tons of issues.

(3) The big test harness patches. I sent out patches to run tests with 3
sanitizers: AddressSanitizer, MemorySanitizer, and
UndefinedBehaviorSanitizer. They are based on runtime instrumentation, so
there are pretty much no false positives (except maybe MSAN, because I
think you need to build libc with it too). Nonetheless they are finding
real bugs.

In addition, I got code coverage working a few weeks ago, which I think
you expressed interest in. I didn't send out a patch for it. This is a 4th
form of runtime instrumentation -- as I recall it requires a change to
toybox source, because toybox calls _exit() rather than exit(). The
runtime instrumentation hooks exit() to write the coverage data file.

I'm wondering what the general opinion is on these... I think the expr
example shows that they are quite useful. I saw that you started applying
some of the fixes... although I would caution that there is actually some
substantial knowledge/testing embodied in those patches -- I wrote them
twice as mentioned, once in make and once in shell, and it definitely got
better the second time.

(4) I didn't respond to your last e-mails yet... we were talking about
shell and make.

So I actually have been cranking on my shell implementation in the last
1-2 weeks :) I prefer having something concrete to talk about.

This is something I've wanted to do for a long time. I had been
experimenting with Python, Lua, OCaml, Lisp, etc. over the years. And I
guess toybox was to some extent an evaluation for writing it in C.
But it ended up in C++, which is surprising to me (and probably to some
people who have heard me complain about C++).

But it's turning out really well -- one thing that enabled it is the re2c
code generator, which is huge. It takes regexes and generates compact
state machines with no dependencies. It's modular and doesn't force your
code into weird patterns (very much the opposite of lex/flex).

For the parser, I took the POSIX shell grammar and ported it to ANTLR (but
I'm NOT using ANTLR for any code generation, because it's ridiculously
unsuitable for that, in contrast to re2c).

One thing I didn't quite realize is how much context-free grammars are
like code... they need refactoring, testing, and have performance
implications. IMO the POSIX grammar is not really suitable for use
directly as source code, although bash seems to have done that...

ANTLR actually does a nice lookahead analysis on the grammar, which can
help performance. And it's a good tool because it uses top-down parsing
algorithms, which fit the shell a lot better than bottom-up parsing like
Yacc.

I'm close to the point of porting this debugged grammar to a recursive
descent parser, which will then be able to accommodate all the special
cases that POSIX lays out. (I am trying to avoid the asymptotic approach
to correctness that some shells seem to use...)

I also wrote a little shell test framework and have been torturing the
nooks and crannies of bash and dash. I found one place where bash is not
POSIX conformant and dash is, and some other differences. All in all I
think the POSIX grammar is pretty good and helpful, although it only
covers a single sublanguage out of the 3 or 4 that are in the shell.

So, the implementation obviously isn't going to be in toybox directly, but
it could be used for Aboriginal. That's still in the future though, since
I've only implemented a very basic runtime. It executes basic commands,
but nothing else. 95% of the work has been lexing/parsing so far.

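To make the recursive descent idea concrete, here is a minimal sketch in C.
It is not code from my shell; it covers only a tiny, hypothetical subset of
the grammar (words, '|', '&&', '||', no quoting or newlines), and the names
count_commands, struct parser, and T_ERROR are made up for illustration.
The rule names and_or and pipeline and the token names AND_IF and OR_IF are
borrowed from the POSIX grammar. Each grammar rule becomes one function,
with one token of lookahead.

```c
#include <assert.h>
#include <string.h>

/* Token kinds for a tiny, hypothetical subset of shell syntax. */
enum tok { T_WORD, T_PIPE, T_AND_IF, T_OR_IF, T_ERROR, T_EOF };

struct parser {
  const char *p;   /* next unread character */
  enum tok tok;    /* one token of lookahead */
  int commands;    /* simple commands recognized so far */
};

/* Hand-written lexer: advance to the next token. */
static void next_token(struct parser *ps)
{
  while (*ps->p == ' ' || *ps->p == '\t') ps->p++;
  if (!*ps->p) { ps->tok = T_EOF; return; }
  if (!strncmp(ps->p, "&&", 2)) { ps->p += 2; ps->tok = T_AND_IF; return; }
  if (!strncmp(ps->p, "||", 2)) { ps->p += 2; ps->tok = T_OR_IF; return; }
  if (*ps->p == '|') { ps->p++; ps->tok = T_PIPE; return; }
  /* A lone '&' is outside this subset, so report it as an error token. */
  if (*ps->p == '&') { ps->p++; ps->tok = T_ERROR; return; }
  /* Anything else is a word: run up to whitespace or an operator char. */
  while (*ps->p && !strchr(" \t|&", *ps->p)) ps->p++;
  ps->tok = T_WORD;
}

/* command : WORD+ */
static int parse_command(struct parser *ps)
{
  if (ps->tok != T_WORD) return 0;   /* syntax error: word expected */
  while (ps->tok == T_WORD) next_token(ps);
  ps->commands++;
  return 1;
}

/* pipeline : command ('|' command)* */
static int parse_pipeline(struct parser *ps)
{
  if (!parse_command(ps)) return 0;
  while (ps->tok == T_PIPE) {
    next_token(ps);
    if (!parse_command(ps)) return 0;
  }
  return 1;
}

/* and_or : pipeline (('&&' | '||') pipeline)* */
static int parse_and_or(struct parser *ps)
{
  if (!parse_pipeline(ps)) return 0;
  while (ps->tok == T_AND_IF || ps->tok == T_OR_IF) {
    next_token(ps);
    if (!parse_pipeline(ps)) return 0;
  }
  return 1;
}

/* Returns the number of simple commands, or -1 on a syntax error
 * (an empty line counts as an error in this sketch). */
int count_commands(const char *line)
{
  struct parser ps = { line, T_EOF, 0 };

  assert(line != NULL);
  next_token(&ps);
  if (!parse_and_or(&ps) || ps.tok != T_EOF) return -1;
  return ps.commands;
}
```

With this sketch, count_commands("ls -l | grep foo && echo ok") returns 3,
and count_commands("ls |") returns -1. The special cases are the point:
each one gets handled in the function for the rule it belongs to, rather
than being patched into a generated table.
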
I will have to write something up, because it seems there hasn't been a
modern implementation of shell in a while. Most implementations use gotos,
global variables, and macros a lot -- and not in a good way. For example,
bash's functions/macros to get the next token are somewhat ridiculous, and
ash/dash use some horrible goto tricks as well. The mksh lexer seems
somewhat principled, although I didn't follow it all.

Andy
