I checked in some nommu changes I've been wrestling with forever. It's about 1/3 of the pending architectural stuff, but it was one of the bits blocking me from figuring out how to get it to work right.

What I did was 1) replace toys.recursion with toys.stacktop, and 2) add an xvfork() wrapper that forces any later xexec() call to exec() rather than recursively calling toy_exec() in the same process.

Yanking the toys.recursion variable and replacing it with toys.stacktop means I can do subtraction and approximate the amount of stack used, and use that as a proxy for leaked heap and stuff. It's currently forcing an exec when we hit 6k of stack used, which is small but the default nommu stack size is 8k. This also lets me set toys.stacktop to NULL to signal abnormal vfork-related states. (I played around with setting toys.recursion to -1 but it was awkward.)

I also added an xvfork() to lib.h, which is a static inline because you can't wrap vfork() with a normal function. (Because vfork's child shares a stack with its parent, so if it returns from a function and then calls another one the parent gets confused trying to return from _its_ version of that function: the return pointer on the stack got stomped. Normally vfork() produces an inline system call, not a call to a function in libc. Possibly I should update nommu.org to explain that?)

But the reason I need to wrap vfork() is A) to catch failures (fork can fail! Returning nonzero doesn't always mean you launched a child, -1 means we're out of PIDs!) and B) to zero out stacktop, which tells xexec() never to recurse. Once we vfork(), we need to exec() to unblock the parent (and _until_ we do that, we're stomping on shared resources), so xexec() needs to know we called vfork().

I'm not quite happy with this new infrastructure because:

A) some callers want to handle their own errors rather than error_exit(), so they have to zero out toys.stacktop themselves. (But I don't have a standard prefix for "I wrapped this function but not to error_exit() on failure".
This is actually a persistent problem I may need to fix someday, but it balances against "how many wrappers do I _need_...")

B) once a parent has zeroed stacktop it'll never recurse again. (Not until you exec, anyway.)

The recursion is _mostly_ an optimization. When you fork and exec, the exec is something like 95% of the overhead and fork is really cheap in comparison. (Ok, there are pathological parents you can fork from where fork itself gets expensive, but if toybox ever winds up behaving like mozilla we've already failed.) However, the main time this comes up is running shell scripts (lots and lots of commands so it adds up), and toysh needs to do this manually rather than calling xexec() because it needs to run nofork commands in the current process context (stuff like "cd" or "export" that's a NOP if a child does it for you).

On the other hand, sometimes the $PATH hasn't got stuff, because you threw a static toybox on a broken system or you just chrooted into a container or some such. In which case recursing is the only way to call other commands. (Example: mount may call losetup internally, and if we're using that to set up a container...) But I'm leaning towards not caring about that case because the _main_ user of it is "I dropped an exploit binary on a system and am gonna p0wn it now" which... isn't interesting. If you boot from recovery media, you can have an initramfs. If you're setting up a container, you can mount a tmpfs and then umount it again so you have something to set your $PATH to.

Anyway, that's the stuff I've been working on. The remaining hard case is when you have to re-exec _yourself_ (if you vfork, you have to exit or exec to unblock the parent and stop stomping their stack and heap!), and the test case I'm using is cpio's passthrough mode. In those cases, I'm hijacking xpopen_both() so when you feed it a NULL for argv[] it execs /proc/self/exe with the existing toys.argv.
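
A minimal sketch of how those pieces could fit together, not the code that got checked in. Every name here (stack_used(), must_exec(), vfork_check(), XVFORK(), run_child(), and the file-scope stacktop standing in for toys.stacktop) is made up for illustration, and the stack limit is passed in rather than hardwired. It also assumes a macro plus a helper that only post-processes vfork()'s return value, instead of the static inline described above, so that vfork() itself is always called from the caller's own stack frame.

```c
#define _DEFAULT_SOURCE   // expose vfork() under strict -std= modes on glibc
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical stand-in for toys.stacktop: address of a local near the top
// of the stack, or NULL once we have vforked and must never recurse.
static char *stacktop;

// Rough stack use: distance between the recorded top and a current local.
// Takes the absolute difference because the stack may grow either way.
static size_t stack_used(const char *top, const char *here)
{
  uintptr_t a = (uintptr_t)top, b = (uintptr_t)here;

  return a > b ? a - b : b - a;
}

// Nonzero when the next command must be a real exec() instead of recursing
// in the current process: either we vforked, or too much stack is in use.
static int must_exec(const char *top, const char *here, size_t limit)
{
  return !top || stack_used(top, here) > limit;
}

// Post-process vfork()'s return value: die on failure, and in the child
// zero stacktop so later code knows it is not allowed to recurse.
static pid_t vfork_check(pid_t pid)
{
  if (pid == -1) {
    perror("vfork");
    exit(1);
  }
  if (!pid) stacktop = NULL;

  return pid;
}

// vfork() runs in the caller's frame; only its result goes to the helper.
#define XVFORK() vfork_check(vfork())

// Run exe with argv in a vforked child. Returns the child's exit status,
// or -1 if it could not be waited for or did not exit normally.
static int run_child(const char *exe, char **argv)
{
  int status;
  pid_t pid = XVFORK();

  if (!pid) {
    execv(exe, argv);
    _exit(127);   // exec failed: leave without returning from this function
  }
  if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status)) return -1;

  return WEXITSTATUS(status);
}
```

With that, the re-exec-yourself case would be something like run_child("/proc/self/exe", toys.argv), which only works when /proc is mounted.
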
I _was_ looking at "find your binary again even if proc isn't mounted!" which meant preserving the /path/to/argv0 in toys.argv and making sure we did the vfork() before we ever did a chdir, and it _still_ wouldn't work if there was a chroot in there, and I basically went "screw it: in the nommu support case, require /proc to be mounted to re-exec yourself. In the with-mmu case, just use fork()."

So still banging on that. Sorry this has been blocking everything else, but when I've got pending patches to main() it's hard to test anything else, and git is really annoying about pulling one tree into another tree with changes because it will NEVER MERGE, it just says "you have pending changes to this file which this pull will squash" and the change being way at the other end of the file means nothing... Grrr.

Working on it...

Rob
