On 10/02/2015 01:31 PM, Rich Felker wrote:
> On Fri, Oct 02, 2015 at 08:00:24AM -0700, Isaac Dunham wrote:
>> [in the context of long running processes]
>>> Rich Felker wrote:
>>>> Isaac Dunham wrote:
>>>>> Agreed. But then, wouldn't xrun() also be the wrong thing?
>>>> Why?
>> If I'm not mistaken, xfork() and XVFORK() will perror_exit on failure.
>> This results in a fork-bomb killing a long-running hotplug helper.
>
> Indeed. I suspect being robust against resource-exhaustion conditions
> is going to be harder than just avoiding these 'x' functions, but it's
> a necessary condition.

Which I have infrastructure for, although making daemons not call the
xwrap functions in the first place is easier to clean up after.

It's not just daemons, by the way. The shell is another "shouldn't
randomly exit when it hits a problem" command, although stdout closing
is still numberwang.

There's some todo plumbing I need to think more about, such as writing
-1000 to /proc/self/oom_score_adj and -17 to /proc/self/oom_adj. If a
daemon really wants to tell the oom killer that it's not a first choice
for killing, there are ways to do it. The giant uclinux triage I just
added to the roadmap included a "nooom" command that basically did
that, in the mode of "nice" and "detach" and "chroot" and "nsenter" and
so on.

(You could make a longish command line with taskset and timeout and
setsid and time and so on. I have no idea why Eric Raymond decided this
was called "Bernstein Chaining" since it predates that guy by decades.
Heck, "ssh" and "sudo" are commands that execute the rest of their
command line in a given context. So does xargs. It's a thing.)

Anyway, proper daemons are a thing. I have tcpsvd in pending that I
need to look at (and possibly merge with netcat, although tail -f needs
similar plumbing too...).

> It would be nice to audit all the toys that are
> intended to be long-running rather than commands that just do their
> thing and exit to reduce or eliminate any fatal exits after they reach
> the 'long-running' part.

Except strlower() calls xstrdup() in the I18N case and dlist_add()
calls xmalloc() and dirtree_add_node() calls xzalloc()... You can _try_
to avoid it, but it's not a simple thing to audit. (And no, you can't
check whether or not xexit() and such are linked in because common
infrastructure like toy_init() and the option parsing logic use them.)

This is why I have the "longjmp back to a recovery point instead of
exit" logic. It may leak resources (although we can _try_ to avoid and
clean up after that) but it lets you recover from failures.
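
Roughly, the "longjmp back to a recovery point instead of exit" idea
can be sketched like this. This is only an illustration, not the actual
toybox code: every name in it (recovery_point, last_failure,
fatal_exit, checked_malloc, run_guarded, demo_work) is made up for the
example, and it assumes a single-threaded program.

```c
#include <setjmp.h>
#include <stdio.h>
#include <stdlib.h>

/* All names here are hypothetical, not toybox's real API. */

jmp_buf *recovery_point;  /* when non-NULL, fatal errors jump here */
int last_failure;         /* exit code the failed work asked for */

/* Stand-in for an xexit()-style fatal path. */
void fatal_exit(int code)
{
  if (recovery_point) {
    last_failure = code ? code : 1;
    longjmp(*recovery_point, 1);
  }
  exit(code);
}

/* Stand-in for an xmalloc()-style wrapper: never returns NULL. */
void *checked_malloc(size_t size)
{
  void *p = malloc(size);

  if (!p) {
    fprintf(stderr, "out of memory\n");
    fatal_exit(1);
  }
  return p;
}

/* Run one unit of work. Returns 0 on success, or the code passed to
 * fatal_exit() if the work hit a fatal error. Either way the caller
 * keeps running. Anything the work allocated before failing leaks
 * unless the caller tracks it and cleans up. */
int run_guarded(void (*work)(void *), void *arg)
{
  jmp_buf here, *saved = recovery_point;

  if (setjmp(here)) {
    recovery_point = saved;   /* arrived via longjmp: work failed */
    return last_failure;
  }
  recovery_point = &here;
  work(arg);
  recovery_point = saved;     /* normal return path */
  return 0;
}

/* Demo work: *arg nonzero simulates a fatal error partway through. */
void demo_work(void *arg)
{
  char *buf = checked_malloc(64);  /* could itself take the fatal path */
  int fail = *(int *)arg;

  free(buf);
  if (fail) fatal_exit(2);
  puts("work finished");
}
```

A long-running loop would call run_guarded(demo_work, &flag) once per
request and log a nonzero return instead of dying. Saving and restoring
the previous recovery point lets guarded calls nest.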

Currently only toysh is using the longjmp recovery logic (that was my
prototype implementation and proof of concept) but the concept is
genericizable.

The standard idiom in toybox is to abort on fatal errors, which is the
right thing to do 90% of the time and means we're not _ignoring_ errors
by failing to check for them. I can't change that idiom for the
remaining 10%, but I can convert it into exception handling with
throw/catch. That's not ideal, but it's workable.

(I have actually thought about this before. It's on the todo list. And
it affects the nommu stuff too because allocation failures are _much_
more likely in a context where all allocations must be contiguous and
memory fragmentation limits your maximum allocation size, so malloc
failures aren't just due to resource exhaustion there...)

> Rich

Rob
_______________________________________________
Toybox mailing list
http://lists.landley.net/listinfo.cgi/toybox-landley.net
