Re: [Toybox] pull: fix modprobe, login, switch_root, improve init, reboot

Rob Landley Sat, 03 Oct 2015 03:45:41 -0700

On 10/02/2015 01:31 PM, Rich Felker wrote:
> On Fri, Oct 02, 2015 at 08:00:24AM -0700, Isaac Dunham wrote:
>> [in the context of long running processes]
>>> Rich Felker wrote:
>>>> Isaac Dunham wrote:
>>>>> Agreed. But then, wouldn't xrun() also be the wrong thing?
>>>> Why?
>> If I'm not mistaken, xfork() and XVFORK() will perror_exit on failure.
>> This results in a fork-bomb killing a long-running hotplug helper.
> 
> Indeed. I suspect being robust against resource-exhaustion conditions
> is going to be harder than just avoiding these 'x' functions, but it's
> a necessary condition.


Which I have infrastructure for, although making daemons not call the
xwrap functions in the first place is easier to clean up after. It's not
just daemons, by the way. The shell is another "shouldn't randomly exit
when it hits a problem" command, although stdout closing is still
numberwang.

There's some todo plumbing I need to think more about, such as writing
-1000 to /proc/self/oom_score_adj and -17 to /proc/self/oom_adj. If a
daemon really wants to tell the oom killer that it's not a first choice
for killing, there are ways to do it. The giant uclinux triage I just
added to the roadmap included a "nooom" command that basically did that,
in the mode of "nice" and "detach" and "chroot" and "nsenter" and so on.
(You could make a longish command line with taskset and timeout and
setsid and time and so on. I have no idea why Eric Raymond decided this
was called "Bernstein Chaining" since it predates that guy by decades.
Heck, "ssh" and "sudo" are commands that execute the rest of their
command line in a given context. So does xargs. It's a thing.)
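
To make the oom adjustment above concrete, here is a minimal sketch in C of
what a "nooom"-style wrapper might do: write the value, then exec the rest of
the command line. The names write_int_to_file() and nooom_exec() are
hypothetical, invented for this illustration; this is not the uclinux command
or anything in toybox. It assumes the kernel keeps the setting across exec and
that the caller has enough privilege to lower its own score.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

// Hypothetical helper: write a decimal value plus newline to an existing
// file (such as a /proc knob). Returns 0 on success, -1 on failure.
int write_int_to_file(const char *path, int value)
{
  char buf[32];
  int len = snprintf(buf, sizeof(buf), "%d\n", value);
  int fd = open(path, O_WRONLY);
  ssize_t rc;

  if (fd == -1) return -1;
  rc = write(fd, buf, len);
  close(fd);

  return (rc == len) ? 0 : -1;
}

// Hypothetical wrapper: ask the oom killer to skip us, then run argv[]
// in that context. Failure to adjust is ignored (unprivileged callers
// usually cannot lower their score), so the command still runs.
int nooom_exec(char **argv)
{
  // Try the newer knob first, fall back to the legacy one.
  if (write_int_to_file("/proc/self/oom_score_adj", -1000))
    write_int_to_file("/proc/self/oom_adj", -17);
  execvp(argv[0], argv);

  return -1; // only reached if the exec itself failed
}
```

A main() for such a command would just call nooom_exec(argv+1) after checking
that there is a command to run, the same shape as nice or setsid.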

Anyway, proper daemons are a thing. I have tcpsvd in pending that I need
to look at (and possibly merge with netcat, although tail -f needs
similar plumbing too...).

> It would be nice to audit all the toys that are
> intended to be long-running rather than commands that just do their
> thing and exit to reduce or eliminate any fatal exits after they reach
> the 'long-running' part.

Except strlower() calls xstrdup() in the I18N case and dlist_add() calls
xmalloc() and dirtree_add_node() calls xzalloc()... You can _try_ to
avoid it, but it's not a simple thing to audit. (And no, you can't check
whether or not xexit() and such are linked in because common
infrastructure like toy_init() and the option parsing logic use them.)

This is why I have the "longjmp back to a recovery point instead of
exit" logic. It may leak resources (although we can _try_ to avoid and
clean up after that) but it lets you recover from failures. Currently
only toysh is using it (that was my prototype implementation and proof
of concept) but the concept is genericizable.
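
As a rough illustration of the "longjmp back to a recovery point" idea, here
is a minimal standalone sketch in C. Every name in it (rebound, fatal_exit,
checked_malloc, run_recoverable, and the two demo tasks) is hypothetical and
chosen for this example; it is not the toybox code, just the general
technique under the assumption that all fatal paths funnel through one exit
function.

```c
#include <setjmp.h>
#include <stdlib.h>

static jmp_buf *rebound;    // current recovery point, NULL means really exit
static int pending_status;  // status passed to the most recent fatal_exit()

// Every fatal error path ends up here instead of calling exit() directly.
void fatal_exit(int status)
{
  if (rebound) {
    pending_status = status;
    longjmp(*rebound, 1);
  }
  exit(status);
}

// Example of an aborting wrapper: it never returns NULL to the caller.
void *checked_malloc(size_t size)
{
  void *p = malloc(size);

  if (!p) fatal_exit(1);

  return p;
}

// Run fn(arg) with a recovery point installed. Returns 0 if fn returned
// normally, else the status handed to fatal_exit(). (A fatal_exit(0) is
// indistinguishable from success in this sketch.) Anything fn allocated
// before the jump is leaked unless the caller tracks and frees it.
int run_recoverable(void (*fn)(void *), void *arg)
{
  jmp_buf env;
  jmp_buf *saved = rebound;  // allow nesting: restore the outer point
  int status = 0;

  if (!setjmp(env)) {
    rebound = &env;
    fn(arg);
  } else status = pending_status;
  rebound = saved;

  return status;
}

// Demo task that hits a "fatal" error with the status in *arg.
void failing_task(void *arg)
{
  fatal_exit(*(int *)arg);
}

// Demo task that completes normally and increments *arg.
void ok_task(void *arg)
{
  ++*(int *)arg;
}
```

A long-running loop would wrap each unit of work in run_recoverable() and log
a nonzero return instead of dying, while one-shot commands leave rebound NULL
and keep the plain abort-on-error behavior.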

The standard idiom in toybox is to abort on fatal errors, which is the
right thing to do 90% of the time and means we're not _ignoring_ errors
by failing to check for them. I can't change that idiom for the
remaining 10%, but I can convert it into exception handling with
throw/catch. That's not ideal, but it's workable.

(I have actually thought about this before. It's on the todo list. And
it affects the nommu stuff too because allocation failures are _much_
more likely in a context where all allocations must be contiguous and
memory fragmentation limits your maximum allocation size, so malloc
failures aren't just due to resource exhaustion there...)

> Rich

Rob
_______________________________________________
Toybox mailing list
[email protected]
http://lists.landley.net/listinfo.cgi/toybox-landley.net