Denys Vlasenko wrote on Wed, Nov 08, 2023 at 04:50:12PM +0100:
> On Wed, Nov 8, 2023 at 3:58 AM Dominique Martinet
> <asmad...@codewreck.org> wrote:
> >
> > From: Dominique Martinet <dominique.marti...@atmark-techno.com>
> >
> > On alpine linux, calling busybox time with a nonexisting command would
> > hang forever:
> > ```
> > time: can't execute 'fdsa': No such file or directory
> > Command exited with non-zero status 127
> > real    0m 0.00s
> > user    0m 0.00s
> > sys     0m 0.00s
> > <hang>
> > ```
> >
> > This is because time uses vfork then BB_EXECVP_or_die, which ultimately
> > calls exit() after execvp fails. exit() is explicitly listed as unsafe
> > to call after vfork as the child process shares its memory with the
> > parent; in particular on musl exit will take the init_fini_lock() and
> > calling it twice in the same address space will just hang.
> > _exit() on the other hand is safe to call, so replace die_func with it.
> 
> Doesn't happen to me... maybe I have an older musl.

I've just checked old alpines -- I can reproduce this all the way from
alpine 3.10 (musl 1.1.22, busybox 1.30.1) to current edge (musl 1.2.4).
alpine 3.9 (musl 1.1.20, busybox 1.29.3) didn't hang; I didn't check all
the way but time.c didn't seem to change in that time frame so it's
probably a "new" (~2019) bug introduced by the libc change, yes.

> > For this particular problem die_func could just be overriden in
> > time.c's run_command after vfork, but this problem actually came up
> > before in shell/hush.c as per the fflush_and__exit comment so it seems
> > more appropriate to do this globally in the xvfork macro and get rid of
> > this whole class of bugs forever.
> 
> Maybe we can always use _exit(),
> or fflush_all(); _exit() in xfunc_die()?

hmm, exit does a bunch of stuff:
- we apparently use atexit() in a few applets, so these would need the
real exit() to get invoked
- any lib destructor would also be called, I'm not sure we rely on any
so that's probably fine. Same with on_exit(), not used directly.
- The man page for exit(3) on my system also says files created by
tmpfile(3) are removed; we don't seem to use it either.
- flushing and closinig all FILEs; I agree close itself isn't needed,
fflush_all() as you've suggested is probably enough... And hush.c has
been calling it after vfork so it's probably fine? because while the
address space is shared, that takes and releases locks when it's done,
so while it's not explicitely safe it's probably ok. And fflush is
definitely needed in the normal case.

I'm not sure there's anything else that needs to be done on exit?


So that leaves atexit hooks which we need in some applets (mount, sed --
the one in libpwdgrp is just a free "to make valgrind happy" so can be
skipped); if we update these two to override die_func instead (don't
even need to use exit there, they can just register their hook as
die_func instead of atexit), then I guess it would be ok to do
`fflush_all(); _exit()` in xfunc_die(), but I'm not 100% comfortable
with that so that'll require a bunch of testing.

I can update the patch if you really prefer this, but I won't be able to
test much more than running the test suite on x86_64/debian and alpines
(I guess it's a good excuse to add a new test item for 'time
notavalidcommand' at the same time), it'll probably require a bit more
validiation.
-- 
Dominique Martinet | Asmadeus
_______________________________________________
busybox mailing list
busybox@busybox.net
http://lists.busybox.net/mailman/listinfo/busybox

Reply via email to