Re: Pending patches
On Thu, Nov 13, 2008 at 09:32:19AM +0300, Vladimir Dronnikov wrote:
>>> (or read -t 5 AREYOUSURE) for reboot/shutdown.
>> If admin said to reboot, we can presume he knows what he is doing.
> Nobody's perfect...
>
>>> * Is there a way to call kernel's emergency_{sync,remount}() from userspace? To get rid of userspace shutter-down.
>> Even if it is there, that is a quite dirty method.
> I still see "umount -a" is barking "Device is in use" at every filesystem I'm trying to umount from reboot.sh.

$ tail target/generic/target_busybox_skeleton/etc/inittab
# Stuff to do for the 3-finger salute
::ctrlaltdel:/bin/echo go away

# Stuff to do before rebooting
null::shutdown:/usr/bin/killall klogd
null::shutdown:/usr/bin/killall syslogd
null::shutdown:/sbin/swapoff -a
null::shutdown:/bin/umount -a -r
___
busybox mailing list
busybox@busybox.net
http://busybox.net/cgi-bin/mailman/listinfo/busybox
Re: Pending patches
> Do killall5 -KILL first. If it persists...

It does :( init locks /dev (console), /proc (self/exe). /sbin/init is a symlink to /usr/bin/init, /usr is unionfs of some squashfs and /var/changes, which is the only truly storage-backed RW partition. In fact, I can not unmount anything :)

Keeping fighting,
--
Vladimir
Re: Pending patches
On Thursday 13 November 2008 15:14, Vladimir Dronnikov wrote:
> > Do killall5 -KILL first. If it persists...
> It does :( init locks /dev (console), /proc (self/exe).

I thought you do umount -r ("Try to remount devices as read-only if mount is busy") thing. From reboot POV, RO filesystems are as safe as umounted ones. Does it work?

> /sbin/init is a symlink to /usr/bin/init, /usr is unionfs of some squashfs and /var/changes, which is the only truly storage-backed RW partition. In fact, I can not unmount anything :)

--
vda
Re: Pending patches
On Thursday 13 November 2008 23:23, Vladimir Dronnikov wrote:
> > I thought you do umount -r ("Try to remount devices as read-only if mount is busy") thing. From reboot POV, RO filesystems are as safe as umounted ones. Does it work?
> Seems it does. Though I get a storm of runsv complains about they can not write to RO FSs. I can live with this.

You should kill them first. killall5 -TERM, then wait a second or so to let them die. Optionally follow with killall5 -KILL and sleep 1 for especially stubborn ones. Then umount -r.

> I'd like then to suggest you to remove -s option from runsvdir. It adds unnecessary bloat, not compatible and can be easily replaced with a special runsv service (say, reboot) which is initially down and wraps in its ./run script the functionality of reboot.sh.

Here we go again. It's you who asked to make it possible to use runsvdir as init, not my idea. I am totally happy with my init being a shell script - no additional hacking required.

runsvdir needs special hacks in order to be used as init because you want to make it stop and not restart runsv's. It's true that reboot actions (killall5 + umount) are perfectly doable by any root process (and I told so 999 times). But you do need a runsvdir -s script, at least a trivial one:

#!/bin/sh
while true; do sleep 999; done

just to make it stop respawning the runsv's you are killing.

(oops, problem here. second killall5 (killall5 -KILL) will *kill the script*! what to do?...)

If you would run runsvdir NOT as init, but as an ordinary process, then you do not need this script, because killall5 -TERM will kill runsvdir too - exactly what you need.
--
vda
Re: Pending patches
On Tuesday 11 November 2008 12:28, Vladimir Dronnikov wrote:
> A couple of notes/questions:
>
> * An interesting feature of the hook script could be a confirmation dialog (or read -t 5 AREYOUSURE) for reboot/shutdown.

Why? If admin said to reboot, we can presume he knows what he is doing.

> * Is there a way to call kernel's emergency_{sync,remount}() from userspace? To get rid of userspace shutter-down.

Even if it is there, that is a quite dirty method.

> * Do you plan to modify vanilla BB init to keep reboot/shutdown procedure consistent with your opinion on reboot should be a shell script?

No. I plan to make (or rather, to keep) both methods in working order. I won't dictate people how to run their machines.
--
vda
Re: Pending patches
On Wed, Nov 12, 2008 at 11:58:26PM +0100, Denys Vlasenko wrote:
> On Tuesday 11 November 2008 12:28, Vladimir Dronnikov wrote:
>> * Is there a way to call kernel's emergency_{sync,remount}() from userspace? To get rid of userspace shutter-down.
> Even if it is there, that is a quite dirty method.

you could possibly use sys request to sync, but I do not remember a remount (and i don't think that it would make any sense at all to add one). But this is not acceptable to do, generally, as Denys said.
Re: Pending patches
>> (or read -t 5 AREYOUSURE) for reboot/shutdown.
> If admin said to reboot, we can presume he knows what he is doing.

Nobody's perfect...

>> * Is there a way to call kernel's emergency_{sync,remount}() from userspace? To get rid of userspace shutter-down.
> Even if it is there, that is a quite dirty method.

I still see "umount -a" is barking "Device is in use" at every filesystem I'm trying to umount from reboot.sh.

> I won't dictate people how to run their machines.

Me too. Just a thought on consistency.

Regards,
--
Vladimir
Re: Pending patches
>>> * Is there a way to call kernel's emergency_{sync,remount}() from userspace? To get rid of userspace shutter-down.
>> Even if it is there, that is a quite dirty method.

Why? Kernel knows more about entire filesystem list than a particular process, I presume.

> you could possibly use sys request to sync, but I do not remember a remount

echo u > /proc/sysrq-trigger

> (and i don't think that it would make any sense at all to add one). But this is not acceptable to do, generally, as Denys said.

I admit the point seems too narrow for general use. I'm trying to get your advice on _my_ particular situation, people.

Regards,
--
Vladimir
Re: Pending patches
On Tuesday 11 November 2008 07:04, Vladimir Dronnikov wrote:
> > break breaks from switch(), not from for().
> Right. I just rarely saw break in switch default branch. If we'd revert the condition...

Is it better this way?

		if (bb_got_signal == SIGHUP) {
			for (i = 0; i < svnum; i++)
				if (sv[i].pid)
					kill(sv[i].pid, SIGTERM);
		}
		/* SIGHUP or SIGTERM (or SIGUSRn if we are init) */
		/* Exit unless we are init */
		if (getpid() != 1)
			return (SIGHUP == bb_got_signal) ? 111 : EXIT_SUCCESS;

		/* init continues to monitor services forever */
		bb_got_signal = 0;
	} /* for (;;) */
}
--
vda
Re: Pending patches
On Mon, Nov 10, 2008 at 04:05:40PM +0300, Vladimir Dronnikov wrote:
>> I think it would panic rather than oops.
> Of course. Right you are. So what does kernel in panic?

see kernel/panic.c like
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob_plain;f=kernel/panic.c;hb=HEAD
Re: Pending patches
Denys, do we mean "break" and not "continue" right before

	return (SIGHUP == bb_got_signal) ? 111 : EXIT_SUCCESS;

at the end of runsvdir.c?
--
Vladimir
Re: Pending patches
> I think it would panic rather than oops.

Of course. Right you are. So what does kernel in panic?
--
Vladimir
Re: Pending patches
> reboot -f
> echo Kernel has been instructed to reboot
> while true; do sleep ; done
>
> because otherwise, if script exits, runsvdir will loop back and restart services, and this is definitely what you dont want to happen!

Wonder what happens when init dies and kernel oopses? Does kernel sync/umount filesystems? If so I'd let it be. Wonder also does kernel sync/umount filesystems when one issues reboot(whatever)?

I still view some awkwardness in the procedure. You treat init as perpetuum mobile of which we have to break a detail to get it stopped. I mostly tend to treat it as a process that just exits when your system have accomplished its task. Where am I wrong, Denys?

TIA,
--
Vladimir
Re: Pending patches
On Mon, Nov 10, 2008 at 03:11:16PM +0300, Vladimir Dronnikov wrote:
>> reboot -f
>> echo Kernel has been instructed to reboot
>> while true; do sleep ; done
>>
>> because otherwise, if script exits, runsvdir will loop back and restart services, and this is definitely what you dont want to happen!
>
> Wonder what happens when init dies and kernel oopses? Does kernel

I think it would panic rather than oops. Just boot with panic=15
Re: Pending patches
On Monday 10 November 2008 06:11:16 Vladimir Dronnikov wrote:
> > reboot -f
> > echo Kernel has been instructed to reboot
> > while true; do sleep ; done
> >
> > because otherwise, if script exits, runsvdir will loop back and restart services, and this is definitely what you dont want to happen!
>
> Wonder what happens when init dies and kernel oopses? Does kernel sync/umount filesystems?

Nope, it's a normal oops. The kernel's theory on oopses is that the system is no longer in a good state, so it should stop writing to the disk _now_ because it may be writing garbage.

(However, this doesn't apply to network access, so it continues to route packets. A common trick years ago was to set up your routing tables and then have PID 1 exit so the kernel panicked, because the panicked kernel would continue to route packets with _no_userspace_running_. Darn hard to hack a system like that.)

> If so I'd let it be. Wonder also does kernel sync/umount filesystems when one issues reboot(whatever)?

Nope, that's up to userspace to do.

> I still view some awkwardness in the procedure. You treat init as perpetuum mobile of which we have to break a detail to get it stopped. I mostly tend to treat it as a process that just exits when your system have accomplished its task. Where am I wrong, Denys?

You're wrong in that the kernel guys have defined init as a special process that panics the kernel if it exits. This has been true for Linux for 18 years, and was true of other Unixes before that.

PID 1 is special, in lots of little ways. It's the default reaper of zombie processes whose parents have exited. It's the default target of signals for processes that don't otherwise have a parent (such as anything that's called daemonize()). Various system things that must belong to a userspace process get attached to init because it's guaranteed to be there. (Back before we had so many kernel threads, interrupt routines that needed a process context would borrow PID 1's. These days, those new kernel threads have init as their parent.)

Internally, the kernel has a variable init_task statically referencing PID 1's context, which is used by things like daemonize() (the kernel internal version in kernel/exit.c; I'm actually indirectly responsible for Andrew Morton creating that sucker long ago: http://lkml.indiana.edu/hypermail/linux/kernel/0105.0/0045.html). There are also a couple of different task_reparent_to_init() functions in the security subdirectory.

On a design level, the purpose of the kernel is to run userspace. The kernel launches one task by hand, and then that task is in charge of the system from then on. If that task ever exits, the kernel doesn't know what it's supposed to be doing anymore, at a design level. It was decreed back in the 1970's that PID 1 _should_not_exit_, and unix has gone along with that ever since.

As for standards on this, see SUSv3 section 4.12, reserving PID 1 for the system:

http://www.opengroup.org/onlinepubs/95399/

See the definition of _exit(), which says:

http://www.opengroup.org/onlinepubs/95399/functions/_exit.html

  The parent process ID of all of the calling process' existing child processes and zombie processes shall be set to the process ID of an implementation-defined system process. That is, these processes shall be inherited by a special system process.

(How you're supposed to do that if the special system process is the one exiting is left as an exercise for the reader.)

The definition of kill() says:

http://www.opengroup.org/onlinepubs/95399/functions/kill.html

  The unspecified processes to which a signal cannot be sent may include the scheduler or init.

Keep in mind that this standard tried to be so vague that it could allow Windows NT and IBM's System 360 to be specified as Posix compliant...

Rob
Re: Pending patches
On Monday 10 November 2008 06:59:19 Bernhard Reutner-Fischer wrote:
> > Wonder what happens when init dies and kernel oopses? Does kernel
>
> I think it would panic rather than oops. Just boot with panic=15

That argument is number of seconds to wait before rebooting. If you want panic to equal reboot, panic=1 gives you a shorter delay. (I feed this argument to the kernel command line of qemu a lot. :)

Rob
Re: Pending patches
On Monday 10 November 2008 15:24, Vladimir Dronnikov wrote:
> Denys, do we mean "break" and not "continue" right before
> return (SIGHUP == bb_got_signal) ? 111 : EXIT_SUCCESS;
> at the end of runsvdir.c?

Yes:

		switch (bb_got_signal) {
		...
		default: /* SIGTERM (or SIGUSRn if we are init) */
			/* Exit unless we are init */
			if (getpid() == 1)
				break;
			return (SIGHUP == bb_got_signal) ? 111 : EXIT_SUCCESS;
		}

		bb_got_signal = 0;
	} /* for (;;) */

break breaks from switch(), not from for(). We are still inside for(), we reset bb_got_signal and go back to watching and restarting services.
--
vda
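The point about break leaving only the switch() can be checked with a small stand-alone model. This is an illustration only: passes_before_exit() and its is_init parameter are invented for the example and merely mimic the shape of the runsvdir loop quoted above.

```c
/* Illustrative model of the loop shape discussed above (not runsvdir code).
 * `is_init` stands in for getpid() == 1. Returns how many passes of the
 * loop were started before the function returned. */
int passes_before_exit(int is_init, int max_passes)
{
	int passes = 0;

	while (passes < max_passes) {
		passes++;
		switch (passes) {        /* stands in for switch (bb_got_signal) */
		default:
			if (is_init)
				break;           /* leaves the switch, not the loop */
			return passes;       /* non-init: leave the function at once */
		}
		/* Reached only via break: state would be reset here
		 * and the loop goes around again. */
	}
	return passes;
}
```

With is_init set the loop keeps running until max_passes is reached; without it the function returns during the first pass.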
Re: Pending patches
On Monday 10 November 2008 13:11, Vladimir Dronnikov wrote:
> > reboot -f
> > echo Kernel has been instructed to reboot
> > while true; do sleep ; done
> >
> > because otherwise, if script exits, runsvdir will loop back and restart services, and this is definitely what you dont want to happen!
>
> Wonder what happens when init dies and kernel oopses? Does kernel sync/umount filesystems? If so I'd let it be. Wonder also does kernel sync/umount filesystems when one issues reboot(whatever)?
>
> I still view some awkwardness in the procedure. You treat init as perpetuum mobile of which we have to break a detail to get it stopped. I mostly tend to treat it as a process that just exits when your system have accomplished its task. Where am I wrong, Denys?

Boot with init=/bin/sh, and try exiting from that shell. You will see.
--
vda
Re: pending patches
> Can you remove from there patches which are already applied?

Done.
--
Vladimir
Re: Pending patches
> break breaks from switch(), not from for().

Right. I just rarely saw break in switch default branch. If we'd revert the condition...
--
Vladimir
Re: Pending patches
On Saturday 08 November 2008 00:20, Vladimir Dronnikov wrote:
> > If you want to use runsvdir as init, you run it as
> > runsvdir [-P] -s SCRIPT /dir
>
> Works fine. Though to reboot I have to use something like:
> ---
> #!/bin/sh
> umount -a
> exec reboot -f
> ---
> as -s parameter script. Looks like a point of no return either.

You are probably better off with:

reboot -f
echo Kernel has been instructed to reboot
while true; do sleep ; done

because otherwise, if the script exits, runsvdir will loop back and restart services, and this is definitely what you don't want to happen!

Yes, in theory, reboot -f never returns. In practice, I once discovered that the Linux kernel ignores the RB_HALT command (RB_AUTOBOOT and RB_POWERDOWN worked) _and_ returns_ from the reboot syscall. I wrote a bug report about it and mailed it to lkml, still using the post-halt machine for this. :)

This was years ago, I hope it is fixed now; yet, I still feel paranoid about "reboot -f does not return" :)
--
vda
Re: pending patches
On Friday 07 November 2008 08:04, Vladimir Dronnikov wrote:
> > > http://drvv.ru/busybox/sv.patch
> > I don't like this, but ok. Applied.
>
> I'll explain why I need it. I use totally volatile /var. My services are in /etc, under which I place all scripts. We have config option for udhcpc helper script location -- we now have configurable root of services. Thanks!

You already told me this. Looks like you forgot my answer: "sv path/with/slashes ACTION" will work for you if you keep service not in /var/service.
--
vda
Re: pending patches
On Friday 07 November 2008 08:04, Vladimir Dronnikov wrote:
> > > http://drvv.ru/busybox/acpid.patch
> >
> > + argv += optind;
> > + // daemonize unless -d given
> > + if (!(option_mask32 & 1)) { // ! -d
> > + forkexit_or_rexec(argv);
> >
> > (1) argv is wrong here, you did argv += optind too early.
>
> Right.
>
> > (2) forkexit_or_rexec() in general is an internal function. Use bb_daemonize_or_rexec() instead, it does this and also all of the below...
> >
> > + setsid();
> > + // close 0, 1?
> > + // ?
> > + // reopen stderr
> > + freopen(opt_logfile, "w", stderr);
> > + }
>
> I don't want /dev/null to be my stderr. bb_daemonize_or_rexec() would do unnecessary things here.

Yes, extra close and dup2. Two microseconds wasted per each acpid startup. Horror. ;)

> > but more importantly, why daemonizing at all? This works:
> >
> > setsid acpid ... </dev/null >/dev/null 2>somewhere
> >
> > We daemonize in inetd et.al. only because that is their historical behavior... is daemonization something you need for compat reasons?
>
> Personally, I use acpid as runit service. Daemonizing was requested as it is default behaviour of big brother.

Then it's ok.

> > + while (!bb_got_signal && safe_poll(pfd, nfd, -1) > 0) {
> >
> > safe_poll() will loop on EINTR. Thus SIGINT will not interrupt it, it may sit like this forever.
>
> It is intended to be SIGTERMed to exit. As other services do.

Yes, in the code above SIGTERM will not kill it either. It will exit only after poll() sees some input, and only THEN will the program notice that SIGTERM was received.
--
vda
Re: pending patches
> > > + freopen(opt_logfile, "w", stderr);
> > unnecessary things here.
> Yes, extra close and dup2. Two microseconds wasted per each acpid startup. Horror. ;)

:)) No. It closes fd 2 (or dup's /dev/null to fd 2). Then, if freopen() fails, we do not see ANY output... That is my concern...

> > Personally, I use acpid as runit service. Daemonizing was requested as it is default behaviour of big brother.
> Then it's ok.

I still can't help feeling we should not daemonize/use-custom-log-file at all. Where is KISS?! Vote pro or contra, please...

> Yes, in the code above SIGTERM will not kill it either. It will exit only after poll() will see some input, and THEN program will notice what SIGTERM was received.

Weird. I start and stop it by means of sv, and it works...
--
Vladimir
Re: pending patches
> I still can't help feeling we should not daemonize/use-custom-log-file at all. Where is KISS?! Vote pro or contra, please...

> ...SIGTERM will not kill it either

Updated all issues you pointed out: http://drvv.ru/busybox/acpid.patch
Also fixed typo in http://drvv.ru/busybox/httpd.patch
Moved functionality from http://drvv.ru/busybox/rdev.patch to http://drvv.ru/busybox/mountpoint.patch

Please, take a look and consider applying.

TIA,
--
Vladimir
Re: Pending patches
> If you want to use runsvdir as init, you run it as
> runsvdir [-P] -s SCRIPT /dir

Works fine. Though to reboot I have to use something like:
---
#!/bin/sh
umount -a
exec reboot -f
---
as -s parameter script. Looks like a point of no return either.

Continuing testing,
--
Vladimir
Re: Pending patches
On Friday 07 November 2008 17:20:19 Vladimir Dronnikov wrote:
> > If you want to use runsvdir as init, you run it as
> > runsvdir [-P] -s SCRIPT /dir
>
> Works fine. Though to reboot I have to use something like:
> ---
> #!/bin/sh
> umount -a
> exec reboot -f
> ---
> as -s parameter script.

Random side note: you can't call reboot() from PID 1, because on older kernels the reboot system call caused the process to exit, and PID 1 exiting panics the kernel, so the kernel would panic before the shutdown actually had a chance to happen...

So yes, you have to fork() in order to actually call reboot(), for weird historical reasons.

Rob

P.S. Weird historical footnote #2: the magic numbers you feed to reboot() are 0xfee1dead for the first number, and the four options for the second number, when listed in hexadecimal, are the birthdays of Linus Torvalds and each of his three daughters. (You are telling Linus to shut down, or Linus 2.0, 3.0, or 4.0, respectively.) (No, I don't know if his daughters' machines are each set up to shut down using the appropriate code, but I wouldn't put it past him...)
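The footnote about the magic numbers is easy to verify by printing them in hexadecimal. The constants below repeat the decimal values that <linux/reboot.h> gives for LINUX_REBOOT_MAGIC1, LINUX_REBOOT_MAGIC2, LINUX_REBOOT_MAGIC2A, LINUX_REBOOT_MAGIC2B and LINUX_REBOOT_MAGIC2C; they are copied here under shortened names so the sketch builds anywhere, and magic_as_hex() is a made-up helper for this example.

```c
#include <stdio.h>

/* Values as defined in <linux/reboot.h>, repeated under local names. */
#define REBOOT_MAGIC1  0xfee1deadu
#define REBOOT_MAGIC2  672274793u
#define REBOOT_MAGIC2A 85072278u
#define REBOOT_MAGIC2B 369367448u
#define REBOOT_MAGIC2C 537993216u

/* Hypothetical helper: render a magic number as 8 hex digits, which for
 * the second-argument values reads as DDMMYYYY.
 * `out` must have room for at least 9 bytes. */
const char *magic_as_hex(unsigned int magic, char out[9])
{
	snprintf(out, 9, "%08x", magic);
	return out;
}
```

Formatting the four second-argument values this way yields the digit strings 28121969, 05121996, 16041998 and 20112000.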
Re: Pending patches
> Why then many people are coming here with "my reboot doesn't work" then?

BB reboot indeed doesn't work with runsvdir run as init. Either should I disable reboot applet (thus losing reboot -f capability) and write reboot script or should I stop using runsvdir? I want to get the most from BB thus both the ways are unacceptable.

To work the problem around I propose to teach runsvdir'ed init to call a hook script, say, /etc/reboot to perform system pre-reboot housekeeping. That way I still can use both BB tools and yield anyone's wishes to shutdown/reboot his/her system in a flexible way. What do you think, colleagues?

> If I write portable reboot (one which does not know what kind of init is on the system), what should I do? Use signals? Or talk to /dev/initctl? Or both?

BB reboot should be targeted primarily to BB init. Thus, signals. No need to borrow bad habits.

TIA,
--
Vladimir
Re: Pending patches
On Thursday 06 November 2008 18:37, Vladimir Dronnikov wrote:
> > Why then many people are coming here with "my reboot doesn't work" then?
>
> BB reboot indeed doesn't work with runsvdir run as init.

How do you propose to make it work? What should it do on the receipt of reboot signal?

> Either should I disable reboot applet (thus losing reboot -f capability) and write reboot script or should I stop using runsvdir? I want to get the most from BB thus both the ways are unacceptable.
>
> To work the problem around I propose to teach runsvdir'ed init to call a hook script, say, /etc/reboot to perform system pre-reboot housekeeping.

I have an idea.

As it stands now runsvdir has SIGHUP and SIGTERM handlers, and we just added SIGUSR1 and SIGUSR2 if runsvdir is run under PID 1.

Getting it to handle reboot amounts simply to running a script on any of these signals, wait for it to finish, and then do what runsvdir does now (i.e. exit) if we are not PID 1, else continue running. as init exiting is a bad idea.

If you want to use runsvdir as init, you run it as

runsvdir [-P] -s SCRIPT /dir

(perhaps exec'ing this command at the end of some sort of startup script).

See attached patch. Testing...

# echo $$; exec ./busybox runsvdir ./z -s /bin/echo
17479

Simulating reboot request (in another xterm):

# kill -TERM 17479

In first xterm, script /bin/echo runs and prints 15 (signo of SIGTERM), and then runsvdir exits.
--
vda

diff -d -urpN busybox.5/runit/runsvdir.c busybox.6/runit/runsvdir.c
--- busybox.5/runit/runsvdir.c	2008-10-31 03:36:16.0 +0100
+++ busybox.6/runit/runsvdir.c	2008-11-06 23:18:18.0 +0100
@@ -107,7 +107,7 @@ static NOINLINE pid_t runsv(const char *
 	}
 	if (pid == 0) {
 		/* child */
-		if (option_mask32) /* -P option? */
+		if (option_mask32 & 1) /* -P option? */
 			setsid();
 /* man execv:
  * Signals set to be caught by the calling process image
@@ -217,17 +217,20 @@ int runsvdir_main(int argc UNUSED_PARAM,
 	time_t last_mtime = 0;
 	int wstat;
 	int curdir;
-	int pid;
+	pid_t pid;
 	unsigned deadline;
 	unsigned now;
 	unsigned stampcheck;
 	int i;
 	int need_rescan = 1;
+	char *opt_s_argv[3];
 
 	INIT_G();
 
 	opt_complementary = "-1";
-	getopt32(argv, "P");
+	opt_s_argv[0] = NULL;
+	opt_s_argv[2] = NULL;
+	getopt32(argv, "Ps:", &opt_s_argv[0]);
 	argv += optind;
 
 	bb_signals(0
@@ -335,7 +338,6 @@ int runsvdir_main(int argc UNUSED_PARAM,
 		pfd[0].revents = 0;
 #endif
 		deadline = (need_rescan ? 1 : 5);
- do_sleep:
 		sig_block(SIGCHLD);
 #if ENABLE_FEATURE_RUNSVDIR_LOG
 		if (rplog)
@@ -357,27 +359,37 @@ int runsvdir_main(int argc UNUSED_PARAM,
 			}
 		}
 #endif
+		if (!bb_got_signal)
+			continue;
+
+		/* -s SCRIPT: useful if we are init.
+		 * In this case typically script never returns,
+		 * it halts/powers off/reboots the system. */
+		if (opt_s_argv[0]) {
+			/* Single parameter: signal# */
+			opt_s_argv[1] = utoa(bb_got_signal);
+			pid = spawn(opt_s_argv);
+			if (pid > 0) {
+				/* Remembering to wait for _any_ children,
+				 * not just pid */
+				while (wait(NULL) != pid)
+					continue;
+			}
+		}
+
 		switch (bb_got_signal) {
-		case 0: /* we are not signaled, business as usual */
-			break;
 		case SIGHUP:
 			for (i = 0; i < svnum; i++)
 				if (sv[i].pid)
 					kill(sv[i].pid, SIGTERM);
-			/* fall through */
-		case SIGTERM:
-			/* exit, unless we are init */
-			if (getpid() != 1)
-				goto ret;
-		default:
-			/* so we are init. do not exit,
-			 * and pause respawning - we may be rebooting
-			 * (but SIGHUP is not a reboot, make short pause) */
-			deadline = (SIGHUP == bb_got_signal) ? 5 : 60;
-			bb_got_signal = 0;
-			goto do_sleep;
+			/* Fall through */
+		default: /* SIGTERM (or SIGUSRn if we are init) */
+			/* Exit unless we are init */
+			if (getpid() == 1)
+				break;
+			return (SIGHUP == bb_got_signal) ? 111 : EXIT_SUCCESS;
 		}
-	}
- ret:
-	return (SIGHUP == bb_got_signal) ? 111 : EXIT_SUCCESS;
+
+		bb_got_signal = 0;
+	} /* for (;;) */
 }
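For readers without the BusyBox tree at hand, the "run SCRIPT with the signal number as its single parameter and wait for it" step of the patch above can be approximated with plain POSIX calls. This is a sketch under stated assumptions, not the patch itself: run_signal_hook() is a made-up name, fork() plus execvp() stand in for BusyBox's spawn(), snprintf() stands in for utoa(), and unlike the patch it gives up if wait() reports an error.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the patch's spawn() + wait() step: run `script`
 * with the signal number as its only argument and wait for it to finish.
 * Returns the script's exit status, or -1 on failure. */
int run_signal_hook(const char *script, int signo)
{
	char signo_str[16];
	char *argv[3];
	pid_t pid;
	int wstat = 0;

	snprintf(signo_str, sizeof(signo_str), "%d", signo);
	argv[0] = (char *)script;
	argv[1] = signo_str;
	argv[2] = NULL;

	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		execvp(script, argv);
		_exit(127);              /* exec failed */
	}
	/* Reap whatever exits until our own child shows up; as in the patch,
	 * other children are collected along the way. */
	for (;;) {
		pid_t done = wait(&wstat);
		if (done == pid)
			break;
		if (done < 0)
			return -1;
	}
	return WIFEXITED(wstat) ? WEXITSTATUS(wstat) : -1;
}
```

Calling run_signal_hook("/bin/echo", 15) mirrors the test run shown in the message: the hook prints 15 and the caller continues once it has exited.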
Re: pending patches
On Tuesday 04 November 2008 11:49, Vladimir Dronnikov wrote:
> Can you provide a brief comment on the rest of pending patches, Denys?

> http://drvv.ru/busybox/runsvdir1.patch

Applied

> http://drvv.ru/busybox/sv.patch

I don't like this, but ok. Applied.

> http://drvv.ru/busybox/runsvdir.patch

Please try current svn, runsvdir -s SCRIPT method. If you are ok with it, I'll update help text and make it official.

> http://drvv.ru/busybox/httpd.patch

Bug - cur->after_colon is leaked:

- cur->after_colon = strchr(cur->before_colon, ':');
- *cur->after_colon++ = '\0';
+ cur->after_colon = xstrdup(token[1]);

> http://drvv.ru/busybox/rdev.patch

This would be an incompatible change to rdev applet. I don't want to do this...

> http://drvv.ru/busybox/acpid.patch

+ argv += optind;
+ // daemonize unless -d given
+ if (!(option_mask32 & 1)) { // ! -d
+ forkexit_or_rexec(argv);

(1) argv is wrong here, you did argv += optind too early.
(2) forkexit_or_rexec() in general is an internal function. Use bb_daemonize_or_rexec() instead, it does this and also all of the below...

+ setsid();
+ // close 0, 1?
+ // ?
+ // reopen stderr
+ freopen(opt_logfile, "w", stderr);
+ }

but more importantly, why daemonizing at all? This works:

setsid acpid ... </dev/null >/dev/null 2>somewhere

We daemonize in inetd et.al. only because that is their historical behavior... is daemonization something you need for compat reasons?

On the same grounds -l LOGFILE is not needed, 2>LOGFILE does the job.

+ for (i = 0; i < argc; i++) {
+ pfd[nfd].fd = open_or_warn(argv[i], O_RDONLY);
+ pfd[nfd].events = POLLIN;
+ nfd++;
+ }

if open fails, you probably do not want to use fd == -1 in poll()?

+ while (!bb_got_signal && safe_poll(pfd, nfd, -1) > 0) {

safe_poll() will loop on EINTR. Thus SIGINT will not interrupt it, it may sit like this forever. You need simple poll().

+config ACPID
+ bool "acpid"
+ default n
+ select RUN_PARTS

= Using run-parts does not mean you must force user to have run-parts applet *in the same binary*.

> http://drvv.ru/busybox/mailutils.patch

Applied, since sendmail-related applets are yours. :)
--
vda
Re: pending patches
> > http://drvv.ru/busybox/sv.patch
> I don't like this, but ok. Applied.

I'll explain why I need it. I use totally volatile /var. My services are in /etc, under which I place all scripts. We have config option for udhcpc helper script location -- we now have configurable root of services. Thanks!

> > http://drvv.ru/busybox/httpd.patch
> Bug - cur->after_colon is leaked:
>
> - cur->after_colon = strchr(cur->before_colon, ':');
> - *cur->after_colon++ = '\0';
> + cur->after_colon = xstrdup(token[1]);

Will revisit.

> > http://drvv.ru/busybox/rdev.patch
> This would be an incompatible change to rdev applet. I don't want to do this...

Could we then add -n option to mountpoint which would print the name of block device a directory is mounted on? Bernhard asked the mountpoint' author -- no reply AFAIK...

> > http://drvv.ru/busybox/acpid.patch
>
> + argv += optind;
> + // daemonize unless -d given
> + if (!(option_mask32 & 1)) { // ! -d
> + forkexit_or_rexec(argv);
>
> (1) argv is wrong here, you did argv += optind too early.

Right.

> (2) forkexit_or_rexec() in general is an internal function. Use bb_daemonize_or_rexec() instead, it does this and also all of the below...
>
> + setsid();
> + // close 0, 1?
> + // ?
> + // reopen stderr
> + freopen(opt_logfile, "w", stderr);
> + }

I don't want /dev/null to be my stderr. bb_daemonize_or_rexec() would do unnecessary things here.

> but more importantly, why daemonizing at all? This works:
>
> setsid acpid ... </dev/null >/dev/null 2>somewhere
>
> We daemonize in inetd et.al. only because that is their historical behavior... is daemonization something you need for compat reasons?

Personally, I use acpid as runit service. Daemonizing was requested as it is default behaviour of big brother. I'd throw away intrinsic daemonizing if you support me.

> + for (i = 0; i < argc; i++) {
> + pfd[nfd].fd = open_or_warn(argv[i], O_RDONLY);
> + pfd[nfd].events = POLLIN;
> + nfd++;
> + }
>
> if open fails, you probably do not want to use fd == -1 in poll()?

Surely no :)

> + while (!bb_got_signal && safe_poll(pfd, nfd, -1) > 0) {
>
> safe_poll() will loop on EINTR. Thus SIGINT will not interrupt it, it may sit like this forever.

It is intended to be SIGTERMed to exit. As other services do.

> +config ACPID
> + bool "acpid"
> + default n
> + select RUN_PARTS
>
> = Using run-parts does not mean you must force user to have run-parts applet *in the same binary*.

Right. It is just a hint. Sorry, I'm trying to BBify everyone and everything :)

> > http://drvv.ru/busybox/mailutils.patch
> Applied, since sendmail-related applets are yours. :)

Great! Thanks! Let us test it.
--
Vladimir
Re: Pending patches
On Friday 31 October 2008 20:15:08 Denys Vlasenko wrote:
> On Saturday 01 November 2008 01:58, Rob Landley wrote:
> > A process has a file open in it. The filesystem is pinned until the process closes that file, unless you want to force the unmount (so the file starts getting a -ESOMETHINGOROTHER).
>
> This is not a problem. killall5 -KILL closes a lot of open files.

If any of the ancestral processes that execed your shutdown thingy didn't close a filehandle (and didn't set it close on exec), your shutdown thing could itself have a file open and not know about it. PID 1 has no ancestors that have execed it, it came from the kernel.

Is there not a way to close all filehandles you inherited from your parent? Grovel around in /proc to see what they are, assuming /proc is mounted?

But it's more complicated than that:

http://lwn.net/Articles/236843/
http://lwn.net/Articles/237722/

When I say that signaling PID 1 so it can quiesce and shutdown the system for you is the easy way to do it right, I really am serious.

If you want to continue on after this one, we could start talking about how any use of the new container infrastructure prevents what you want to do from working properly. (Go to http://lwn.net/Kernel/Index/ and search for "containers", there's a dozen or so articles on its development.)

Rob
Re: Pending patches
On Saturday 01 November 2008 07:40, Rob Landley wrote:
> When I say that signaling PID 1 so it can quiesce and shutdown the system for you is the easy way to do it right, I really am serious.

Why then are so many people coming here with "my reboot doesn't work"?

init authors (and I am speaking not only about bbox init, but the sysV one too) didn't even manage to come to a coherent solution HOW to signal init! IIRC SysV init has a fifo (!) which you can talk into. How stupid - now suddenly you require a place in fs where that fifo might be created. bbox init uses signals.

If I write a portable reboot (one which does not know what kind of init is on the system), what should I do? Use signals? Or talk to /dev/initctl? Or both?

--
vda
Re: Pending patches
On Saturday 01 November 2008 06:41:04 Denys Vlasenko wrote:
> On Saturday 01 November 2008 07:40, Rob Landley wrote:
> > When I say that signaling PID 1 so it can quiesce and shutdown the system for you is the easy way to do it right, I really am serious.
>
> Why then are so many people coming here with "my reboot doesn't work"?

A) because people boot with init=/bin/sh even though they built init into busybox, and then when they try to use shutdown it sends a signal to PID 1 that gets ignored. (One of the special things about PID 1 is that its default handler for all signals is SIG_IGN, including kill -9.)

B) because it used to be really buggy, circa 1.1.x and earlier.

C) because the method of signaling init isn't quite standardized and they mix and match shutdown and init commands between busybox and non-busybox (the two have to agree on whether they're signalling via kill or whether they're signalling via /dev/initctl, or something else entirely.) This is sort of a special case of (A), really.

Basically our shutdown should be able to figure out that it didn't successfully signal init and at least give an error message. Unfortunately, there's no inherent response to signals back to the sending process. (This is one of the reasons /dev/initctl was invented.)

> init authors (and I am speaking not only about bbox init, but the sysV one too) didn't even manage to come to a coherent solution HOW to signal init! IIRC SysV init has a fifo (!) which you can talk into. How stupid - now suddenly you require a place in fs where that fifo might be created. bbox init uses signals.

And if you booted with init=/bin/sh and it hasn't registered a handler for that signal, then it gets silently ignored and your shutdown silently fails and you get an email asking why. At least with the fifo, you can see it's not _there_ and maybe tell the user (can't signal init, try reboot -f).

> If I write a portable reboot (one which does not know what kind of init is on the system), what should I do? Use signals? Or talk to /dev/initctl? Or both?

Generally you fall back from /dev/initctl to sending the signal, but warn when doing it that the user may need to use shutdown -f. (Or else wait a while and then do the force shutdown yourself, on the theory that init will kill you before then if it's working.)

Generally shutdown scripts do a killall -TERM, letting all the daemons know to save state and exit, wait a few seconds, then do a killall -9, and then quiesce the rest of the system. How this interacts with network mounts is a problem for shutdown script writers... How long to wait is, of course, one of those big imponderables...

> -- vda

Rob
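[Archive note] The fallback order described above (try the fifo, otherwise signal and warn) might look roughly like this in shell. It is a sketch under assumptions: init_channel is a hypothetical helper, it only decides which channel to use, and it does not write the binary request that a SysV init expects on /dev/initctl.

```shell
#!/bin/sh
# Hypothetical helper: pick the channel for asking init to shut down.
# Prints "fifo" if a FIFO exists at the given path (default /dev/initctl),
# otherwise prints "signal" and warns on stderr that the signal may be
# silently ignored, e.g. when booted with init=/bin/sh.
init_channel() {
    fifo=${1:-/dev/initctl}
    if [ -p "$fifo" ]; then
        echo fifo
    else
        echo "warning: $fifo is not there, signalling PID 1 instead;" \
             "if nothing happens, try reboot -f" >&2
        echo signal
    fi
}
```

The warning is the whole point of the fallback: a signal gives no answer back to the sender, so the user has to be told what to try next.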
Re: Pending patches
On Friday 31 October 2008 03:57, Rob Landley wrote:
> Ok, when you said free loop devices, I thought you meant losetup -d. You're talking about unmounting loop devices, which is a standard unmounting problem.

Ok, an example where I don't even mount a loopdevice, but still can't remount my root RO as long as the loopdevice exists:

bash-3.2# mount -o remount,rw /
bash-3.2# dd if=/dev/zero bs=1M count=10 >/z
10+0 records in
10+0 records out
bash-3.2# mke2fs /z
mke2fs 1.34 (25-Jul-2003)
/z is not a block special device.
Proceed anyway? (y,n) y
Filesystem label=
OS type: Linux
Block size=1024 (log=0)
Fragment size=1024 (log=0)
2560 inodes, 10240 blocks
...
bash-3.2# losetup /dev/loop0 /z
bash-3.2# mount -o remount,ro /
mount: mounting /dev/root on / failed: Device or resource busy

> > Looks like robust reboot procedure should try to free all loop devices in order to avoid this. Should we patch init here?
> >
> >	/* No inittab file -- set up some default behavior */
> >	if (parser == NULL) {
> >		/* Reboot on Ctrl-Alt-Del */
> >		new_init_action(CTRLALTDEL, "reboot", "");
> >		/* Umount all filesystems on halt/reboot */
> >		new_init_action(SHUTDOWN, "umount -a -r", "");
> >		/* Swapoff on halt/reboot */
> >		if (ENABLE_SWAPONOFF)
> >			new_init_action(SHUTDOWN, "swapoff -a", "");
>
> There's a umount -a in there already, which was there in busybox 1.2. If the busybox umount -a isn't unmounting everything properly, that's a separate bug.

What should umount -r -a do if remounting RO fails? losetup -d every possible loopdevice and retry?

> Note that mounts are process context these days, so if you kill all processes except PID 1, it should naturally reference count those per-process mounts down to zero and free them, which just leaves the PID 1 mounts that umount -a should be able to get.

In the case we are using per-process mounts, yes, killing processes helps with umount.

> Note that if you try to do this from a process _other_ than PID 1, or without having killed all the other processes on the system, then you're not guaranteed to have umounted every active filesystem.

We do it not from process 1 but from its children anyway. Not much difference.

> See the bit about some resources being per-process, so you have to kill all the processes other than PID 1 in order to make sure the reference counts of those resources drop to zero and the system can be quiesced?

Yes, I have to kill all processes. killall5 does that nicely.

--
vda
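[Archive note] The "losetup -d every possible loopdevice and retry" idea from this message could be sketched as below. Nothing here is busybox code: detach_loops, remount_root_ro and the DRY_RUN switch are assumptions made for illustration, and DRY_RUN defaults to 1 so the sketch prints commands instead of touching the system.

```shell
#!/bin/sh
# Sketch only. DRY_RUN defaults to 1: commands are printed, not run.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then echo "$*"; else "$@"; fi
}

# detach_loops [DIR]: try "losetup -d" on every loopN node under DIR
# (default /dev). Devices with nothing attached just report an error.
detach_loops() {
    dir=${1:-/dev}
    for dev in "$dir"/loop[0-9]*; do
        [ -e "$dev" ] || continue        # the glob matched nothing
        run losetup -d "$dev" || true
    done
}

# remount_root_ro: one attempt; on failure free loop devices, retry once.
remount_root_ro() {
    run mount -o remount,ro / && return 0
    detach_loops
    run mount -o remount,ro /
}
```

Whether detaching every loop device is acceptable is exactly the policy question argued in the thread, so a real script would probably be site-specific.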
Re: Pending patches
On Friday 31 October 2008 04:29:54 Denys Vlasenko wrote:
> On Friday 31 October 2008 03:57, Rob Landley wrote:
> > Ok, when you said free loop devices, I thought you meant losetup -d. You're talking about unmounting loop devices, which is a standard unmounting problem.
>
> Ok, an example where I don't even mount a loopdevice, but still can't remount my root RO as long as the loopdevice exists:

Yup, losetup associating the file with the loop device node pins the device, just like any other process having an open file in the filesystem would.

> > There's a umount -a in there already, which was there in busybox 1.2. If the busybox umount -a isn't unmounting everything properly, that's a separate bug.
>
> What should umount -r -a do if remounting RO fails? losetup -d every possible loopdevice and retry?

This is really a kernel issue. (I do note the existence of umount -f, but that's probably not something you want to do on /.) Ask the kernel guys. Being unable to remount / read only seems like a bug to me, no matter what the reason.

> > Note that mounts are process context these days, so if you kill all processes except PID 1, it should naturally reference count those per-process mounts down to zero and free them, which just leaves the PID 1 mounts that umount -a should be able to get.
>
> In the case we are using per-process mounts, yes, killing processes helps with umount.

Just having _files_ open prevents umounting a mount without -f:

sudo /bin/bash
mkdir walrus
mount -t ramfs walrus walrus
touch walrus/walrus
sleep 999 < walrus/walrus &
umount walrus

A process has a file open in it. The filesystem is pinned until the process closes that file, unless you want to force the unmount (so the file starts getting a -ESOMETHINGOROTHER).

(Having a process with its current working directory in that filesystem also pins it.)

> > Note that if you try to do this from a process _other_ than PID 1, or without having killed all the other processes on the system, then you're not guaranteed to have umounted every active filesystem.
>
> We do it not from process 1 but from its children anyway. Not much difference.

So instead of signaling PID 1 to do the work, you signal PID 1 to stop respawning processes and have some other script do it. Hoping that script isn't in a chroot, hoping that script isn't running in a process context with per process mounts...

> > See the bit about some resources being per-process, so you have to kill all the processes other than PID 1 in order to make sure the reference counts of those resources drop to zero and the system can be quiesced?
>
> Yes, I have to kill all processes. killall5 does that nicely.

killall5 can't kill PID 1, and if your non-init process doesn't kill itself then you didn't kill all processes. PID 1 is the only process that can ever be the only process on the system. By definition. Therefore, unless PID 1 is quiescing the system you have two sets of per-process resources to free.

> -- vda

Rob
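[Archive note] To see which processes are pinning a filesystem in the way the walrus example shows, one can walk /proc. The sketch below is an assumption-laden illustration: pinning_pids is a made-up name, /proc must be mounted, and it matches on path prefix only, which is cruder than what the kernel actually reference counts (it misses mmaps and per-process mount namespaces, for instance).

```shell
#!/bin/sh
# pinning_pids DIR: print the PID of every process whose current
# directory or one of whose open descriptors resolves to a path at or
# below DIR. Processes we may not inspect are skipped silently.
pinning_pids() {
    mnt=$1
    for p in /proc/[0-9]*; do
        for link in "$p"/cwd "$p"/fd/*; do
            target=$(readlink "$link" 2>/dev/null) || continue
            case $target in
                "$mnt"|"$mnt"/*)
                    echo "${p#/proc/}"
                    break            # one hit per process is enough
                    ;;
            esac
        done
    done
}
```

Running pinning_pids on the mount point right before umount gives a list of what still has to be killed or closed.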
Re: Pending patches
On Friday 31 October 2008 11:56, Rob Landley wrote:
> Just having _files_ open prevents umounting a mount without -f:
>
> sudo /bin/bash
> mkdir walrus
> mount -t ramfs walrus walrus
> touch walrus/walrus
> sleep 999 < walrus/walrus &
> umount walrus
>
> A process has a file open in it. The filesystem is pinned until the process closes that file, unless you want to force the unmount (so the file starts getting a -ESOMETHINGOROTHER).

This is not a problem. killall5 -KILL closes a lot of open files. Loop devices are worse because they interfere even after one kills off all processes.

> > We do it not from process 1 but from its children anyway. Not much difference.
>
> So instead of signaling PID 1 to do the work, you signal PID 1 to stop respawning processes and have some other script do it.

Yes. Or I have something else respawning processes, not init. It's easier to deal with it if you can kill it.

--
vda
Re: Pending patches
On Friday 31 October 2008 18:33:29 Denys Vlasenko wrote:
> On Friday 31 October 2008 11:56, Rob Landley wrote:
> > Just having _files_ open prevents umounting a mount without -f:
> >
> > sudo /bin/bash
> > mkdir walrus
> > mount -t ramfs walrus walrus
> > touch walrus/walrus
> > sleep 999 < walrus/walrus &
> > umount walrus
> >
> > A process has a file open in it. The filesystem is pinned until the process closes that file, unless you want to force the unmount (so the file starts getting a -ESOMETHINGOROTHER).
>
> This is not a problem. killall5 -KILL closes a lot of open files.

If any of the ancestral processes that execed your shutdown thingy didn't close a filehandle (and didn't set it close on exec), your shutdown thing could itself have a file open and not know about it. PID 1 has no ancestors that have execed it, it came from the kernel.

There are lots and lots and lots of little ways the thing you're describing can fail. There honestly _is_ a reason that people have spent all these years having PID 1 perform the shutdown. Honest and truly.

> > > We do it not from process 1 but from its children anyway. Not much difference.
> >
> > So instead of signaling PID 1 to do the work, you signal PID 1 to stop respawning processes and have some other script do it.
>
> Yes. Or I have something else respawning processes, not init. It's easier to deal with it if you can kill it.

You're arguing from your conclusion. (If PID 1 isn't in control of the shutdown, then it's easier not to have PID 1 be in control of the shutdown.)

> -- vda

Rob
Re: Pending patches
On Saturday 01 November 2008 01:58, Rob Landley wrote:
> > > A process has a file open in it. The filesystem is pinned until the process closes that file, unless you want to force the unmount (so the file starts getting a -ESOMETHINGOROTHER).
> >
> > This is not a problem. killall5 -KILL closes a lot of open files.
>
> If any of the ancestral processes that execed your shutdown thingy didn't close a filehandle (and didn't set it close on exec), your shutdown thing could itself have a file open and not know about it. PID 1 has no ancestors that have execed it, it came from the kernel.

Is there not a way to close all filehandles you inherited from your parent?

--
vda
Re: Pending patches
On Thursday 30 October 2008 06:07, Rob Landley wrote: On Wednesday 29 October 2008 07:04:50 Denys Vlasenko wrote: On Wed, Oct 29, 2008 at 7:48 AM, Vladimir Dronnikov [EMAIL PROTECTED] wrote: + // umount -a + // ??? Do we really want to hard-reboot without remounting RO or unmounting filesystems? I must expand // ??? to be the real umounting. I got stuck here hesitating whether to just spawn(umount -a) or try to reimplement umount -a by hand... In this spot, don't you have a feeling that it's just _impossible_ to devise a perfect reboot code *here*? Simply because you can't know what user might want to do before reboot. Should you try to free loop devices? Nope, kernel does that. It does not. I have my root device mounted RO. My kernel is 2.6.26. Let's experiment. # mount -o remount,rw / # dd if=/dev/zero bs=1M count=10 /z 10+0 records in 10+0 records out # mke2fs /z mke2fs 1.34 (25-Jul-2003) z is not a block special device. Proceed anyway? (y,n) y Filesystem label= OS type: Linux ... # mkdir /tmp/z # mount -o loop /z /tmp/z # mount -o remount,ro / mount: mounting /dev/root on / failed: Device or resource busy Oops... Maybe if we will remount loop device RO? # mount -o remount,ro /dev/loop0 # mount -o remount,ro / mount: mounting /dev/root on / failed: Device or resource busy Still nothing. Looks like robust reboot procedure should try to free all loop devices in order to avoid this. Should we patch init here? /* No inittab file -- set up some default behavior */ if (parser == NULL) { /* Reboot on Ctrl-Alt-Del */ new_init_action(CTRLALTDEL, reboot, ); /* Umount all filesystems on halt/reboot */ new_init_action(SHUTDOWN, umount -a -r, ); /* Swapoff on halt/reboot */ if (ENABLE_SWAPONOFF) new_init_action(SHUTDOWN, swapoff -a, ); If not, unfreed loop devices might prevent clean unmounts. If so, that's a kernel bug. If it's a bug, it is not yet fixed. People have been somehow getting by with init for 30 years. This would seem to imply it can be done. 
I am not removing stuff from init. I am trying to avoid adding the same awkward method of rebooting to runsvdir. Allow system admin to have a convenient place to tailor his reboot sequence to his needs. Shutdown signals init, init has a script it can call to quiesce the system. Only one detour? Let's start ten processes and pass a few different signals between them. That will open way to more interesting bugs. If you don't notify init not to respawn things, then your script will kill things and init will re-launch them and the system will never quiesce. E-xac-tly. That's why I just taught runsvdir to become quescent on reboot. That's the minimum I need to do, and I intend to do only the minimum. -- vda ___ busybox mailing list busybox@busybox.net http://busybox.net/cgi-bin/mailman/listinfo/busybox
Re: Pending patches
On Wednesday 29 October 2008 13:18, Vladimir Dronnikov wrote: The patch I committed does the above. If runsvdir has PID 1, it will not exit on any signal, but will sleep for 60 seconds. No it is point of no return. OK again. If not (it was kill 1 thing, not reboot.sh), we resume monitoring of our service directory after one minute (being rather confused - what was that?). It is restart, a dirt-cheap-achieved feature allowing to cleanly restart all services without actual rebooting.

In all honesty, it was already there in the SIGHUP case, and I broke it for PID == 1 in the sense that now it sleeps for the whole minute (too long) after mass killing :) Will fix it now.

--
vda
Re: Pending patches
On Thursday 30 October 2008 21:24:53 Denys Vlasenko wrote: On Thursday 30 October 2008 06:07, Rob Landley wrote: On Wednesday 29 October 2008 07:04:50 Denys Vlasenko wrote: On Wed, Oct 29, 2008 at 7:48 AM, Vladimir Dronnikov [EMAIL PROTECTED] wrote: + // umount -a + // ??? Do we really want to hard-reboot without remounting RO or unmounting filesystems? I must expand // ??? to be the real umounting. I got stuck here hesitating whether to just spawn(umount -a) or try to reimplement umount -a by hand... In this spot, don't you have a feeling that it's just _impossible_ to devise a perfect reboot code *here*? Simply because you can't know what user might want to do before reboot. Should you try to free loop devices? Nope, kernel does that. It does not. I have my root device mounted RO. My kernel is 2.6.26. Let's experiment. Ok, when you said free loop devices, I thought you meant losetup -d. You're talking about unmounting loop devices, which is a standard unmounting problem. Looks like robust reboot procedure should try to free all loop devices in order to avoid this. Should we patch init here? /* No inittab file -- set up some default behavior */ if (parser == NULL) { /* Reboot on Ctrl-Alt-Del */ new_init_action(CTRLALTDEL, reboot, ); /* Umount all filesystems on halt/reboot */ new_init_action(SHUTDOWN, umount -a -r, ); /* Swapoff on halt/reboot */ if (ENABLE_SWAPONOFF) new_init_action(SHUTDOWN, swapoff -a, ); There's a umount -a in there already, which was there in busybox 1.2. If the busybox umount -a isn't unmounting everything properly, that's a separate bug. Note that mounts are process context these days, so if you kill all proccesses except PID 1, it should naturally reference count those per-process mounts down to zero and free them, which just leaves the PID 1 mounts that umount -a should be able to get. 
Note that if you try to do this from a process _other_ than PID 1, or without having killed all the other processes on the system, then you're not guaranteed to have umounted every active filesystem. People have been somehow getting by with init for 30 years. This would seem to imply it can be done. I am not removing stuff from init. I am trying to avoid adding the same awkward method of rebooting to runsvdir. Ah, no problem. I don't care about runsvdir, so I'll drop this thread. Allow system admin to have a convenient place to tailor his reboot sequence to his needs. Shutdown signals init, init has a script it can call to quiesce the system. Only one detour? Let's start ten processes and pass a few different signals between them. That will open way to more interesting bugs. See the bit about about some resources being per-processes, so you have to kill all the processes other than PID 1 in order to make sure the reference counts of those resources drop to zero and the system can be quiesced? (Also to make sure the other processes aren't opening new instances of things...) PID 1 is special. PID 1 has been special for years. If you ever need to make sure you're the only process running on the system, you must be PID 1 or you're doomed to failure. If you don't notify init not to respawn things, then your script will kill things and init will re-launch them and the system will never quiesce. E-xac-tly. That's why I just taught runsvdir to become quescent on reboot. That's the minimum I need to do, and I intend to do only the minimum. If you don't understand how the current linux filesystem works, sure. We can just say busybox doesn't support per-process mount points. Or chroot, for that matter. Rob ___ busybox mailing list busybox@busybox.net http://busybox.net/cgi-bin/mailman/listinfo/busybox
Re: Pending patches
> > + // umount -a
> > + // ???
>
> Do we really want to hard-reboot without remounting RO or unmounting filesystems?

I must expand // ??? to be the real umounting. I got stuck here hesitating whether to just spawn(umount -a) or try to reimplement umount -a by hand...

Next, the kernel has the SysRq magic Alt+PrnScr+U sequence which, when enabled, remounts all FSs RO. Maybe track it down to find out which system call is used and just issue it from runsvdir?

> What if user disagrees with our reboot sequence (needs bigger delays, wants to free loop devices, etc?). Yes, he can code up different reboot himself, as a shell script. But this exact *our* sequence can be set up exactly the same way as a shell script, no need to patch runsvdir! Right?

Of course. But there are reasonable defaults, we should hardcode them and then make the user able to change them, as vanilla init does respecting the shutdown record. I use exec runsvdir at the end of my initramfs /init script to make runsvdir the true init (PID=1), so no return or custom scripting after runsvdir is possible. People who want custom actions should not send init a signal, but rather use the sv exit ... technique.

> In order to facilitate this, we just need to make sure we do not crash kernel by exiting. kernel doesn't like init exiting.

I think it is normal to have init exited and kernel just OOPSed...

> How about attached patch?

Am trying...

--
Vladimir
Re: Pending patches
On Wednesday 29 October 2008 07:04:50 Denys Vlasenko wrote: On Wed, Oct 29, 2008 at 7:48 AM, Vladimir Dronnikov [EMAIL PROTECTED] wrote: + // umount -a + // ??? Do we really want to hard-reboot without remounting RO or unmounting filesystems? I must expand // ??? to be the real umounting. I got stuck here hesitating whether to just spawn(umount -a) or try to reimplement umount -a by hand... In this spot, don't you have a feeling that it's just _impossible_ to devise a perfect reboot code *here*? Simply because you can't know what user might want to do before reboot. Should you try to free loop devices? Nope, kernel does that. If not, unfreed loop devices might prevent clean unmounts. If so, that's a kernel bug. Should you send a message to all logged-in consoles? Historically optional behavior, and irrelevant to cleanly shutting down the system. That's a userspace thing. What about not killing openvpn tunnel carrying your NFS traffic *before* you try unmounting, as otherwise unmounting will fail, Remember the whole thing about nfs is stateless? (A stateless filesystem. Heh.) *and* many other things too because paging-in of executable pages will stop? There are so many different scenarios that you simply can't write code which will be good for them all. People have been somehow getting by with init for 30 years. This would seem to imply it can be done. The approach I advocate is - don't do it *here*. Here you need to only make sure that you do not screw up reboot by doing something silly (like exiting and making kernel oops). Everything else should be done by the reboot command (or most likely, it will be a shell script), not by init (process with PID 1). Historically, Init calls a shutdown script to quiesce the system before actually shutting it down. Repeat after me - reboot should be a reboot.sh. Nope. Allow system admin to have a convenient place to tailor his reboot sequence to his needs. 
Shutdown signals init, init has a script it can call to quiesce the system. If you don't notify init not to respawn things, then your script will kill things and init will re-launch them and the system will never quiesce. You have to coordinate a shutdown with PID 1, so it might as well be initiated by PID 1. Rob ___ busybox mailing list busybox@busybox.net http://busybox.net/cgi-bin/mailman/listinfo/busybox
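[Archive note] The quiesce script that init would call, using the sequence mentioned several times in this thread (TERM, wait, KILL, swapoff, umount -a -r), could be sketched as follows. The function names, the DRY_RUN switch and the 5 second default are assumptions made for illustration; the thread itself calls the right delay an imponderable. DRY_RUN defaults to 1 so that running the sketch only prints the commands.

```shell
#!/bin/sh
# Sketch of a quiesce script. DRY_RUN defaults to 1: print, do not run.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then echo "$*"; else "$@"; fi
}

# quiesce [GRACE]: ask everything to exit, wait GRACE seconds, kill the
# stragglers, then release swap and unmount or remount read-only.
quiesce() {
    grace=${1:-5}          # how long daemons get to save state (a guess)
    run killall5 -TERM
    run sleep "$grace"
    run killall5 -KILL
    run sleep 1
    run swapoff -a
    run umount -a -r
}
```

With DRY_RUN=1, quiesce 2 prints the six commands in order, which makes it easy to review a tailored sequence before trusting it with a real shutdown.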
Re: Pending patches
On Monday 27 October 2008 09:53, Vladimir Dronnikov wrote: Hello, Denys! I've refreshed the patches at http://drvv.ru/busybox Please, take a look Looking at http://drvv.ru/busybox/runsvdir.patch On the nitpicks level, I propose having less #if ENABLE_FEATURE_RUNSVDIR_INIT statements, by using constant-zero is_init if it is not. See attached patch. Second, also easy thing - please use /* */ comments as the rest of this file does. Harder problems: + // let services exit cleanly + sync(); sleep(5); sync() might block for long. With some stupid situations (runaway process creating more and more dirty data + kernel buglet), it may never return. I prefer syncing without waiting, say, by execing sync() from a child. + // kill survived services + killall(SIGKILL); + sync(); sleep(1); + // umount -a + // ??? + // reboot + reboot( + (SIGUSR1 == bb_got_signal) ? RB_HALT_SYSTEM : + (SIGUSR2 == bb_got_signal) ? RB_POWER_OFF : + RB_AUTOBOOT + ); Do we really want to hard-reboot without remounting RO or unmounting filesystems? What if user disagrees with our reboot sequence (needs bigger delays, wants to free lop devices, etc?). Yes, he can code up different reboot himself, as a shell script. But this exact *our* sequence can be set up exactly the same way as a shell script, no need to patch runsvdir! Right? -- vda diff -d -urpN busybox.5/runit/Config.in busybox.6/runit/Config.in --- busybox.5/runit/Config.in 2008-10-26 02:02:35.0 +0200 +++ busybox.6/runit/Config.in 2008-10-29 03:22:06.0 +0100 @@ -20,6 +20,13 @@ config RUNSVDIR a directory, in the services directory dir, up to a limit of 1000 subdirectories, and restarts a runsv process if it terminates. +config FEATURE_RUNSVDIR_INIT + bool Run as init + depends on RUNSVDIR + default n + help + Act as a simple init replacement if run as PID 1. 
+ config FEATURE_RUNSVDIR_LOG bool Enable scrolling argument log depends on RUNSVDIR diff -d -urpN busybox.5/runit/runsvdir.c busybox.6/runit/runsvdir.c --- busybox.5/runit/runsvdir.c 2008-10-26 02:02:35.0 +0200 +++ busybox.6/runit/runsvdir.c 2008-10-29 03:37:49.0 +0100 @@ -28,11 +28,13 @@ ADVISED OF THE POSSIBILITY OF SUCH DAMAG /* Busyboxed by Denys Vlasenko [EMAIL PROTECTED] */ /* TODO: depends on runit_lib.c - review and reduce/eliminate */ -#include sys/poll.h -#include sys/file.h #include libbb.h #include runit_lib.h +#if ENABLE_FEATURE_RUNSVDIR_INIT +#include sys/reboot.h +#endif + #define MAXSERVICES 1000 /* Should be not needed - all dirs are on same FS, right? */ @@ -96,6 +98,14 @@ static void warnx(const char *m1) } #endif +static void killall(int signo) +{ + int i; + for (i = 0; i svnum; i++) + if (sv[i].pid) + kill(sv[i].pid, signo); +} + static void runsv(int no, const char *name) { pid_t pid; @@ -116,8 +126,12 @@ static void runsv(int no, const char *na if (set_pgrp) setsid(); bb_signals(0 - + (1 SIGHUP) - + (1 SIGTERM) + | (1 SIGHUP) + | (1 SIGTERM) +#if ENABLE_FEATURE_RUNSVDIR_INIT + | (1 SIGUSR1) + | (1 SIGUSR2) +#endif , SIG_DFL); execvp(prog[0], prog); fatal2_cannot(start runsv , name); @@ -220,6 +234,11 @@ int runsvdir_main(int argc UNUSED_PARAM, unsigned now; unsigned stampcheck; int i; +#if ENABLE_FEATURE_RUNSVDIR_INIT + bool is_init = (1 == getpid()); +#else + enum { is_init = 0 }; +#endif INIT_G(); @@ -227,7 +246,12 @@ int runsvdir_main(int argc UNUSED_PARAM, set_pgrp = getopt32(argv, P); argv += optind; - bb_signals_recursive((1 SIGTERM) | (1 SIGHUP), record_signo); + bb_signals_recursive(0 + | (1 SIGTERM) + | (1 SIGHUP) + | (is_init ? ((1 SIGUSR1) | (1 SIGUSR2)) : 0) + , record_signo); + svdir = *argv++; #if ENABLE_FEATURE_RUNSVDIR_LOG @@ -341,16 +365,43 @@ run: } } #endif - switch (bb_got_signal) { - case SIGHUP: - for (i = 0; i svnum; i++) -if (sv[i].pid) - kill(sv[i].pid, SIGTERM); - // N.B. 
fall through - case SIGTERM: - _exit((SIGHUP == bb_got_signal) ? 111 : EXIT_SUCCESS); + + if (bb_got_signal) { + // init signaled: kill all services + // runsvdir SIGHUP signaled: kill all services + if (SIGHUP == bb_got_signal || is_init) { +killall(SIGTERM); + } + // runsvdir: any signal causes exit + if (!is_init) +break; +#if ENABLE_FEATURE_RUNSVDIR_INIT + // N.B. we are here if we are init + + // SIGHUP means go on serving + if (SIGHUP == bb_got_signal) +bb_got_signal = 0; + + // SIGUSR1 means halt + // SIGUSR2 means poweroff + // SIGTERM means reboot + // let services exit cleanly + sync(); sleep(5); + // kill survived services + killall(SIGKILL); + sync(); sleep(1); + //
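[Archive note] On the review comment that sync() may block for a long time or never return: in a shell-based shutdown the same idea (sync from a child, do not wait forever) might be sketched like this. run_bounded is a hypothetical helper, not a busybox applet, and the one-second polling granularity is an arbitrary choice.

```shell
#!/bin/sh
# run_bounded SECS CMD...: start CMD in the background and wait at most
# SECS seconds for it. Returns 0 if CMD finished in time, 1 if it is
# still running (it is left alone, and its PID is kept in $child).
run_bounded() {
    limit=$1; shift
    "$@" &
    child=$!
    while [ "$limit" -gt 0 ]; do
        kill -0 "$child" 2>/dev/null || return 0    # child is gone
        sleep 1
        limit=$((limit - 1))
    done
    kill -0 "$child" 2>/dev/null || return 0
    return 1
}
```

A reboot script could then use something like run_bounded 5 sync and carry on with the shutdown whether or not the sync completed.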
Re: Pending patches
On Wednesday 29 October 2008 03:48, Denys Vlasenko wrote:
> + // kill survived services
> + killall(SIGKILL);
> + sync(); sleep(1);
> + // umount -a
> + // ???
> + // reboot
> + reboot(
> +  (SIGUSR1 == bb_got_signal) ? RB_HALT_SYSTEM :
> +  (SIGUSR2 == bb_got_signal) ? RB_POWER_OFF :
> +  RB_AUTOBOOT
> + );
>
> Do we really want to hard-reboot without remounting RO or unmounting filesystems?
>
> What if user disagrees with our reboot sequence (needs bigger delays, wants to free loop devices, etc?). Yes, he can code up different reboot himself, as a shell script. But this exact *our* sequence can be set up exactly the same way as a shell script, no need to patch runsvdir! Right?

In order to facilitate this, we just need to make sure we do not crash kernel by exiting. kernel doesn't like init exiting.

How about attached patch?
--
vda

diff -d -urpN busybox.5/runit/runsvdir.c busybox.6/runit/runsvdir.c
--- busybox.5/runit/runsvdir.c 2008-10-26 02:02:35.0 +0200
+++ busybox.6/runit/runsvdir.c 2008-10-29 04:27:00.0 +0100
@@ -115,10 +115,16 @@ static void runsv(int no, const char *na
 		/* child */
 		if (set_pgrp)
 			setsid();
+/* man execv:
+ * Signals set to be caught by the calling process image
+ * shall be set to the default action in the new process image.
+ * Therefore, we do not need this: */
+#if 0
 		bb_signals(0
-			+ (1 << SIGHUP)
-			+ (1 << SIGTERM)
+			| (1 << SIGHUP)
+			| (1 << SIGTERM)
 			, SIG_DFL);
+#endif
 		execvp(prog[0], prog);
 		fatal2_cannot("start runsv ", name);
 	}
@@ -227,7 +233,19 @@ int runsvdir_main(int argc UNUSED_PARAM,
 	set_pgrp = getopt32(argv, "P");
 	argv += optind;
 
-	bb_signals_recursive((1 << SIGTERM) | (1 << SIGHUP), record_signo);
+	bb_signals(0
+		| (1 << SIGTERM)
+		| (1 << SIGHUP)
+		/* For busybox's init, SIGTERM == reboot,
+		 * SIGUSR1 == halt
+		 * SIGUSR2 == poweroff
+		 * so we need to intercept SIGUSRn too
+		 * Note that we do not implement actual reboot
+		 * (killall(TERM) + umount, etc), we just pause
+		 * respawing and avoid exiting (-> making kernel oops).
+		 * The user is responsible for the rest. */
+		| (getpid() == 1 ? ((1 << SIGUSR1) | (1 << SIGUSR2)) : 0)
+		, record_signo);
 	svdir = *argv++;
 
 #if ENABLE_FEATURE_RUNSVDIR_LOG
@@ -256,7 +274,7 @@ int runsvdir_main(int argc UNUSED_PARAM,
 		rplog = NULL;
 		warnx("log service disabled");
 	}
-run:
+ run:
 #endif
 	curdir = open_read(".");
 	if (curdir == -1)
@@ -319,8 +337,9 @@ run:
 	}
 	pfd[0].revents = 0;
 #endif
-	sig_block(SIGCHLD);
 	deadline = (need_rescan ? 1 : 5);
+ do_sleep:
+	sig_block(SIGCHLD);
 #if ENABLE_FEATURE_RUNSVDIR_LOG
 	if (rplog)
 		poll(pfd, 1, deadline*1000);
@@ -333,11 +352,11 @@ run:
 	if (pfd[0].revents & POLLIN) {
 		char ch;
 		while (read(logpipe.rd, &ch, 1) > 0) {
-			if (ch) {
-				for (i = 6; i < rploglen; i++)
-					rplog[i-1] = rplog[i];
-				rplog[rploglen-1] = ch;
-			}
+			if (ch < ' ')
+				ch = ' ';
+			for (i = 6; i < rploglen; i++)
+				rplog[i-1] = rplog[i];
+			rplog[rploglen-1] = ch;
 		}
 	}
 #endif
@@ -346,11 +365,18 @@ run:
 		for (i = 0; i < svnum; i++)
 			if (sv[i].pid)
 				kill(sv[i].pid, SIGTERM);
-		// N.B. fall through
+		/* fall through */
 	case SIGTERM:
-		_exit((SIGHUP == bb_got_signal) ? 111 : EXIT_SUCCESS);
+		/* exit, unless we are init */
+		if (getpid() != 1)
+			break;
+	default:
+		/* so we are init. do not exit,
+		 * and pause respawning - we may be rebooting... */
+		bb_got_signal = 0;
+		deadline = 60;
+		goto do_sleep;
 	}
 }
-	/* not reached */
-	return 0;
+	return (SIGHUP == bb_got_signal) ? 111 : EXIT_SUCCESS;
 }
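[Archive note] The signal convention quoted in the patch comment (SIGTERM == reboot, SIGUSR1 == halt, SIGUSR2 == poweroff for busybox's init) can be wrapped in a small shell helper on the sending side. The names init_signal_for and request_shutdown and the DRY_RUN switch are made up for this sketch; DRY_RUN defaults to 1 so nothing is actually signalled.

```shell
#!/bin/sh
# Sketch only. DRY_RUN defaults to 1: the kill command is printed.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then echo "$*"; else "$@"; fi
}

# init_signal_for ACTION: print the signal name for ACTION, following
# the convention in the patch comment above.
init_signal_for() {
    case $1 in
        reboot)   echo TERM ;;
        halt)     echo USR1 ;;
        poweroff) echo USR2 ;;
        *) echo "unknown action: $1" >&2; return 1 ;;
    esac
}

# request_shutdown ACTION: send the matching signal to PID 1.
request_shutdown() {
    sig=$(init_signal_for "$1") || return 1
    run kill -s "$sig" 1
}
```

With the patch applied, a runsvdir running as PID 1 would only pause respawning on these signals, so the caller still has to do the killing, unmounting and the actual reboot itself.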