On Thu, Feb 3, 2011 at 12:04 PM, Andy Gibbs <[email protected]> wrote:
> On Thursday, February 03, 2011 4:11 AM, Denys Vlasenko wrote:
>
>> run "strace -s99 -oLOG -p <pid of udhcpc>" for some time while it is
>> in discover loop and is creating zombies, then ^C it and post
>> resulting LOG file.
>
> Denys,
>
> Thank you for your reply.
>
> Attached to this email then are two log files: the first is from the initial
> application called from command line; the second is from the respawned
> application.  I ensured that at least one zombie is created during the
> creation of the second log, so hopefully it contains the data you need.

This part looks suspicious:

vfork()                                 = 14821
--- SIGCHLD (Child exited) @ 0 (0) ---
write(1, "No lease, forking to background\n", 32) = 32

It comes from this C code fragment:

 leasefail:
                                udhcp_run_script(NULL, "leasefail");
#if BB_MMU /* -b is not supported on NOMMU */
                                if (opt & OPT_b) { /* background if no lease */
                                        bb_info_msg("No lease, forking to background");

udhcp_run_script looks like this:

static void udhcp_run_script(struct dhcp_packet *packet, const char *name)
{
...
        argv[0] = (char*) client_config.script;
        argv[1] = (char*) name;
        argv[2] = NULL;
        spawn_and_wait(argv);

and spawn_and_wait:

int FAST_FUNC spawn_and_wait(char **argv)
{
        int rc;
#if ENABLE_FEATURE_PREFER_APPLETS
        int a = find_applet_by_name(argv[0]);
        if (a >= 0 && (APPLET_IS_NOFORK(a) {...}
#endif /* FEATURE_PREFER_APPLETS */
        rc = spawn(argv);
        return wait4pid(rc);
}

and spawn()... bingo. I see the bug:

pid_t FAST_FUNC spawn(char **argv)
{
        volatile int failed = 0;
        pid_t pid;

        pid = vfork();
        if (pid < 0) /* error */
                return pid;
        if (!pid) { /* child */
                BB_EXECVP(argv[0], argv);
                failed = errno;
                _exit(111);
        }
        if (failed) {
                errno = failed;
                return -1;   <===== we do not wait for child!
        }
        return pid;
}

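This only happens if exec fails: the child stores errno in "failed"
and exits, and the parent then returns -1 without ever waiting for it,
so the child stays a zombie.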
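
A minimal sketch of one way the failure path could reap the child, not
the actual patch. The name spawn_reaping is hypothetical, and plain
execvp()/waitpid() stand in for BB_EXECVP and the busybox helpers. Like
the original, it assumes Linux vfork() semantics, where the child's
store to "failed" is visible to the parent:

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical variant of spawn(): reaps the child when exec fails. */
static pid_t spawn_reaping(char **argv)
{
        /* volatile: the vfork child writes this through shared memory */
        volatile int failed = 0;
        pid_t pid;

        assert(argv != NULL && argv[0] != NULL);

        pid = vfork();
        if (pid < 0) /* error */
                return pid;
        if (!pid) { /* child */
                execvp(argv[0], argv);
                failed = errno; /* reached only if exec failed */
                _exit(111);
        }
        /* parent resumes once the child has exec'ed or exited */
        if (failed) {
                /* reap the exited child so it does not stay a zombie */
                while (waitpid(pid, NULL, 0) < 0 && errno == EINTR)
                        continue;
                errno = failed;
                return -1;
        }
        return pid;
}
```

Called with a script path that does not exist, this should return -1
with errno set to ENOENT, and a following waitpid(-1, NULL, WNOHANG)
should fail with ECHILD, meaning nothing is left to reap.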

I want to confirm that the bug indeed happens exactly here.
Can you run "strace -tt -f -s99 -oLOG -p <pid of udhcpc>"
(different options for strace)?

-- 
vda
_______________________________________________
busybox mailing list
[email protected]
http://lists.busybox.net/mailman/listinfo/busybox