A bit more investigation turned up the following:

After a new thread is spawned for the script, waitpid() is called (in 
nwamd/util.c:197), but its return value is ignored.  I checked, and it 
returns -1 with errno set to ECHILD.  When waitpid() fails, status does 
not hold the script's exit status, so WEXITSTATUS() is evaluated on a 
meaningless value and comes out non-zero.  Thus, even though the script 
ran and exited successfully, start_child() was returning non-zero.

The waitpid(2) man page says that if SIGCHLD is set to SIG_IGN, 
"waitpid() will fail and set errno to ECHILD".  I then discovered that 
nwamd/main.c sets SIGCHLD to SIG_IGN.  What is the purpose of this?  If 
SIGCHLD is not set to SIG_IGN, then waitpid() completes successfully, and 
WEXITSTATUS() returns the actual return value from the script.
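
To make the behavior easy to reproduce outside nwamd, here is a minimal standalone sketch.  It is not nwamd code: run_and_wait() and its parameters are names I made up for illustration.  It forks a child that exits with a given code and waits for it, with SIGCHLD set either to SIG_DFL or to SIG_IGN.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical test helper (not part of nwamd).
 * Forks a child that exits with child_code, then waits for it.
 * If ignore_sigchld is non-zero, SIGCHLD is set to SIG_IGN first,
 * otherwise to SIG_DFL.  The previous disposition is restored.
 * Returns the child's exit code, or -1 with *err set to the errno
 * of the failing call (0 if the child did not exit normally).
 */
int
run_and_wait(int ignore_sigchld, int child_code, int *err)
{
    void (*old)(int);
    pid_t pid, ret;
    int status = 0;
    int saved;

    *err = 0;
    old = signal(SIGCHLD, ignore_sigchld ? SIG_IGN : SIG_DFL);
    if (old == SIG_ERR) {
        *err = errno;
        return (-1);
    }

    pid = fork();
    if (pid == -1) {
        *err = errno;
        (void) signal(SIGCHLD, old);
        return (-1);
    }
    if (pid == 0)
        _exit(child_code);      /* child: just exit with the code */

    /* parent: wait for the child, retrying if interrupted */
    do {
        ret = waitpid(pid, &status, 0);
    } while (ret == -1 && errno == EINTR);
    saved = errno;
    (void) signal(SIGCHLD, old);

    if (ret == -1) {
        /* status is not meaningful here, so do not inspect it */
        *err = saved;
        return (-1);
    }
    if (!WIFEXITED(status))
        return (-1);
    return (WEXITSTATUS(status));
}
```

Calling run_and_wait(0, 7, &err) should return 7, while run_and_wait(1, 7, &err) should return -1 with err set to ECHILD, which matches what the man page describes.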

So, the question is:

Does it make sense to check for errno == ECHILD if waitpid() returns -1 
and then return 0?

    pid_t ret;
    ...
    ret = waitpid(pid, &status, 0);
    if (ret == -1 && errno == ECHILD) {
        return (0);
    } else {
        if (WIFSIGNALED(status) || WIFSTOPPED(status)) {
        ...
        }
    }
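
Spelled out as a self-contained helper, the check could look roughly like the sketch below.  child_result() is a hypothetical name, not an existing nwamd function, and the choice to map "killed by a signal or stopped" to -1 is my assumption about what the elided branch would do.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>

/*
 * Hypothetical helper (not in nwamd): turn the result of waitpid()
 * into a single return value for the caller.
 *   ret     value returned by waitpid()
 *   err     errno captured right after the call
 *   status  status word; only meaningful when ret != -1
 * Returns the script's exit code, 0 if the child was already reaped
 * because SIGCHLD is ignored (exit code unknown), or -1 otherwise.
 */
int
child_result(pid_t ret, int err, int status)
{
    if (ret == -1) {
        /* status does not describe the child, so do not look at it */
        return (err == ECHILD ? 0 : -1);
    }
    if (WIFEXITED(status))
        return (WEXITSTATUS(status));
    /* killed by a signal or stopped (assumed to count as failure) */
    return (-1);
}
```

The caller would use it as child_result(ret, errno, status) right after waitpid().  Note the cost of this approach: in the ECHILD case the script's real exit code is lost, so a script that actually failed would also be reported as 0.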

Is there a different workaround?
Or is signal(SIGCHLD, SIG_IGN) not needed for nwamd?

Thanks,
Anurag



Anurag S. Maskey wrote:
> Recall that last week, when we were testing enabling/disabling ENMs, I 
> came across the situation where the log was saying the script "completed 
> normally", but the return code was not 0 (it was 102 or 65 or something 
> like that).
>
> I've done a bit of digging and testing and have come to the conclusion 
> that this return value of non-zero does not mean that the script failed.
>
>     bash-3.2# nwamcfg list enm myenm
>     ENM:myenm
>         activation-mode    manual
>         stop          "/usr/bin/echo stop"
>         enabled       true
>         start         "/var/tmp/enm-script"
>
>     bash-3.2# cat /var/tmp/enm-script
>     #!/sbin/sh
>     echo "script run" > /var/tmp/echoed.txt
>
> Now, enable myenm:
>
>     bash-3.2# nwamadm enable myenm
>     Enabling enm 'myenm'
>
> Check the debug logs: they say "completed normally: 65".  The 65 is 
> supposed to be the return value from the script (nwamd/util.c:143).  
> nwamd checks this return value and decides that the script failed.
>
>     bash-3.2# tail -f /var/tmp/nwam.log
>     Feb  9 15:55:55 unknown nwamd[123354]: [ID 853634 daemon.debug] 6:
>     door_switch: activating myenm
>     Feb  9 15:55:55 unknown nwamd[123354]: [ID 545250 daemon.debug] 6:
>     enqueueing event 17 (ENABLE) for object (80b2ac8) myenm
>     Feb  9 15:55:55 unknown nwamd[123354]: [ID 432327 daemon.debug] 2:
>     dequeueing event of type 17 (ENABLE) for object myenm
>     Feb  9 15:55:55 unknown nwamd[123354]: [ID 394438 daemon.debug] 2:
>     (80b2ac8) myenm: running method for event 17 (ENABLE)
>     Feb  9 15:55:55 unknown nwamd[123354]: [ID 995178 daemon.debug] 2:
>     nwamd_enm_handle_enable_event: running script /var/tmp/enm-script
>     for ENM myenm
>     Feb  9 15:55:55 unknown nwamd[123354]: [ID 653470 daemon.info] 2:
>     '/usr/bin/ctrun /var/tmp/enm-script' completed normally: 65
>     Feb  9 15:55:55 unknown nwamd[123354]: [ID 531482 daemon.error] 2:
>     nwamd_enm_handle_enable_event: execution of '/var/tmp/enm-script'
>     failed for ENM myenm
>
> But check whether the script actually ran:
>
>     bash-3.2# cat /var/tmp/echoed.txt
>     script run
>
> It actually did.
>
> What's going on here?
>
> Anurag
>
> PS. On a better news front, if an enm has an fmri, the 
> enabling/disabling through nwamadm works spectacularly. :)
> _______________________________________________
> nwam-dev mailing list
> nwam-dev at opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/nwam-dev
>   
