Been thinking about this a bit more - I actually think
it's okay to simply stop ignoring SIGCHLD (i.e. not set
it to SIG_IGN). If I understand correctly, SIGCHLD is
the mechanism by which a parent's waitpid() call is
apprised of a child status change (such as exit), so if
the parent ignores it, it can't get such status changes,
hence the errors that are returned from waitpid(). I
think all we need to do is ensure that the ENM method
scripts are spawned from a separate thread, so that the
event-handling thread won't be stuck waiting forever for
a script to finish. We don't have that problem for SMF
enable/disable actions, since they just request the
enable/disable and return without waiting for it to
complete.
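
To make the SIG_IGN effect concrete, here is a minimal
standalone sketch (not nwamd code; wait_with_disposition()
is a hypothetical helper invented for illustration). It
forks a child that stands in for the ENM script and waits
for it under a chosen SIGCHLD disposition. It only reads
the status when waitpid() succeeds: on failure the status
variable is never written, which would explain the odd
values (65, 102) seen when WEXITSTATUS() was applied anyway.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, not nwamd code: fork a child that exits with
 * exit_code, then wait for it while SIGCHLD has the given disposition
 * (SIG_DFL or SIG_IGN).  Returns 0 and stores the child's exit status
 * in *exit_status, or returns the errno from the failing call.
 */
static int
wait_with_disposition(void (*disposition)(int), int exit_code,
    int *exit_status)
{
	void (*old)(int);
	int status = 0;
	int err = 0;
	pid_t pid;

	old = signal(SIGCHLD, disposition);
	if (old == SIG_ERR)
		return (errno);

	pid = fork();
	if (pid == -1) {
		err = errno;
	} else if (pid == 0) {
		_exit(exit_code);	/* child: stands in for the script */
	} else {
		/* Retry if a signal interrupts the wait. */
		while (waitpid(pid, &status, 0) == -1) {
			if (errno != EINTR) {
				err = errno;
				break;
			}
		}
		/* Only trust status when waitpid() succeeded. */
		if (err == 0)
			*exit_status = WIFEXITED(status) ?
			    WEXITSTATUS(status) : -1;
	}

	(void) signal(SIGCHLD, old);	/* restore caller's disposition */
	return (err);
}
```

With SIG_DFL this should return 0 and hand back the child's
real exit status; with SIG_IGN it should return ECHILD, as
the waitpid(2) man page describes. In nwamd the wait would
presumably be done from the separate script thread rather
than from the event-handling thread.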

Alan

Anurag S. Maskey wrote:
> A bit of investigation and I uncovered other things:
>
> After a new thread is spawned for the script, waitpid() is called (in 
> nwamd/util.c:197).  The return value is being ignored.  I checked and 
> the return value is -1 with errno set to ECHILD.  Because of this, 
> WEXITSTATUS() is non-zero.  Thus, even though the script ran and 
> exited successfully, start_child() was returning non-zero.
>
> The waitpid(2) man page says that if SIGCHLD is set to SIG_IGN,
> "waitpid() will fail and set errno to ECHILD".  I then discovered
> that nwamd/main.c is setting SIGCHLD to SIG_IGN.  What is the purpose
> of this?  If SIGCHLD is not set to SIG_IGN, then waitpid() completes
> successfully, and WEXITSTATUS() returns the actual return value from
> the script.
>
> So, the question is:
>
> Does it make sense to check for errno == ECHILD if waitpid() returns 
> -1 and then return 0?
>
>    pid_t ret;
>    ...
>    ret = waitpid(pid, &status, 0);
>    if (ret == -1 && errno == ECHILD) {
>        return (0);
>    } else {
>        if (WIFSIGNALED(status) || WIFSTOPPED(status)) {
>        ...
>        }
>    }
>
> Is there a different workaround?
> Or is signal(SIGCHLD, SIG_IGN) not needed for nwamd at all?
>
> Thanks,
> Anurag
>
>
>
> Anurag S. Maskey wrote:
>> Recall that last week, when we were testing enabling/disabling ENMs, 
>> I came across the situation where the log was saying the script 
>> "completed normally", but the return code was not 0 (it was 102 or 65 
>> or something like that).
>>
>> I've done a bit of digging and testing and have come to the 
>> conclusion that this return value of non-zero does not mean that the 
>> script failed.
>>
>>     bash-3.2# nwamcfg list enm myenm
>>     ENM:myenm
>>         activation-mode    manual
>>         stop          "/usr/bin/echo stop"
>>         enabled       true
>>         start         "/var/tmp/enm-script"
>>
>>     bash-3.2# cat /var/tmp/enm-script
>>     #!/sbin/sh
>>     echo "script run" > /var/tmp/echoed.txt
>>
>> Now, enable myenm
>>
>>     bash-3.2# nwamadm enable myenm
>>     Enabling enm 'myenm'
>>
>> Check the debug logs.  They say "completed normally: 65".  65 is
>> supposed to be the return value from the script (nwamd/util.c:143).
>> nwamd checks this return value and decides that the script failed.
>>
>>     bash-3.2# tail -f /var/tmp/nwam.log
>>     Feb  9 15:55:55 unknown nwamd[123354]: [ID 853634 daemon.debug] 6:
>>     door_switch: activating myenm
>>     Feb  9 15:55:55 unknown nwamd[123354]: [ID 545250 daemon.debug] 6:
>>     enqueueing event 17 (ENABLE) for object (80b2ac8) myenm
>>     Feb  9 15:55:55 unknown nwamd[123354]: [ID 432327 daemon.debug] 2:
>>     dequeueing event of type 17 (ENABLE) for object myenm
>>     Feb  9 15:55:55 unknown nwamd[123354]: [ID 394438 daemon.debug] 2:
>>     (80b2ac8) myenm: running method for event 17 (ENABLE)
>>     Feb  9 15:55:55 unknown nwamd[123354]: [ID 995178 daemon.debug] 2:
>>     nwamd_enm_handle_enable_event: running script /var/tmp/enm-script
>>     for ENM myenm
>>     Feb  9 15:55:55 unknown nwamd[123354]: [ID 653470 daemon.info] 2:
>>     '/usr/bin/ctrun /var/tmp/enm-script' completed normally: 65
>>     Feb  9 15:55:55 unknown nwamd[123354]: [ID 531482 daemon.error] 2:
>>     nwamd_enm_handle_enable_event: execution of '/var/tmp/enm-script'
>>     failed for ENM myenm
>>
>> But, check if script actually ran:
>>
>>     bash-3.2# cat /var/tmp/echoed.txt
>>     script run
>>
>> It actually did.
>>
>> What's going on here?
>>
>> Anurag
>>
>> PS. On a better news front, if an enm has an fmri, the 
>> enabling/disabling through nwamadm works spectacularly. :)
>> _______________________________________________
>> nwam-dev mailing list
>> nwam-dev at opensolaris.org
>> http://mail.opensolaris.org/mailman/listinfo/nwam-dev
>>   

