http://defect.opensolaris.org/bz/show_bug.cgi?id=12567


amaguire <alan.maguire at sun.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ACCEPTED                    |CAUSEKNOWN


--- Comment #10 from amaguire <alan.maguire at sun.com> 2009-11-11 11:56:25 UTC ---
According to the pstack, the hang is in nscd door calls. Investigating nscd a
bit further, I think it really needs a dependency with restart_on='refresh' on
physical:nwam/physical:default. It currently has only an indirect dependency on
these, via filesystem/minimal -> filesystem/user -> boot-archive ->
filesystem/root -> metainit -> identity:node. I suspect the consequence is that
things happen in the right order during boot but not afterwards, so when nwam
refreshes as part of an NCP switch, it gets confused.

So part of the fix may be to give nscd a kick when nwam refreshes as part of an
NCP switch, using a restart_on='refresh' dependency, i.e. add:

        <!-- nscd needs to be informed of refresh/restart of net services -->
        <dependency
                name='physical'
                grouping='require_any'
                restart_on='refresh'
                type='service'>
                <service_fmri value='svc:/network/physical:default' />
                <service_fmri value='svc:/network/physical:nwam' />
        </dependency>


Doing this gets us out of the startd mess and the NCP switch works fine, but
another problem is lurking: nis/client is going into maintenance. I think
what's happening with nis/client is this: as part of location enable we
disable and then re-enable the service. The disable sends a SIGTERM as part of
the :kill stop method, but perhaps while the old ypbind is still shutting down,
the new ypbind starts, sees the dying one running, and exits? I think we want
the process contract for ypbind to be empty before the service is considered
offline, and all :kill guarantees is that all procs in the contract get the
signal.

To be on the safe side, I think we want a stop method script that uses
smf_kill_contract instead, calling smf_kill_contract $CTID TERM 1 30, say. We
can get the CTID by passing %{restarter/contract} in as part of the stop/exec
property. So we need to add a stop method to yp. A few complications: the yp
script is shared by the nis client, server and passwd services, and is also
invoked to create ipf rules as part of the ipfilter stuff. So we need to tread
carefully here.
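
As a rough sketch of the manifest side of this, assuming the yp method script
lives at /lib/svc/method/yp and grows a hypothetical 'stop' argument that takes
the contract ID (neither of which is settled here, and the timeout value is
just a placeholder), the stop method could look something like:

        <!-- hypothetical: hand the contract ID to the yp script, which would
             then call smf_kill_contract on it rather than relying on :kill -->
        <exec_method
                type='method'
                name='stop'
                exec='/lib/svc/method/yp stop %{restarter/contract}'
                timeout_seconds='60' />

The yp script would then need to handle the stop case without disturbing the
paths used by the other services and by the ipf rule generation.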
