http://defect.opensolaris.org/bz/show_bug.cgi?id=12567
--- Comment #11 from amaguire <alan.maguire at sun.com> 2009-11-11 12:26:46 UTC ---
(In reply to comment #10)
> The pstack is hung up on nscd door calls. Investigating a bit further wrt nscd,
> I think it really needs to depend with restart_on='refresh' on
> physical:nwam/physical:default. It actually has an indirect dependency on these
> via filesystem/minimal -> filesystem/user -> boot-archive -> filesystem/root ->
> metainit -> identity:node, and the consequences of all that are, I suspect that
> things happen in the right order during boot, but not after, so when nwam
> refreshes as part of NCP switch, it gets confused.
>
> So a part of the fix may be to give nscd a kick when nwam refreshes as part of
> NCP switch with restart_on = 'refresh' dependencies, i.e. add:
>
> <!-- nscd needs to be informed of refresh/restart of net services -->
> <dependency
> name='physical'
> grouping='require_any'
> restart_on='refresh'
> type='service'>
> <service_fmri value='svc:/network/physical:default' />
> <service_fmri value='svc:/network/physical:nwam' />
> </dependency>
>
>
> Doing this gets us out of the startd mess and the NCP switch works fine, but
> another problem is lurking - nis/client is going into maintenance. I think
> what's happening with nis/client is this - as part of location enable we
> disable it and then reenable the service. The disable sends a SIGTERM as part
> of the :kill stop method, but perhaps while it's shutting down, the new ypbind
> starts, sees the dying one running and exits? I think we want the process
> contract for ypbind to be empty before it's considered offline, and all :kill
> guarantees is that all procs in the contract get the signal. I think what
> we want is a stop method script that uses smf_kill_contract instead to be on
> the safe side - calling smf_kill_contract $CTID TERM 1 30, say. We can get the
> CTID by passing %{restarter/contract} in as part of the stop/exec property. So
> we need to add a stop method to yp. A few complications - the yp script is
> shared by nis client, server, passwd services, and is also invoked to create
> ipf rules as part of the ipfilter stuff. So we need to tread carefully here.

Actually, the name-service-cache dependencies on physical:[nwam|default] are
only part of the story. Just as important, we haven't defined a project for
the netcfg/netadm users. When we call getdefaultprojid(), it first looks up
the default project in /etc/user_attr, so we need to add the
"project=default" attribute for the netcfg/netadm users. Since we hadn't, we
were falling back to the installed name services and getting hung up.

With a fixed /etc/user_attr and the name-service-cache dependencies, NCP
switching works fine. Both are needed: I tried the fixed user_attr without
the name-service-cache dependency and it didn't work. We still get the
nis/client in maintenance issue, though.
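
For the remaining nis/client issue, the stop method suggested in the quoted
comment might be declared roughly as below. This is only a sketch, not a
tested change: the /lib/svc/method/yp path, the "stop" argument convention
and the timeout value are assumptions, and the yp script would still have to
be taught to call smf_kill_contract with the contract id it receives.

```xml
<!-- Sketch only: hand the service's contract id to the method script so it
     can wait for the contract to empty instead of relying on :kill.
     The script path and the "stop" argument are assumed, not confirmed. -->
<exec_method
    type='method'
    name='stop'
    exec='/lib/svc/method/yp stop %{restarter/contract}'
    timeout_seconds='60' />
```

The timeout here is deliberately longer than the 30 second wait proposed for
smf_kill_contract, so the restarter does not time the method out first.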