Hi Ed,

On Mon, 2011-05-23 at 19:07 -0700, Edward Pilatowicz wrote:

[ snip stuff Tim wrote about zones-proxy-client instances timing out &
dropping to maintenance if the system-repository service drops to
maintenance ]

> > > If not, despite fixing the problem in the GZ, we need to clear the
> > > maintenance state of each zone-proxy-client (one per zone) manually.  I
> > > think setting an infinite timeout would be better for this client
> > > service, but would welcome comments.
> > >
> >
> > so the cascade effects are pretty unfortunate.  perhaps we can have a
> > follow up fix where pkg set-publisher checks if the sysrepo is running,
> > and if so refuses to configure unsupported file repositories?
> >
> 
> so, i just looked at zoneproxyd.xml and noticed:
> 
> ---8<---
>         <dependency
>                   name='sysrepo'
>                   type='service'
>                   grouping='require_all'
>                   restart_on='restart'>
>                   <service_fmri value='svc:/application/pkg/system-repository' />
>         </dependency>
> ---8<---
> 
> why are we using restart_on='restart'?

I think it's because zones-proxyd pulls its configuration from the
system-repository SMF service, in particular the port to open
connections to.  If system-repository changes its port and restarts,
zones-proxyd needs to restart as well.  [ admittedly, that's unlikely
to happen ]

>   why not just change this to
> restart_on='none'?  if the sysrepo service is broken or not responding,
> clients will be unable to connect to it so they should receive pretty
> quick notification via tcp connection failures.  that way the proxy
> service can continue to run uninterrupted.

Good idea!
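
For reference, a sketch of what the dependency stanza might look like
with that change.  This assumes only the restart_on attribute differs
from the zoneproxyd.xml snippet quoted above; I haven't checked it
against the rest of the manifest:

---8<---
        <!-- assumption: identical to the existing stanza apart from
             restart_on, so a sysrepo restart or failure no longer
             propagates to this service -->
        <dependency
                  name='sysrepo'
                  type='service'
                  grouping='require_all'
                  restart_on='none'>
                  <service_fmri value='svc:/application/pkg/system-repository' />
        </dependency>
---8<---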

Modulo the above scenario, I tried this (configuring an unsupported file
repo, then watching the system-repository service drop to maintenance).

Zones-proxyd stayed online in the global zone and the zones kept their
zones-proxy-client services online, but some packaging operations were
impacted by the sysrepo service being offline.  It looks like apache
will keep serving proxy requests, but not accesses to file repositories
or any requests for catalogs, which is itself severe enough to stop a
lot of pkg operations.

That seems a lot better than the failure mode we had before.  The
tradeoff is that zones-proxyd won't restart automatically if a user
ever decides to change the port, but that beats needing to manually
clear services in every zone on the system.

I'll file a bug capturing this.

        cheers,
                        tim


_______________________________________________
pkg-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/pkg-discuss
