On Tue, May 24, 2011 at 02:44:14PM +1200, Tim Foster wrote:
> Hi Ed,
>
> On Mon, 2011-05-23 at 19:07 -0700, Edward Pilatowicz wrote:
>
> [ snip stuff Tim wrote about zones-proxy-client instances timing out &
> dropping to maintenance if the system-repository service drops to
> maintenance ]
>
> > > > If not, despite fixing the problem in the GZ, we need to clear the
> > > > maintenance state of each zone-proxy-client (one per zone) manually. I
> > > > think setting an infinite timeout would be better for this client
> > > > service, but would welcome comments.
> > > >
> > >
> > > so the cascade effects are pretty unfortunate. perhaps we can have a
> > > follow up fix where pkg set-publisher checks if the sysrepo is running,
> > > and if so refuses to configure unsupported file repositories?
> > >
> >
> > so, i just looked at zoneproxyd.xml and noticed:
> >
> > ---8<---
> > <dependency
> >     name='sysrepo'
> >     type='service'
> >     grouping='require_all'
> >     restart_on='restart'>
> >     <service_fmri
> >         value='svc:/application/pkg/system-repository' />
> > </dependency>
> > ---8<---
> >
> > why are we using restart_on='restart'?
>
> I think because zones-proxyd pulls configuration from the
> system-repository SMF service, in particular, which port to open
> connections to. If system-repository changes its port, and restarts,
> we need to cause zones-proxyd to do the same thing. [ as admittedly
> unlikely as that might happen ]
>
i talked about this with krister last week. the conclusion we reached
was that if the sysrepo port is changed and the service is refreshed,
any outstanding connections get dropped. to make sure new connections
go to the right port, we updated the sysrepo refresh method to do:

	pkill -USR1 -ox zoneproxyd

when the zoneproxyd process gets a USR1 signal it re-reads the port
configuration.
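
to illustrate the reload-on-signal pattern described above, here is a
rough sketch. it is not the zoneproxyd code: it's python purely for
illustration, and the PortConfig class, its method names, and the idea
of keeping the port in a plain file are all assumptions made up for the
example (the real daemon gets its port from the system-repository
service configuration). the handler only sets a flag, and the port is
re-read the next time a new connection needs it.

```python
import signal


class PortConfig:
    """Caches a port number and re-reads it after SIGUSR1 (hypothetical)."""

    def __init__(self, path):
        self.path = path
        self.reload_pending = False
        self.port = self._read()

    def _read(self):
        # assumption: the config source is a file holding just the port
        with open(self.path) as f:
            return int(f.read().strip())

    def on_usr1(self, signum, frame):
        # keep the handler trivial: only note that a reload is wanted
        self.reload_pending = True

    def install(self):
        # register the handler; must be called from the main thread
        signal.signal(signal.SIGUSR1, self.on_usr1)

    def current_port(self):
        # call this before opening each new outbound connection
        if self.reload_pending:
            self.reload_pending = False
            self.port = self._read()
        return self.port
```

existing connections are untouched by the handler, which matches the
behaviour above: only connections opened after the signal see the new
port.
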
> > why not just change this to
> > restart_on='none'? if the sysrepo service is broken or not responding,
> > clients will be unable to connect to it so they should receive pretty
> > quick notification via tcp connection failures. that way the proxy
> > service can continue to run uninterrupted.
>
> Good idea!
>
> Modulo the above scenario, I tried this (configuring an unsupported file
> repo, then watching the system-repository service drop to maintenance)
>
> Zones-proxyd stayed online in the global zone, zones continued to have
> their zones-proxy-client services online, but some packaging operations
> were impacted by the sysrepo service being offline (looks like apache
> will keep serving proxy requests, but not accesses to file repositories,
> or any requests for catalogs - which itself is severe enough to stop a
> lot of pkg operations)
>
in this case perhaps we should update the refresh method to kill off all
the other processes in the system-repository service contract before
returning an error. that way all accesses to the sysrepo service will
fail and we'll get consistent errors (vs. http proxy requests still
working while file repository and catalog requests fail, etc.).

ed