Peter Memishian wrote:
>  > This root cause presents one solution: Modify svc.startd to create
>  > restarter property groups on all services before starting any of them.
>  > This would provide the same SCF environment to early services as to late
>  > ones.  That is, service instances only lack restarter property groups
>  > when they're not configured properly or something else is wrong.
>  > I don't think this would cause a significant performance problem because
>  > creating restarter property groups shouldn't require disk I/O and should
>  > be relatively fast.  It's a nontrivial change to a crucial component,
>  > though, and certainly won't be done within the next few days.  For now,
>  > Clearview could use the shell script workaround Tony and Liane found.
>  > 
>  > An alternative solution, however, is to modify svcadm enable -ts to stop
>  > interpreting the restarter property group's absence as an error and
>  > instead just wait for it to appear.  Since it should be interpreted as
>  > an error after boot, but we don't have a good way to tell when that's
>  > the case, Tony and I think that for this situation svcadm should inform
>  > the user that the restarter property group is missing, that that's
>  > usually an error, and that it is waiting in case it isn't.  This would
>  > be a change in behavior for a relatively rare case, and could in theory
>  > be problematic if a program is invoking svcadm enable -ts without
>  > passing the message on to the user.  This fix should be pretty simple
>  > and might be doable with or before the Clearview putback.  I don't like
>  > the wishy-washy semantics, though.
>  > 
>  > So it seems that our choice is between an easy svcadm special change and
>  > an ugly Clearview workaround plus a more complicated startd fix later.
>  > I like the startd fix better in the long-term, but I'm not sure it's
>  > a slam-dunk.  Please opine.
> 
> I agree about the wishy-washy semantics of the second solution.  As far as
> expedient solutions go, I'd rather go with the shell script workaround in
> net-physical, as its blast radius is contained and its sins are obvious.

I agree.  Let's scrutinize it, though, to make sure it doesn't cause 
boot to hang indefinitely if something goes wrong.
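
As a rough illustration of what "doesn't hang indefinitely" could mean here,
the sketch below bounds the wait for the restarter property group before
calling svcadm enable -ts.  It is not the actual net-physical workaround:
the helper names (wait_until, has_restarter_pg, enable_dependency) and the
60-second limit are assumptions made up for the example.  The only real
commands it relies on are svcprop -p and svcadm enable -ts.

```shell
#!/bin/sh
# Sketch only: helper names and the 60-second limit are hypothetical.

# wait_until TIMEOUT_SECS CMD [ARGS...]
# Polls CMD about once per second.  Returns 0 as soon as CMD succeeds,
# or 1 once TIMEOUT_SECS polling intervals have passed without success.
wait_until() {
    timeout=$1
    shift
    elapsed=0
    while ! "$@" >/dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# has_restarter_pg FMRI
# Succeeds once svcprop can read restarter/state for the instance, which
# is taken here as evidence that the restarter property group exists.
has_restarter_pg() {
    svcprop -p restarter/state "$1"
}

# enable_dependency FMRI
# Waits a bounded time for the property group, then enables the service
# temporarily and synchronously.  On timeout it reports the problem and
# returns nonzero instead of blocking boot on the missing property group.
enable_dependency() {
    fmri=$1
    if wait_until 60 has_restarter_pg "$fmri"; then
        svcadm enable -ts "$fmri"
    else
        echo "restarter property group for $fmri never appeared" >&2
        return 1
    fi
}
```

Note that this only bounds the wait for the property group.  Whether
svcadm enable -ts itself can block for a long time afterwards is a separate
question the review would still need to answer.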

> Longer term, would the startd solution still be necessary once the changes
> have been made to make manifest-import run earlier?

Theoretically, the problem could still exist.  However, UV (and projects
of a similar category) won't run into it because they won't need the hack
to make sure their dependencies are running on the first boot after upgrade.

I think a low/medium priority bug is in order for the startd change.  As
always, I wish UV weren't in the situation of having to do this at all,
and would prefer to continue giving priority to the early-import effort
so that future super-early-boot service changes can avoid having to do
any work at all in this scenario.

liane
