Liane Praza wrote:
> Peter Memishian wrote:
>>  > This root cause presents one solution: Modify svc.startd to create
>>  > restarter property groups on all services before starting any of them.
>>  > This would provide the same SCF environment to early services as to late
>>  > ones.  That is, service instances only lack restarter property groups
>>  > when they're not configured properly or something else is wrong.
>>  > I don't think this would cause a significant performance problem because
>>  > creating restarter property groups shouldn't require disk I/O and should
>>  > be relatively fast.  It's a nontrivial change to a crucial component,
>>  > though, and certainly won't be done within the next few days.  For now,
>>  > Clearview could use the shell script workaround Tony and Liane found.
>>  > 
>>  > An alternative solution, however, is to modify svcadm enable -ts to stop
>>  > interpreting the restarter property group's absence as an error and
>>  > instead just wait for it to appear.  Since it should be interpreted as
>>  > an error after boot, but we don't have a good way to tell when that's
>>  > the case, Tony and I think that for this situation svcadm should inform
>>  > the user that the restarter property group is missing, that that's
>>  > usually an error, and that it is waiting in case it isn't.  This would
>>  > be a change in behavior for a relatively rare case, and could in theory
>>  > be problematic if a program is invoking svcadm enable -ts without
>>  > passing the message on to the user.  This fix should be pretty simple
>>  > and might be doable with or before the Clearview putback.  I don't like
>>  > the wishy-washy semantics, though.
>>  > 
>>  > So it seems that our choice is between an easy svcadm special change and
>>  > an ugly Clearview workaround plus a more complicated startd fix later.
>>  > I like the startd fix better in the long-term, but I'm not sure it's
>>  > a slam-dunk.  Please opine.
>>
>> I agree about the wishy-washy semantics of the second solution.  As far as
>> expedient solutions go, I'd rather go with the shell script workaround in
>> net-physical, as its blast radius is contained and its sins are obvious.
> 
> I agree.  Let's scrutinize it, though, to make sure it doesn't cause 
> boot to hang indefinitely if something goes wrong.
> 
>> Longer term, would the startd solution still be necessary once the changes
>> have been made to make manifest-import run earlier?
> 
> Theoretically, the problem could still exist.  However, UV (and problems 
> of a similar category) won't run into it because it won't need the hack 
> to make sure its dependencies are running on the first boot after upgrade.
> 
> I think a low/medium priority bug is in order for the startd change.  As 
> always, I wish UV wasn't in the situation of having to do this at all, 
> and would prefer to continue giving priority to the early-import effort 
> so that future super-early-boot service changes can avoid having to do 
> any work at all in this scenario.
> 

I've filed

6653687 svcadm's wait_fmri_enabled failed prematurely because "restarter" pg isn't yet available

to keep track of this issue.
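
On Liane's point about not letting the net-physical workaround hang boot indefinitely: a bounded wait could look roughly like the sketch below. This is not the actual workaround script. The function names, the one-second poll interval, and the timeout handling are all assumptions for illustration, and it assumes that "svcprop -p restarter <fmri>" exits non-zero while the restarter property group is still absent. The probe is passed as an optional third argument so the loop can be exercised without SMF.

```shell
# Hypothetical probe: succeeds once the instance has a "restarter"
# property group. Assumes svcprop exits non-zero while it is absent.
probe_restarter_pg() {
    svcprop -p restarter "$1" >/dev/null 2>&1
}

# wait_for_restarter_pg FMRI TIMEOUT_SECONDS [PROBE_CMD]
# Polls once per second until the probe succeeds or the timeout
# expires. Returns 0 on success and 1 on timeout, so the caller can
# fall through instead of blocking boot forever.
wait_for_restarter_pg() {
    fmri=$1
    timeout=$2
    probe=${3:-probe_restarter_pg}
    elapsed=0

    while ! "$probe" "$fmri"; do
        if [ "$elapsed" -ge "$timeout" ]; then
            echo "restarter pg for $fmri not present after ${timeout}s" >&2
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}
```

The caller would then run svcadm enable -ts only when the function returns 0, and log and carry on otherwise.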

-tony
