[smf-discuss] ?: what determines a restart too many times

Steffen Weiberle Wed, 28 Jun 2006 10:52:44 -0400

Renaud Manus wrote On 06/28/06 09:58,:
> 
> 
> Steffen Weiberle wrote:
> 
>> I've been looking all over and can't find this...
>>
>> In the past I have seen error messages, when a service terminates too 
>> often, saying that the service restarted too many times and has been put 
>> into maintenance mode. I even thought I had seen a parameter somewhere 
>> that set it to 3 or so. However, now I can't find it.
> 
> 
> This is hardcoded into usr/src/cmd/svc/startd/startd.h
> 
> #define FAULT_THRESHOLD         3
> 
>>
>> The reason is that I have come across a situation/bug where vold 
>> (volfs service) core dumps in 6/06. Vold is restarted every four to 
>> six seconds, so I am getting about ten abnormal ends and resulting 
>> core dumps a minute! Took me a while to figure out why my disk would 
>> not stop :)
>>
>> Anyway, it seems SMF should catch the fact that this error is 
>> happening much too frequently and put volfs into maintenance mode. But 
>> it is not. So I have been searching for the variable, but can't find 
>> it. I don't see it in the manifests for the services I have looked 
>> at.  Where is it?
> 
> 
> svc.startd(1M) man page says:
> 
>       If three method failures happen in a row, or if the  service
>       is restarting more than once a second, svc.startd places the
>       service in the maintenance state.


Now I overlooked that!

> 
> It seems that in your case, the start method doesn't fail but the
> service is restarting every four to six seconds, hence the rules don't
> apply and the instance never goes into maintenance.

As I was trawling through messages on this topic I did see some references to
why or why not to put a service into maintenance. One comment was that if the
same error happens over and over, what is the purpose of retrying? This seems
like a perfect case where retries should be limited.
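
A rough sketch of what such a limit could look like, in C: count restarts inside a sliding time window and give up once there are too many. This is purely illustrative and is not something svc.startd does according to this thread. WINDOW_SECONDS, MAX_RESTARTS and all the function names are assumptions picked for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define WINDOW_SECONDS	60.0	/* hypothetical window length */
#define MAX_RESTARTS	5	/* hypothetical restarts allowed per window */

/* Ring buffer holding the times of the most recent MAX_RESTARTS restarts. */
struct restart_window {
	double times[MAX_RESTARTS];
	size_t count;		/* total restarts recorded so far */
};

/*
 * Record a restart at time `now` (seconds). Returns true when this restart
 * is more than MAX_RESTARTS within the last WINDOW_SECONDS, i.e. when the
 * restart that happened MAX_RESTARTS restarts ago is still inside the window.
 */
static bool
window_restart(struct restart_window *w, double now)
{
	size_t slot;
	bool over;

	assert(w != NULL);
	slot = w->count % MAX_RESTARTS;	/* oldest entry once the buffer is full */
	over = (w->count >= MAX_RESTARTS &&
	    now - w->times[slot] < WINDOW_SECONDS);

	w->times[slot] = now;
	w->count++;
	return (over);
}

/*
 * Simulate `n` restarts at a fixed interval. Returns the 1-based number of
 * the restart that exceeds the limit, or 0 if the limit is never exceeded.
 */
static int
restarts_until_limit(double interval, int n)
{
	struct restart_window w = { { 0.0 }, 0 };
	int i;

	for (i = 1; i <= n; i++) {
		if (window_restart(&w, i * interval))
			return (i);
	}
	return (0);
}
```

With these made-up numbers, a service dying every five seconds would exceed the limit on its sixth restart (restarts_until_limit(5.0, 100) returns 6), while one that restarts every twenty seconds would never exceed it.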

Off-hand, is this type of issue already being addressed via a CR, either as a
bug or as an RFE? If not, I can file one.

Thanks
Steffen

> 
> -- Renaud
