[smf-discuss] Smarter testing in SMF?

lia...@eng.sun.com Wed, 17 May 2006 14:10:03 -0700

Scott Dickson writes:
> I have had several customers ask me in recent weeks the same question:  
> can I make SMF smarter in how it decides whether a service has failed?  
> A cluster agent can issue synthetic transactions and do some fairly 
> sophisticated monitoring to decide whether a service / app is still 
> alive.  Is there an exit where I can insert my own test to see whether a 
> service has hung, for example?
> 
> Is this something that is under consideration, or just something that I 
> have missed?


This is a planned feature that we haven't had the chance to implement 
yet.  The concept was in our original design, but it didn't make the 
S10 release due to time constraints.  Our efforts in S10 focused on 
compatibility and simplicity of the service model -- monitors shouldn't 
be required for a service to integrate with SMF (unlike in an HA 
cluster), but they would help extend fault detection into the 
application-specific realm.

Once a first-pass design is written up, we'll post it in the SMF 
community.  Expect it to be called "monitors".

In the meantime, you could put together a new service which runs this 
sort of check, and runs an appropriate svcadm command when the tests 
fail.  It isn't particularly elegant, but would be pretty simple to do.
The big thing is that you'd want to take into account the target 
service's "state" and "next_state", and not send a bunch of restart 
commands if the service is offline, in maintenance, or in the middle 
of a transition.
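
As a rough illustration of that workaround, here is a minimal sketch in 
Python.  It rests on several assumptions that are not part of the 
message above: the FMRI is hypothetical, the application probe is 
whatever test you supply, the state is read with "svcprop -p 
restarter/state" and "svcprop -p restarter/next_state", and only 
"online" or "degraded" services with a next_state of "none" are 
treated as safe to restart.

```python
import subprocess

# Hypothetical FMRI of the service being watched; substitute your own.
TARGET_FMRI = "svc:/application/myapp:default"

# Assumption: only restart services that SMF considers to be running.
RESTARTABLE_STATES = {"online", "degraded"}


def should_restart(state, next_state, probe_ok):
    """Return True only when sending a restart request makes sense."""
    if probe_ok:
        return False  # application-level test passed, nothing to do
    if state not in RESTARTABLE_STATES:
        return False  # offline, maintenance, disabled, ...: leave it alone
    # A next_state of "none" means no transition is in progress.
    return next_state == "none"


def read_prop(fmri, prop, run=subprocess.run):
    """Read one property of a service instance via svcprop."""
    result = run(["svcprop", "-p", prop, fmri],
                 capture_output=True, text=True, check=True)
    return result.stdout.strip()


def check_once(fmri, probe, run=subprocess.run):
    """Run the probe once and restart the target if appropriate.

    `probe` is a caller-supplied function returning True when the
    application looks healthy.  Returns True if a restart was requested.
    """
    state = read_prop(fmri, "restarter/state", run)
    next_state = read_prop(fmri, "restarter/next_state", run)
    if should_restart(state, next_state, probe()):
        run(["svcadm", "restart", fmri], check=True)
        return True
    return False
```

The monitoring service's start method would call check_once(TARGET_FMRI, 
my_probe) in a loop with a sleep between iterations, where my_probe is 
your own synthetic transaction.  Keeping the decision in should_restart 
separate from the svcprop/svcadm calls makes the policy easy to test 
without a live SMF repository.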

liane
-- 
Liane Praza, Solaris Kernel Development
liane.praza at sun.com - http://blogs.sun.com/lianep
