Scott Dickson writes:
> I have had several customers ask me in recent weeks the same question:
> can I make SMF smarter in how it decides whether a service has failed?
> A cluster agent can issue synthetic transactions and do some fairly
> sophisticated monitoring to decide whether a service / app is still
> alive. Is there an exit where I can insert my own test to see whether a
> service has hung, for example?
>
> Is this something that is under consideration, or just something that I
> have missed?

This is a planned feature that we haven't had the chance to implement
yet. The concept was in our original design, but didn't make the S10
release due to time constraints. Our efforts in S10 were around
compatibility and simplicity of the service model: monitors shouldn't be
required for a service to integrate with SMF (unlike in an HA cluster),
but would be helpful to expand fault detection into the
application-specific realm.

Once a first-pass design is written up, we'll post it in the SMF
community. Expect it to be called "monitors".

In the meantime, you could put together a new service which runs this
sort of checking and runs an appropriate svcadm command when the tests
fail. It isn't particularly elegant, but would be pretty simple to do.
The big thing is you'd want to take into account the target service's
"state" and "next_state", and not send a bunch of restart commands if
the service is offline, in maintenance, or in the middle of a
transition.

liane
-- 
Liane Praza, Solaris Kernel Development
liane.praza at sun.com - http://blogs.sun.com/lianep
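
For anyone who wants to try the interim workaround described above, here
is a rough sketch of the checking logic in Python. It is only an
illustration, not the planned "monitors" design. The function names
(should_restart, read_states, check_once) and the probe callback are
made up for this example. It assumes the state and next state are read
with "svcs -H -o state,nstate", that a "-" (or "none") next state means
no transition is in progress, and that only an "online" service should
ever be restarted.

```python
import subprocess


def should_restart(state, next_state, healthy):
    """Decide whether a failed health check warrants a restart.

    Assumption: only an online service that is not mid-transition is
    restarted; offline and maintenance services are left alone.
    """
    in_transition = next_state not in ("-", "none")
    return (not healthy) and state == "online" and not in_transition


def read_states(fmri):
    """Return (state, next_state) for a service as reported by svcs."""
    out = subprocess.run(
        ["svcs", "-H", "-o", "state,nstate", fmri],
        check=True, capture_output=True, text=True,
    ).stdout
    state, next_state = out.split()
    return state, next_state


def restart_service(fmri):
    """Ask the restarter to restart the service."""
    subprocess.run(["svcadm", "restart", fmri], check=True)


def check_once(fmri, probe, read=read_states, restart=restart_service):
    """Run one monitoring pass; return True if a restart was requested.

    probe is a hypothetical application-specific test that returns True
    while the service is healthy. read and restart can be swapped out so
    the logic can be exercised without a live SMF.
    """
    state, next_state = read(fmri)
    if state != "online":
        return False  # don't bother probing a service that isn't up
    if should_restart(state, next_state, probe()):
        restart(fmri)
        return True
    return False
```

The monitoring service's start method would call check_once in a loop
with a sleep between passes. Keeping the decision in should_restart,
separate from the svcs and svcadm calls, makes it easy to confirm that a
service which is offline, in maintenance, or transitioning never gets a
restart command.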