[smf-discuss] Smarter testing in SMF?

Richard Elling Wed, 17 May 2006 15:53:53 -0700

On Wed, 2006-05-17 at 16:48 -0500, Nicolas Williams wrote:
> On Wed, May 17, 2006 at 02:10:03PM -0700, lianep at eng.sun.com wrote:
> > In the meantime, you could put together a new service which runs this 
> > sort of checking, and runs an appropriate svcadm command when the tests 
> > fail.  It isn't particularly elegant, but would be pretty simple to do.
> > The big thing is you'd want to take into account the target service's 
> > "state" and "next_state", and not send a bunch of restart commands if 
> > the service is offline, in maintenance, or in the middle of a 
> > transition.
> 
> Or perhaps you could fire off a monitor from the start method of the
> actual service to be monitored using ctrun to run the monitor in its own
> process contract and restartably.  This avoids having a separate SMF
> service polluting the SMF service namespace.


This can get a bit complicated.  Suppose FMA kills the monitor
contract and the monitor loses its view of the monitored service's
state.  For simple monitors, such as "does the process exist," this
won't be a problem.  A monitor which makes a database transaction,
though, needs enough smarts to cancel any in-flight transaction
which might interfere with its analysis of the database health.
It is not clear to me that stateless monitors will be more useful
than the current method, so it might be somewhat complex to write
a good monitor.
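
As a rough illustration of the stateless case, here is a minimal sketch
of a monitor step that keeps nothing between runs: it re-reads the
target's "state" and "next_state" from svcs each time and only asks for
a restart when the service is settled online and the probe failed.  The
names should_restart, read_states, check_once and the probe callable are
hypothetical, and the sketch assumes svcs prints "-" in the NSTATE
column when no transition is pending.

```python
import subprocess


def should_restart(state, next_state, healthy):
    """Decide whether a restart request is appropriate right now."""
    if healthy:
        return False
    # Assumption: svcs shows "-" for NSTATE when nothing is in progress.
    if next_state != "-":
        return False
    # offline, maintenance, disabled, etc. are all left alone.
    return state == "online"


def read_states(fmri):
    """Ask svcs for the current and next state of one service instance."""
    out = subprocess.run(
        ["svcs", "-H", "-o", "state,nstate", fmri],
        capture_output=True, text=True, check=True,
    ).stdout.split()
    return out[0], out[1]


def check_once(fmri, probe):
    """One stateless pass; probe is a hypothetical callable returning bool."""
    state, next_state = read_states(fmri)
    if should_restart(state, next_state, probe()):
        subprocess.run(["svcadm", "restart", fmri], check=True)
        return True
    return False
```

Because nothing is remembered between passes, killing and restarting
this monitor costs nothing.  It does not address the harder case above,
where the probe itself leaves work in flight.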

Monitors also tend to have timeouts, which further complicate their
deployment.  It is not clear to me that we can avoid following the
current path of cluster monitors, even as they get more complicated
(e.g., dynamically adjustable timeouts).  It might be better just to
implement a single-node cluster instead, when possible, thus 
leveraging the existing agents.
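
To show where the timeout enters, here is a minimal sketch of a probe
wrapper that reports a timeout as its own outcome rather than folding it
into "failed".  The name run_probe and the three result strings are
hypothetical; a real cluster agent would do considerably more (for
example, adjusting the timeout dynamically), and killing the probe
process does nothing to clean up work the probe had already started.

```python
import subprocess
import sys


def run_probe(argv, timeout_s):
    """Run one health probe command; return "ok", "failed", or "timeout"."""
    try:
        result = subprocess.run(
            argv,
            timeout=timeout_s,            # child is killed when this expires
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        return "timeout"
    return "ok" if result.returncode == 0 else "failed"


if __name__ == "__main__":
    # Trivial probe that exits 0 immediately.
    print(run_probe([sys.executable, "-c", "pass"], 5.0))
```

Keeping "timeout" separate from "failed" lets the caller decide whether
a slow probe should trigger a restart at all.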
 -- richard
