I can't find an answer to whether an SMF service can run automated health
checks (as defined by the service script's author) and restart itself if
required.

For a specific example, we run Magnolia CMS in a Tomcat server in a zone.
It depends on a MySQL server to work properly. That server lives in another
zone (usually on another machine, in fact). Two problem scenarios are:
* if the MySQL server is not running, the Magnolia web-app is not initialized; 
* if the MySQL server was restarted, the web-app's connection breaks. 
In either case, Tomcat is running (SMF is happy: its service contract is
fulfilled), but the end-user service is no longer provided. To fix the
problem, the web-app or the whole web-container needs to be restarted.

Other scenarios with different web-applications involve running out of memory,
or slowing down until they crawl to a halt. In any case, as soon as the
end-user service falls below an SLA (say, the web-site's page takes over 5s
to render, or some runaway loop consumes 95+% CPU for many minutes), the
service should be recycled; this is known to help. For most third-party
applications we can't fix the problem directly (i.e. rewrite them so that they
don't fail), but we have to do our best to reduce downtime for the customers.
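
To make the kind of check concrete, here is a minimal Python sketch of a
single probe. The 5s threshold comes from the example above; the function
names, the 10s timeout, and treating any HTTP status of 400 or above as
"down" are only assumptions for illustration, not what our scripts do.

```python
import time
import urllib.error
import urllib.request

SLA_SECONDS = 5.0  # from the "page takes over 5s to render" example


def classify(status, elapsed, sla_seconds=SLA_SECONDS):
    """Return 'ok', 'slow' or 'down' for one probe result.

    status is an HTTP status code, or None if no response was received.
    """
    if status is None or status >= 400:  # assumption: any error page means "down"
        return "down"
    if elapsed > sla_seconds:
        return "slow"
    return "ok"


def probe(url, timeout=10.0):
    """Fetch url once and return (HTTP status or None, elapsed seconds)."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            resp.read()  # include the body transfer in the measured time
            status = resp.status
    except urllib.error.HTTPError as exc:
        status = exc.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        status = None  # connection refused, timeout, DNS failure, ...
    return status, time.monotonic() - start
```

Both MySQL scenarios would presumably show up here as "down" (the web-app
answers with an error page or not at all), while the slow-death case shows
up as "slow".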

We do currently have some scripts to run such checks and maintain our services,
so their logic (and to some extent implementation) is not the problem. These
scripts are placed into root's (or webserver user's) crontab and invoke init
scripts to recycle services.

I want to convert these scripts and crontabs into a single SMF service which 
includes complicated self-monitoring, to reduce the complexity of (default) 
configuration as well as improve observability. I want to stress again that a
working contract in the OS is not the only metric which describes a service
as truly "online".
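
As a rough sketch of the recycling side, the following Python fragment
restarts a service with "svcadm restart" after several consecutive failed
checks. The FMRI in the usage note, the threshold of 3 failures and the
class name are hypothetical. Note that this still runs outside the service
it watches, so it only illustrates the logic, not the SMF integration I am
asking about.

```python
import subprocess


class Recycler:
    """Restart an SMF service after N consecutive failed health checks."""

    def __init__(self, fmri, max_failures=3, run=subprocess.call):
        self.fmri = fmri                  # service to restart
        self.max_failures = max_failures  # assumption: tolerate short blips
        self.run = run                    # injectable, so it can be tested
        self.failures = 0

    def record(self, healthy):
        """Record one check result; return True if a restart was issued."""
        if healthy:
            self.failures = 0
            return False
        self.failures += 1
        if self.failures < self.max_failures:
            return False
        self.failures = 0  # start counting afresh after the restart
        self.run(["/usr/sbin/svcadm", "restart", self.fmri])
        return True
```

Usage would be something like Recycler("svc:/application/tomcat:default")
with record() called once per check interval. Requiring several failures
in a row keeps a single slow page from bouncing the whole container.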

Are there any established best practices and examples, or am I doomed to
invent another bicycle? ;)

//Jim
-- 
This message posted from opensolaris.org
