[smf-discuss] Restart after SMF start timeout

Nicolas Williams Fri, 18 Apr 2008 15:42:22 -0500

On Fri, Apr 18, 2008 at 08:11:23PM +0100, Neil Garthwaite wrote:
> > Though, as an aside, I am interested in how this works in real  
> > life.  If the transient service fails, and enters maintenance, what  
> > will the administrator do differently than your stop method script  
> > to clean up so that they don't have to reboot to repair the service?
> 
> Well, I hope this doesn't appear too messy.
> 
> I'm using SMF with Sun Cluster. In particular, in SC we have an agent
> that can fail over non-global zones (S10 native as well as S8 and lx
> branded zones) between SC nodes. In addition to failing over the
> non-global zone, if that non-global zone is an S10 zone then we can also
> enable/disable an SMF service within the "failover" zone.
> 
> In this regard, SC manages the SMF enable/disable and additionally
> probes the application that was started by the SMF service. This
> allows the probe to detect a wedged or otherwise bad application
> and signal back to SC to either perform a local restart or initiate a
> failover to another SC node.


Sounds like you need to have either your own restarter, or notification
APIs by which to monitor service state and react accordingly.  (We've
had a debate on this list in recent months about notification APIs.)
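
Absent a notification API, the "monitor service state and react" part
can be approximated by polling. What follows is only a rough sketch
under stated assumptions, not how the SC agent actually works: it reads
the state with `svcs -H -o state <FMRI>`, and the names `decide_action`,
`watch`, the `react` callback, and the two-restart limit are all
hypothetical choices made for illustration.

```python
import subprocess
import time


def read_state(fmri):
    """Return the SMF state of `fmri` as printed by `svcs -H -o state`."""
    out = subprocess.run(
        ["svcs", "-H", "-o", "state", fmri],
        capture_output=True, text=True, check=True,
    )
    return out.stdout.strip()


def decide_action(state, local_restarts, max_local_restarts=2):
    """Map an SMF state to a hypothetical cluster action.

    Returns "restart" while local restarts remain, "failover" once they
    are used up, and "none" for any state other than maintenance.
    """
    if state == "maintenance":
        if local_restarts < max_local_restarts:
            return "restart"
        return "failover"
    return "none"


def watch(fmri, react, interval=10.0, read=read_state):
    """Poll `fmri` and call react(fmri, action) until a failover is requested.

    `react` is an assumed callback supplied by the caller; it might clear
    and re-enable the service for "restart" or hand off for "failover".
    """
    restarts = 0
    while True:
        state = read(fmri)
        action = decide_action(state, restarts)
        if action != "none":
            react(fmri, action)
        if action == "restart":
            restarts += 1
        elif action == "failover":
            return
        elif state == "online":
            # A healthy service resets the local restart budget.
            restarts = 0
        time.sleep(interval)
```

Polling like this is racy and slow compared with a real restarter or a
notification API, which is the reason those are the better fit here.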

Nico
--
