David Powell wrote:
> First of all, please post questions like this on
> smf-discuss at opensolaris.org.

OK. Did it.
I think I did not do a good job of explaining my dilemma. From your explanation, and from removing the "restart" method myself, I gathered that the "restart" method is not used at all, so I removed it. Now here are two situations that I don't know how to deal with.

1- Manifest has a "start" method and a "stop" method.
This works great with "svcadm enable" and "svcadm disable", and it also brings the service up on system restart. Where it fails is when the started server process gets a SIGQUIT/SIGSEGV etc. In that case, SMF attempts to call the "stop" method, and since that fails, the process is NOT restarted. Note that my stop method is "robust". In fact, it is the only way for an administrator to gracefully stop my server process. But the underlying processes themselves don't always exit gracefully, because of many factors. I simulated that using "kill -9 pid". This is a problem for me because I was under the impression that SMF watchdogs my processes. Why does SMF call the "stop" method when it figures out that the service processes have been killed?

2- Manifest only has a "start" method.
[First of all, the manpage for "svc.startd" is confusing because it says the "stop" method is required. I don't think it is. I created a manifest with only a "start" method and imported it into SMF without any problem.]
This works great with "svcadm enable" and system reboot, but fails with "svcadm disable", obviously because disabling the service does not find a "stop" method and only changes the visible "state" of the service as shown by the svcs command. Thus, while this watchdogs my process, it does not help me if I want to gracefully shut the service down.

So, if SMF did not call the "stop" method on finding out that the service or its processes have terminated, that would serve my purpose. Shouldn't "stop" be called only on "svcadm disable service-name"?

Hope it is clearer now.
>
>>Now, consider a scenario with following pid's when the
>>service is enabled for the first time:
>>
>>pid 100 (A) -> pid 101 (B).
>>pid 100 exits normally as it is process A.
>>
>>Now, I kill process with pid 101 to see if SMF can really
>>watchdog my server process.
>>
>>In the /var/svc/log I find a log message that says: "Received
>>a fatal signal" and then SMF tries to invoke the exec_method
>>named "stop". But that fails because the actual process has
>>actually stopped.
>
>
> Your stop method needs to handle the case where the service failed.
> From svc.startd(1M):
>
>     stop
>
>     Stop the service. In some cases, the stop method can be
>     invoked when some or all of the service has already been
>     stopped. Only return an error if the service is not
>     entirely stopped on method return.
>
>     This method is required.
>
>
>>I then remove the section for "stop" and add a new exec_method
>>named "restart" which is essentially the copy of "start" because
>>that's what I want to do when my server process (pid 101, e.g.)
>>is killed -- restart it.
>>
>>This works as I expected -> when process B is killed, it restarts.
>>
>>Thus, my manifest contains two exec_methods -- "start" and
>>"restart" and no "stop".
>
>
> There's no such thing as a "restart" method; this addition of yours
> is never actually called.
>
> What's happening here is that SMF automatically restarts services
> when they are killed. There is no need (or way) to describe how to
> restart them.
>
>
>>My question is -- is this normal? Is there a better way of
>>doing it?
>
>
> You should have a start and a stop method. If all you need to do to
> stop your service is to kill it, you can just use the :kill keyword
> instead of the name of a script as your stop method (see
> smf_method(5)). Otherwise, you should supply a robust stop method
> which can handle the case where components of your service are no
> longer running.
>
> Dave
>
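
For illustration only, the quoted advice (return an error only if the service is not entirely stopped) might look roughly like the shell function below. This is a sketch under assumptions, not the actual stop method discussed in this thread: the pid file argument, the use of SIGTERM as the graceful shutdown request, and the 10-second wait are all hypothetical. The values 0 and 95 are meant to mirror SMF_EXIT_OK and SMF_EXIT_ERR_FATAL from /lib/svc/share/smf_include.sh, which a real method script would source rather than hard-code.

```shell
#!/bin/sh
# Sketch of a stop function that tolerates an already-dead service.
# Assumptions (not from the thread): the server records its pid in a
# pid file, and SIGTERM asks it to shut down gracefully.

SMF_EXIT_OK=0          # mirrors smf_include.sh
SMF_EXIT_ERR_FATAL=95  # mirrors smf_include.sh

stop_service() {
    pidfile=$1

    # No pid file: nothing is running, so the service counts as stopped.
    [ -f "$pidfile" ] || return $SMF_EXIT_OK

    pid=`cat "$pidfile"`

    # Process already gone (crashed or killed): clean up, report success.
    kill -0 "$pid" 2>/dev/null || {
        rm -f "$pidfile"
        return $SMF_EXIT_OK
    }

    # Process is alive: request shutdown, then wait up to 10 seconds.
    kill -TERM "$pid" 2>/dev/null
    i=0
    while kill -0 "$pid" 2>/dev/null; do
        i=`expr $i + 1`
        if [ "$i" -gt 10 ]; then
            # Still running after the wait: this is the only error case.
            return $SMF_EXIT_ERR_FATAL
        fi
        sleep 1
    done

    rm -f "$pidfile"
    return $SMF_EXIT_OK
}
```

A method script could end with something like `stop_service /var/run/myserver.pid; exit $?`, where the path is again hypothetical. The point of the design is that a missing or stale pid is treated as success, so a stop invoked after the process has died does not itself fail.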