Dave Challis wrote:
> I'm running a custom compiled Apache 2 which is managed by SMF. The manifest
> and method files I'm using are copies of the files shipped with Solaris 10
> (network/http-apache2), except that they have a different path to apache's
> files.
>
> I'm currently having two problems though:
> 1. Periodically (I'm still looking into why this is happening), apache dies,
> leaving behind a defunct process. SMF sees that the apache process isn't
> running, so tries to restart it. It seems to be unable to kill the defunct
> process as part of its stop method though.
>
> Looking in /var/svc/log/network-http-CSWapache2, there were the messages:
> [ Nov 22 03:10:03 Stopping because service restarting. ]
> [ Nov 22 03:10:03 Executing stop method ("/lib/svc/method/http-CSWapache2 stop") ]
> [ Nov 22 03:10:04 Method "stop" exited with status 0 ]
> [ Nov 22 03:11:04 Method or service exit timed out. Killing contract 108 ]
> [ Nov 22 03:11:05 Method or service exit timed out. Killing contract 108 ]
> [ Nov 22 03:11:06 Method or service exit timed out. Killing contract 108 ]
> This message was repeated every second in the log for several hours.
>
> Looking at the output of 'ctstat -i 108 -v':
> CTID    ZONEID  TYPE    STATE   HOLDER  EVENTS  QTIME   NTIME
> 108     0       process owned   7       0       -       -
>         cookie: 0x20
>         informative event set: none
>         critical event set: core signal hwerr empty
>         fatal event set: none
>         parameter set: inherit regent
>         member processes: 3158
>         inherited contracts: none
>
>
> Using the 'ps' command then showed process 3158 as a defunct apache process,
> which I'm guessing SMF wasn't able to kill.
>
> After removing the defunct process with 'preap 3158', then the messages
> about contract 108 in the log files stopped.
>
> Is there any way to avoid this requiring manual intervention if it happens
> again?
>
>
> 2. After fixing the problem mentioned above, I'm unable to clear the
> maintenance state on this service.
>
> I manually set the maintenance state using:
> bash-3.00# svcadm -v mark maintenance http-CSWapache2
> Action maint_on set for svc:/network/http-CSWapache2:CSWapache2.
>
> I then tried to clear this state using:
> bash-3.00# svcadm -v clear http-CSWapache2
> Action maint_off set for svc:/network/http-CSWapache2:CSWapache2.
>
> However, if I use svcs to report on the state of this service, it still
> reports it as in maintenance:
> bash-3.00# svcs -xv http-CSWapache2
> svc:/network/http-CSWapache2:CSWapache2 (Apache 2 HTTP server)
> State: maintenance since 22 November 2007 12:28:35 GMT
> Reason: Maintenance requested by an administrator.
> See: http://sun.com/msg/SMF-8000-63
> See: man -M /opt/csw/apache2/man -s 8 httpd
> Impact: This service is not running.
>
> How can I force svcadm to clear this state?
>

I don't have an explanation for what you're seeing. Disabling and then
re-enabling the service should clear the maintenance state, though we will
still need to understand the cause.

-tony
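
The disable/enable cycle suggested above could be sketched as the following
shell snippet. This is a minimal sketch, not a verified fix for the problem
in this thread: the FMRI is copied from the svcs output quoted above, while
the run and cycle_service helpers and the DRY_RUN variable are hypothetical
names introduced here for illustration. By default it only prints the
commands it would run.

```shell
#!/bin/sh
# Sketch: cycle an SMF service to try to clear a stuck maintenance state.
# The FMRI below is taken from the svcs output quoted in the thread.
FMRI="svc:/network/http-CSWapache2:CSWapache2"

# Hypothetical helper: with DRY_RUN=1 (the default here) only print the
# command; set DRY_RUN=0 to actually execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Hypothetical helper: disable, re-enable, then report on the service.
cycle_service() {
    run svcadm disable -s "$1"   # -s: wait for the instance to reach disabled
    run svcadm enable -s "$1"    # -s: wait for the instance to come online
    run svcs -xv "$1"            # show the resulting state and any explanation
}

cycle_service "$FMRI"
```

Running it as-is prints the three commands; running it with DRY_RUN=0 as root
on the affected host executes them. If the service still reports maintenance
afterwards, the svcs -xv output is the place to start looking for the cause.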