On Thu, 18 Jun 2009, John Summerfield wrote:

Jon Peatfield wrote:



 BTW the default reboot/shutdown procedures in el5/sl5 don't give user
 processes very long to checkpoint themselves, and I *think* that
networking may have been turned off by the time they get signalled. We

That's too silly for words. How likely is it that users, somewhere, will have open files on NFS mounts?

100% where home is on an NFS mount. I don't think any distro would be shutting down networks that soon.

For the case of reboot, the directory /etc/rc.d/rc6.d/ contains the relevant scripts: the Kxx* ones are run first, followed by the Sxx* ones.
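A rough way to see that ordering on a given box is sketched below. This is only an illustration, not something taken from the initscripts: rc_order is a made-up helper name, and it assumes the rc script simply walks the K* entries and then the S* entries in sorted (glob) order.

```shell
# rc_order is a hypothetical helper, not part of initscripts.
# It prints the entries of a runlevel directory in the order assumed above:
# every K* entry first, then every S* entry, each group sorted by name.
rc_order() {
    dir=${1:-/etc/rc.d/rc6.d}
    for f in "$dir"/K* "$dir"/S*; do
        # Skip patterns that matched nothing (the glob is left unexpanded).
        [ -e "$f" ] && basename "$f"
    done
    return 0
}

# Example use: rc_order /etc/rc.d/rc6.d | grep -n -e network -e killall -e reboot
```

Piping the output through grep -n, as in the trailing comment, shows the relative position of the network script and of the scripts which send the signals.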

The set I see on sl53 (and I assume el53) includes K90network, S00killall (which despite the name doesn't kill 'all'), and S01reboot - the last of these is the only one which appears to signal all *user* processes, e.g.

 ...
 action $"Sending all processes the TERM signal..." /sbin/killall5 -15
 sleep 5
 action $"Sending all processes the KILL signal..."  /sbin/killall5 -9
 ...

So unless / is on NFS the networking is likely to already be down by the time that the signals are sent to all user jobs.
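For what it's worth, a long-running job that wants to survive this would have to catch the TERM and finish its checkpoint inside the 5 second sleep shown in the excerpt. A minimal sketch follows, assuming the job is a shell script; the checkpoint path, MAX_STEPS and the work loop are placeholders I made up, and the checkpoint deliberately goes to local disk since the network may already be gone by then.

```shell
# Hypothetical long-running job; the path and the loop body are placeholders.
CHECKPOINT=${CHECKPOINT:-/tmp/myjob.checkpoint}
step=0

save_checkpoint() {
    # Keep this on local disk: a write to NFS may hang if networking is down.
    echo "$step" > "$CHECKPOINT"
}

on_term() {
    # Must complete well within the gap between the TERM and the KILL.
    save_checkpoint
    exit 0
}

run_job() {
    trap on_term TERM
    while [ "$step" -lt "${MAX_STEPS:-1000000}" ]; do
        step=$((step + 1))
        # Sleep in the background and wait on it, so that the trap runs as
        # soon as the signal arrives rather than after the sleep finishes.
        sleep 1 &
        wait $!
    done
}

# Example use: run_job &
```

The trap is set inside run_job so that it is still in effect when the function is started in the background, since a subshell resets inherited traps.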

Now of course if a process is connected to a tty then that will probably get a signal at the point that getty or sshd or X (or whatever) gets killed, but long-running jobs probably aren't attached to a tty...

I may have managed to miss another place where user jobs get signalled, or perhaps I broke something which would do it...

 -- Jon
