On Wed, Feb 18, 2009 at 10:39:40PM -0500, Noah Meyerhans wrote:
> On Fri, Jan 23, 2009 at 03:34:37PM +0400, root wrote:
> > The init script for spamd uses start-stop-daemon with the --exec option both
> > for starting and stopping. This option makes start-stop-daemon check the
> > /proc/PID/exe value. On my system the spamd process has /proc/PID/exe
> > pointing to /usr/bin/perl, not to /usr/sbin/spamd, so spamd starts fine but
> > refuses to stop, printing: "No /usr/sbin/spamd found running; none killed."
> > Since spamd uses a pid file, I think it would be trivial to remove the --exec
> > parameter from the "stop" and "reload" commands in /etc/init.d/spamassassin.
> > ("restart" already has no --exec option on its --stop command.)
>
> Hello,
>
> I'm not sure your diagnosis is correct. I am not able to reproduce the
> bug as you describe it. Spamd is properly terminated with the existing
> spamassassin init script. However, while investigating this, I have
> found an alternate possibility. spamd doesn't write its pid file until
> after it has initialized, which takes some time. The following log
> entries show that, on my system, a full 13 seconds elapse between the
> first two messages logged to syslog when spamd starts:
>
> Feb 18 22:35:40 insomnia spamd[29203]: logger: removing stderr method
> Feb 18 22:35:53 insomnia spamd[29205]: spamd: server started on port 783/tcp
> (running version 3.2.5)
> Feb 18 22:35:53 insomnia spamd[29205]: spamd: server pid: 29205
> Feb 18 22:35:53 insomnia spamd[29205]: spamd: server successfully spawned
> child process, pid 29213
> Feb 18 22:35:53 insomnia spamd[29205]: spamd: server successfully spawned
> child process, pid 29214
> Feb 18 22:35:53 insomnia spamd[29205]: prefork: child states: II
>
> So in this case, between 22:35:40 and 22:35:53 spamd is running, but has
> not yet written its pid file. If I run "/etc/init.d/spamassassin stop"
> during this time, it fails with the symptom that you described in your
> report. After 22:35:53, the pid file exists and
> /etc/init.d/spamassassin can correctly terminate spamd.
>
> Is it possible that the behavior you're describing is explained this
> way?
Sorry, but that is not what happens here.
Here is an strace of /etc/init.d/spamassassin stop:
# strace -ff /etc/init.d/spamassassin stop 2> >(grep -E "open|readlink|stat|exec") |cat > /tmp/strace
execve("/etc/init.d/spamassassin", ["/etc/init.d/spamassassin", "stop"], [/*
32 vars */]) = 0
open("/etc/ld.so.cache", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=140548, ...}) = 0
open("/lib/libncurses.so.5", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=202188, ...}) = 0
open("/lib/i686/cmov/libdl.so.2", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=9680, ...}) = 0
open("/lib/i686/cmov/libc.so.6", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0755, st_size=1413540, ...}) = 0
set_thread_area({entry_number:-1 -> 6, base_addr:0xb7d436b0, limit:1048575,
seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0,
useable:1}) = 0
open("/dev/tty", O_RDWR|O_NONBLOCK|O_LARGEFILE) = 3
open("/usr/lib/locale/locale-archive", O_RDONLY|O_LARGEFILE) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=8985872, ...}) = 0
open("/proc/meminfo", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
stat64("/mnt/pomoika/software/debian", {st_mode=S_IFDIR|0755, st_size=4096,
...}) = 0
stat64(".", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
open("/etc/init.d/spamassassin", O_RDONLY|O_LARGEFILE) = 3
fstat64(255, {st_mode=S_IFREG|0755, st_size=1743, ...}) = 0
open("/usr/lib/gconv/gconv-modules.cache", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=25700, ...}) = 0
open("/usr/lib/gconv/KOI8-R.so", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=9472, ...}) = 0
stat64("/etc/default/spamassassin", {st_mode=S_IFREG|0644, st_size=1022,
...}) = 0
stat64("/etc/default/spamassassin", {st_mode=S_IFREG|0644, st_size=1022,
...}) = 0
open("/etc/default/spamassassin", O_RDONLY|O_LARGEFILE) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=1022, ...}) = 0
stat64("/usr/sbin/spamd", {st_mode=S_IFREG|0755, st_size=102252, ...}) = 0
stat64(".", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
stat64("/sbin/start-stop-daemon", {st_mode=S_IFREG|0755, st_size=18868, ...})
= 0
stat64("/sbin/start-stop-daemon", {st_mode=S_IFREG|0755, st_size=18868, ...})
= 0
[pid 25151] execve("/sbin/start-stop-daemon", ["start-stop-daemon", "--stop",
"--pidfile", "/var/run/spamd.pid", "--exec", "/usr/sbin/spamd", "--oknodo"],
[/* 32 vars */]) = 0
[pid 25151] open("/etc/ld.so.cache", O_RDONLY) = 3
[pid 25151] fstat64(3, {st_mode=S_IFREG|0644, st_size=140548, ...}) = 0
[pid 25151] open("/lib/i686/cmov/libc.so.6", O_RDONLY) = 3
[pid 25151] fstat64(3, {st_mode=S_IFREG|0755, st_size=1413540, ...}) = 0
[pid 25151] set_thread_area({entry_number:-1 -> 6, base_addr:0xb7e486b0,
limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1,
seg_not_present:0, useable:1}) = 0
[pid 25151] stat64("/usr/sbin/spamd", {st_mode=S_IFREG|0755, st_size=102252,
...}) = 0
[pid 25151] open("/var/run/spamd.pid", O_RDONLY|O_LARGEFILE) = 3
[pid 25151] fstat64(3, {st_mode=S_IFREG|0644, st_size=5, ...}) = 0
[pid 25151] readlink("/proc/7082/exe", "/usr/bin/perl"..., 256) = 13
[pid 25151] stat64("/usr/bin/perl", {st_mode=S_IFREG|0755, st_size=1254016,
...}) = 0
[pid 25151] fstat64(1, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0
Stopping SpamAssassin Mail Filter Daemon: No /usr/sbin/spamd found running;
none killed.
spamd.
Note that here start-stop-daemon was invoked with both the --pidfile and
--exec arguments.
The content of /var/run/spamd.pid is 7082, as you may guess from the first
parameter of readlink(). The link points to /usr/bin/perl, not to
/usr/sbin/spamd, so nothing is killed.
And yes, the pid file is there; I checked that before and after the strace.
Now let's do the same, but without the --exec argument to start-stop-daemon
(I just edited /etc/init.d/spamassassin to remove --exec from the stop
action):
execve("/etc/init.d/spamassassin", ["/etc/init.d/spamassassin", "stop"], [/*
32 vars */]) = 0
open("/etc/ld.so.cache", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=140548, ...}) = 0
open("/lib/libncurses.so.5", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=202188, ...}) = 0
open("/lib/i686/cmov/libdl.so.2", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=9680, ...}) = 0
open("/lib/i686/cmov/libc.so.6", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0755, st_size=1413540, ...}) = 0
set_thread_area({entry_number:-1 -> 6, base_addr:0xb7ddd6b0, limit:1048575,
seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0,
useable:1}) = 0
open("/dev/tty", O_RDWR|O_NONBLOCK|O_LARGEFILE) = 3
open("/usr/lib/locale/locale-archive", O_RDONLY|O_LARGEFILE) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=8985872, ...}) = 0
open("/proc/meminfo", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
stat64("/mnt/pomoika/software/debian", {st_mode=S_IFDIR|0755, st_size=4096,
...}) = 0
stat64(".", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
open("/etc/init.d/spamassassin", O_RDONLY|O_LARGEFILE) = 3
fstat64(255, {st_mode=S_IFREG|0755, st_size=1727, ...}) = 0
open("/usr/lib/gconv/gconv-modules.cache", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=25700, ...}) = 0
open("/usr/lib/gconv/KOI8-R.so", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=9472, ...}) = 0
stat64("/etc/default/spamassassin", {st_mode=S_IFREG|0644, st_size=1022,
...}) = 0
stat64("/etc/default/spamassassin", {st_mode=S_IFREG|0644, st_size=1022,
...}) = 0
open("/etc/default/spamassassin", O_RDONLY|O_LARGEFILE) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=1022, ...}) = 0
stat64("/usr/sbin/spamd", {st_mode=S_IFREG|0755, st_size=102252, ...}) = 0
stat64(".", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
stat64("/sbin/start-stop-daemon", {st_mode=S_IFREG|0755, st_size=18868, ...})
= 0
stat64("/sbin/start-stop-daemon", {st_mode=S_IFREG|0755, st_size=18868, ...})
= 0
[pid 25227] execve("/sbin/start-stop-daemon", ["start-stop-daemon", "--stop",
"--pidfile", "/var/run/spamd.pid", "--oknodo"], [/* 32 vars */]) = 0
[pid 25227] open("/etc/ld.so.cache", O_RDONLY) = 3
[pid 25227] fstat64(3, {st_mode=S_IFREG|0644, st_size=140548, ...}) = 0
[pid 25227] open("/lib/i686/cmov/libc.so.6", O_RDONLY) = 3
[pid 25227] fstat64(3, {st_mode=S_IFREG|0755, st_size=1413540, ...}) = 0
[pid 25227] set_thread_area({entry_number:-1 -> 6, base_addr:0xb7e246b0,
limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1,
seg_not_present:0, useable:1}) = 0
[pid 25227] open("/var/run/spamd.pid", O_RDONLY|O_LARGEFILE) = 3
[pid 25227] fstat64(3, {st_mode=S_IFREG|0644, st_size=5, ...}) = 0
[pid 25227] kill(7082, SIGTERM) = 0
Stopping SpamAssassin Mail Filter Daemon: spamd.
The process is dead now, and it is the same process (pid 7082), as you can see.
No readlink() happens this time, but kill() does.
So what is the correct diagnosis?
According to man start-stop-daemon:
       -x, --exec executable
              Check for processes that are instances of this executable
              (according to /proc/pid/exe).
Now look at the output of ls -la /proc/`cat /var/run/spamd.pid`/exe:
lrwxrwxrwx 1 root root 0 Feb 27 14:01 /proc/25231/exe -> /usr/bin/perl
What does it show on your system? And isn't this really a start-stop-daemon
problem?
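To make the failing check concrete, here is a rough shell sketch of what I believe --exec does with the pid taken from the pid file. It is based only on the strace above (a stat64() of the wanted binary, a readlink() of /proc/PID/exe, then a stat64() of the link target). The function name pid_runs_exe is made up for illustration; it is not part of start-stop-daemon, and the real code may well differ:

```shell
#!/bin/bash
# pid_runs_exe PID PATH  (hypothetical helper, for illustration only)
# Succeeds when /proc/PID/exe resolves to the same file as PATH.
pid_runs_exe() {
    local pid=$1 wanted=$2 actual
    # /proc/PID/exe is a symlink to the binary the kernel is executing.
    # For an interpreted script that is the interpreter, not the script.
    actual=$(readlink "/proc/$pid/exe") || return 1
    # -ef is true when both paths name the same device and inode.
    [ "$actual" -ef "$wanted" ]
}
```

Under these assumptions, pid_runs_exe 7082 /usr/sbin/spamd would be expected to fail on my system, while pid_runs_exe 7082 /usr/bin/perl would be expected to succeed.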
So, we can do one of the following:
* change /proc/pid/exe (patch the application)
* remove --exec from the stop action in the script
* change the --exec value to '/usr/bin/perl'
* ignore the issue, since all processes are killed with TERM and then with
KILL on shutdown anyway...
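For the second option, the stop action could be reduced to something like the sketch below. It uses exactly the arguments seen in my second trace. The stop_spamd function name and the SSD variable are only illustrative assumptions; neither is in the packaged script:

```shell
#!/bin/sh
PIDFILE=/var/run/spamd.pid
# Hypothetical override so the command can be dry-run with SSD=echo.
SSD=${SSD:-start-stop-daemon}

# Stop spamd by pid file only, without the /proc/PID/exe comparison.
stop_spamd() {
    "$SSD" --stop --pidfile "$PIDFILE" --oknodo
}
```

Calling stop_spamd with SSD=echo prints the arguments instead of signalling anything, which is a cheap way to check the command line before touching a running daemon.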
PS: By the way, does anybody know why start-stop-daemon does not detach its
process from the tty?
Try issuing SAK on tty1 (where the whole boot process ran); you will see a
lot of processes die if you have enough services that do not detach by
themselves.
--
Shpagin Alexey
System Administrator.