Hello,
My system has been affected by this for *ages*, though only today
did I find out the exact nature of the problem. Let me explain:
This morning, after my computer had been up for a week or so, cron
suddenly complained about daemons failing to restart:
/etc/cron.daily/logrotate:
acpid: can't open /proc/acpi/event: Device or resource busy
error: error running postrotate script for /var/log/acpid
error: error running postrotate script for /var/log/icecast2/error.log
Restarting filtering proxy server: invoke-rc.d: initscript privoxy, action "restart" failed.
error: error running shared postrotate script for /var/log/privoxy/logfile /var/log/privoxy/jarfile /var/log/privoxy/errorfile
run-parts: /etc/cron.daily/logrotate exited with return code 1
/etc/cron.daily/man-db:
mandb: warning: /usr/share/man/man1/gsmsiectl.1 is a dangling symlink
mandb: warning: /usr/share/man/man3/open_memstream.3.gz is a dangling symlink
The same thing also happens with syslogd, which ends up (after the
log rotation) writing to /var/log/foo.log.0 instead of foo.log. So I
traced the problem and quickly found that start-stop-daemon (s-s-d)
was to blame:
# ps ax | egrep acpi
9 ? S< 0:00 [kacpid]
1894 ? Ss 0:00 /usr/sbin/acpid -c /etc/acpi/events -s /var/run/acpid.socket
30532 pts/3 S+ 0:00 grep -E acpi
# start-stop-daemon --stop --exec /usr/sbin/acpid
No /usr/sbin/acpid found running; none killed.
Having found and read this bug report, I agree that prelink is indeed
the culprit: it did a full prelink the previous night, which
modified the acpid executable, so --exec was unable to match the
running acpid with the one residing on disk.
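To illustrate the mismatch, here is a minimal sketch (not part of the
attached patch). It assumes the executable gets replaced by writing a
new copy and renaming it over the old path; I have not checked that
this is exactly what prelink does. The helper names and the /tmp file
names are made up for the demonstration. An open file descriptor stands
in for the running daemon's hold on its old binary.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* The kind of check --exec relies on: same device and same inode. */
static int
same_file(const struct stat *a, const struct stat *b)
{
	assert(a != NULL && b != NULL);
	return a->st_dev == b->st_dev && a->st_ino == b->st_ino;
}

/* Hypothetical demo: returns 1 if a path still names the same file
 * after being replaced via rename(), 0 if not, -1 on error. */
static int
identity_survives_replace(void)
{
	char oldpath[] = "/tmp/ssd-demo-old-XXXXXX";
	char newpath[] = "/tmp/ssd-demo-new-XXXXXX";
	struct stat running, on_disk;
	int oldfd, newfd, res = -1;

	oldfd = mkstemp(oldpath);
	if (oldfd < 0)
		return -1;
	newfd = mkstemp(newpath);
	if (newfd < 0) {
		close(oldfd);
		unlink(oldpath);
		return -1;
	}

	/* oldfd plays the role of the process started from the old file;
	 * the rename plays the role of the (assumed) nightly replacement. */
	if (fstat(oldfd, &running) == 0 &&
	    rename(newpath, oldpath) == 0 &&
	    stat(oldpath, &on_disk) == 0)
		res = same_file(&running, &on_disk);

	close(oldfd);
	close(newfd);
	unlink(newpath);	/* only still exists if rename() failed */
	unlink(oldpath);
	return res;
}
```

Calling identity_survives_replace() should return 0: the path is
unchanged, but it now names a different inode than the one the
"running" side still holds, which is the situation --exec runs into.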
So, I created a patch that modifies the behaviour of --exec in the
way suggested by the last e-mail in this bug report: use readlink() on
/proc/<pid>/exe instead of stat(), and compare the file names rather
than the device / inode numbers. I have tested it, and it appears to
work fine and correct this problem. The patch is attached.
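For readers who want the idea without reading the diff, the name-based
comparison can be sketched roughly as follows. This is a standalone
illustration under my own assumptions, not the code from the patch:
exe_name_matches() is a hypothetical name, and it uses a fixed 4096-byte
buffer where the patch allocates strlen(name) + 1 bytes. Linux only,
since it depends on /proc/<pid>/exe.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if /proc/<pid>/exe points at exactly
 * `name`, 0 otherwise (including when the link cannot be read). */
static int
exe_name_matches(pid_t pid, const char *name)
{
	char lname[64];
	char target[4096];
	ssize_t nread;

	assert(name != NULL);
	snprintf(lname, sizeof(lname), "/proc/%ld/exe", (long)pid);

	/* readlink() does not NUL-terminate, so keep one byte spare. */
	nread = readlink(lname, target, sizeof(target) - 1);
	if (nread < 0)
		return 0;
	target[nread] = '\0';

	/* Compare file names rather than device / inode numbers. */
	return strcmp(target, name) == 0;
}
```

For example, exe_name_matches(1894, "/usr/sbin/acpid") would be the
check for the acpid process shown above. One limitation of this sketch:
when the original file has been unlinked, Linux reports the link target
with a " (deleted)" suffix, and a plain string comparison like this one
will not match it.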
Thanks,
Vasilis
--
Vasilis Vasaitis
"A man is well or woe as he thinks himself so."
--- start-stop-daemon.c.orig 2005-10-19 08:07:28.000000000 +0000
+++ start-stop-daemon.c 2006-01-22 13:07:58.000000000 +0000
@@ -174,7 +174,9 @@
static void do_pidfile(const char *name);
static void do_stop(int signal_nr, int quietmode,
int *n_killed, int *n_notkilled, int retry_nr);
-#if defined(OSLinux) || defined(OShpux)
+#if defined(OSLinux)
+static int pid_is_exec(pid_t pid, const char *name);
+#elif defined(OShpux)
static int pid_is_exec(pid_t pid, const struct stat *esb);
#endif
@@ -608,15 +610,28 @@
#if defined(OSLinux)
static int
-pid_is_exec(pid_t pid, const struct stat *esb)
+pid_is_exec(pid_t pid, const char *name)
{
- struct stat sb;
- char buf[32];
-
- sprintf(buf, "/proc/%d/exe", pid);
- if (stat(buf, &sb) != 0)
+ char lname[32];
+ char *lcontents;
+ int lcontlen, nread, res;
+
+ /* allow one extra character for the link contents, which
+ * should be enough to determine if the file names are the
+ * same */
+ lcontlen = strlen(name) + 1;
+ lcontents = xmalloc(lcontlen);
+ sprintf(lname, "/proc/%d/exe", pid);
+ if ((nread = readlink(lname, lcontents, lcontlen)) == -1) {
+ free(lcontents);
return 0;
- return (sb.st_dev == esb->st_dev && sb.st_ino == esb->st_ino);
+ }
+ if (nread < lcontlen)
+ lcontents[nread] = '\0';
+
+ res = (strncmp(lcontents, name, lcontlen) == 0);
+ free(lcontents);
+ return res;
}
@@ -746,7 +761,9 @@
static void
check(pid_t pid)
{
-#if defined(OSLinux) || defined(OShpux)
+#if defined(OSLinux)
+ if (execname && !pid_is_exec(pid, execname))
+#elif defined(OShpux)
if (execname && !pid_is_exec(pid, &exec_stat))
#elif defined(OSHURD) || defined(OSFreeBSD) || defined(OSNetBSD)
/* I will try this to see if it works */