I've stumbled onto this bug a little while ago, went through the full
debugging cycle and when I wanted to report it, I found this existing
bug report. I shelved it and just today came back to it and I would like
to add a remark in reply to message #29:
You're correct that there is a bug in the logging function, in that the
assumptions of UNIXes have changed underneath vixie cron -- namely,
syslog does not allow callers to specify a PID at all. The relevant
lines from the package source (misc.c):
#if defined(SYSLOG)
# ifdef LOG_CRON
openlog(ProgramName, LOG_PID, LOG_CRON);
# else
openlog(ProgramName, LOG_PID);
# endif
syslog(LOG_INFO, "(%s) %s (%s)", username, event, detail);
So when logging to syslog, the PID passed to the log function is never
used, due to a limitation of syslog.
Thus, the fix needs to be made in a different place, and - funnily and
frustratingly enough - the fix has already been made (probably in
3.0pl1-117), except it is not used by default!
do_command.c is responsible for executing the actual cron command. In
the original vixie cron source,
log_it(usernm, getpid(), "CMD", x);
is called from the child process after fork(). However, the "END"
logging (which is not in vixie-cron originally but added by the
Selective logging patch) is done by the parent process, thus the logged
PIDs will differ when used with Syslog.
The same Selective logging patch added a version of the logging in the
default branch of the fork() switch, so if the -L log levels for "log
job start" and "log job pid" are set, the starting PID is not logged by
the child but the parent process instead.
So basically there is now what seems to me to be a "do things right"
flag - if log level includes 8 (log PIDs) then both CMD and END messages
are sent by the same process and contain the same correct PIDs:
Apr 8 10:17:56 e02fc37faf65 CRON[27]: (root) CMD ([28]
/tmp/runner.sh >>/tmp/runner.log)
Apr 8 10:19:12 e02fc37faf65 CRON[27]: (root) END ([28]
/tmp/runner.sh >>/tmp/runner.log)
(PID 27 is the cron parent, PID 28 is the command child, PID 29 is the
PID of the actual command).
If the log level includes only e.g. "log start" and "log end", then the
PIDs will differ:
Apr 8 10:14:06 2d9c73749325 CRON[28]: (root) CMD (/tmp/runner.sh
>>/tmp/runner.log)
Apr 8 10:15:27 2d9c73749325 CRON[27]: (root) END (/tmp/runner.sh
>>/tmp/runner.log)
(PID 28 is the command child which sends the CMD message, PID 27 is the
cron parent which sends the END message, the actual command is PID 29)
I would like to propose (and intend on submitting a patch soon) to
always log in the same place.
Ideally, that would be the child process, so that the PID that openlog()
uses and the PID that cron would log are the same, but I'm not sure
that's possible in a reliable way. Doing it in the parent is just as
well for me, though - my original intent was trying to match CMDs to
ENDs in the logs of a wildly active system.
Curious to hear your thoughts!
Best
Teal