I've stumbled onto this bug a little while ago, went through the full debugging cycle and when I wanted to report it, I found this existing bug report. I shelved it and just today came back to it and I would like to add a remark in reply to message #29:

You're correct that there is a bug in the logging function, in that the assumptions of UNIXes have changed underneath vixie cron -- namely, syslog does not allow callers to specify a PID at all. The relevant lines from the package source (misc.c):


        #if defined(SYSLOG)
        # ifdef LOG_CRON
            openlog(ProgramName, LOG_PID, LOG_CRON);
        # else
            openlog(ProgramName, LOG_PID);
        # endif

            syslog(LOG_INFO, "(%s) %s (%s)", username, event, detail);


So when logging to syslog, the PID passed to the log function is never used, due to a limitation of syslog.

Thus, the fix needs to be made in a different place, and - funnily and frustratingly enough - the fix has already been made (probably in 3.0pl1-117), except it is not used by default!

do_command.c is responsible for executing the actual cron command. In the original vixie cron source,

    log_it(usernm, getpid(), "CMD", x);

is called from the child process after fork(). However, the "END" logging (which is not in vixie-cron originally but added by the Selective logging patch) is done by the parent process, thus the logged PIDs will differ when used with Syslog.

The same Selective logging patch added a version of the logging in the default branch of the fork() switch, so if the -L log levels for "log job start" and "log job pid" are set, the starting PID is not logged by the child but the parent process instead.

So basically there is now what seems to me to be a "do things right" flag - if log level includes 8 (log PIDs) then both CMD and END messages are sent by the same process and contain the same correct PIDs:

    Apr  8 10:17:56 e02fc37faf65 CRON[27]: (root) CMD ([28] /tmp/runner.sh >>/tmp/runner.log)     Apr  8 10:19:12 e02fc37faf65 CRON[27]: (root) END ([28] /tmp/runner.sh >>/tmp/runner.log)

(PID 27 is the cron parent, PID 28 is the command child, PID 29 is the PID of the actual command). If the log level includes only e.g. "log start" and "log end", then the PIDs will differ:

    Apr  8 10:14:06 2d9c73749325 CRON[28]: (root) CMD (/tmp/runner.sh >>/tmp/runner.log)     Apr  8 10:15:27 2d9c73749325 CRON[27]: (root) END (/tmp/runner.sh >>/tmp/runner.log)

(PID 28 is the command child which sends the CMD message, PID 27 is the cron parent which sends the END message, the actual command is PID 29)

I would like to propose (and intend on submitting a patch soon) to always log in the same place. Ideally, that would be the child process, so that the PID that openlog() uses and the PID that cron would log are the same, but I'm not sure that's possible in a reliable way. Doing it in the parent is just as well for me, though - my original intent was trying to match CMDs to ENDs in the logs of a wildly active system.

Curious to hear your thoughts!


Best
Teal

Reply via email to