Package: at
Version: 3.1.10.2
Tags: patch

Problem description:

When a running /usr/sbin/atd process passes through run_loop() while the
mtime of the atjobs directory is set to a future time, that atd process
no longer executes any at jobs until the system time catches up with the
mtime of the atjobs directory. Both existing and newly created at jobs
are affected and get delayed.

Discussion of workarounds:

Just resetting the mtime of the atjobs directory without restarting the
atd does not help. Just restarting the atd without first resetting the
mtime of the atjobs directory does not help. Just submitting yet another
new job does not help.

As a workaround, run the following two commands in the following order:

  # touch /var/spool/atjobs
  # /etc/init.d/atd restart

After this, both existing and newly created jobs will again be executed
correctly.

Description of the bug:

The atd program uses a static time_t last_chg global variable. On every
entry to run_loop(), it is compared to the mtime of the atjobs directory.
If last_chg is greater than or equal to the mtime, run_loop() returns
immediately, causing the atd to sleep for a full hour. Only when last_chg
is less than the mtime does processing continue, and last_chg is then
reset to the mtime of the atjobs directory. Once last_chg has picked up
a future mtime, every later and smaller mtime therefore looks like "no
change" until the system clock passes that future time.

Description of the fix:

Bail out of run_loop() only if last_chg is equal to the mtime. That way,
we are reasonably sure that the atjobs directory will be reread
completely after submitting a new job, whatever the directory mtime
might have been before submission of the new job.

Rationale:

The mtime is only used for performance reasons, so the processing in
run_loop() should only be skipped when we are really sure that nothing
changed. In case of doubt, we should _not_ bail out.

With my fix, we rely on the following assumption: "If the mtime of the
directory didn't change since we last looked, the contents of the
directory didn't change, either." That's more or less reasonable (even
though not strictly safe).

Without my fix, we rely on the following assumption: "If the new mtime
of the directory is less than or equal to the mtime when we last looked,
the contents of the directory didn't change." That's clearly a bad
assumption. An mtime rarely moves backwards, but when it does, it
clearly indicates that something did change.

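The difference between the two assumptions can be sketched in isolation.
The following is only an illustration, not code from atd.c: the helper
names skip_before_patch(), skip_after_patch() and run_examples() are
hypothetical, and the timestamps are arbitrary small numbers standing in
for real mtimes. Only the two comparisons themselves are taken from the
patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical helper: the skip test as it reads before the patch.
 * run_loop() returns early whenever the directory mtime is not newer
 * than the remembered last_chg. */
static bool skip_before_patch(bool nothing_to_do, time_t dir_mtime,
                              time_t last_chg)
{
    return nothing_to_do && dir_mtime <= last_chg;
}

/* Hypothetical helper: the skip test as it reads after the patch.
 * run_loop() returns early only if the mtime is exactly unchanged. */
static bool skip_after_patch(bool nothing_to_do, time_t dir_mtime,
                             time_t last_chg)
{
    return nothing_to_do && dir_mtime == last_chg;
}

/* Hypothetical walk-through with made-up timestamps:
 * last_chg = 200 was picked up while the directory carried a future
 * mtime; a job submitted afterwards resets the mtime to "now" = 100. */
static void run_examples(void)
{
    /* mtime moved backwards: the old test skips, the new one rereads */
    assert(skip_before_patch(true, 100, 200));
    assert(!skip_after_patch(true, 100, 200));

    /* mtime unchanged: both tests skip the directory scan */
    assert(skip_before_patch(true, 200, 200));
    assert(skip_after_patch(true, 200, 200));

    /* mtime moved forwards: both tests reread the directory */
    assert(!skip_before_patch(true, 300, 200));
    assert(!skip_after_patch(true, 300, 200));
}
```

Calling run_examples() from a main() of your own exercises the three
cases; the first one is the situation described in this report.
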
The following log demonstrating the problem is from a SuSE Linux
Enterprise Server 10 system running at-3.1.8 with SuSE patches, but
reading the Debian source code, it is obvious that all versions of
Debian are affected by the same problem:

  # /etc/init.d/atd start
  Starting service at daemon done
  # touch -t 200912241630 /var/spool/atjobs
  # pkill -HUP atd
  # echo -n | at now
  warning: commands will be executed using /bin/sh
  job 2 at 2009-05-14 11:35
  # ls -ald /var/spool/atjobs
  drwx------ 2 at at 4096 May 14 11:35 /var/spool/atjobs
  # date
  Thu May 14 11:36:04 CEST 2009
  # atq
  2 2009-05-14 11:35 a root
  # pkill -HUP atd
  # date
  Thu May 14 11:36:14 CEST 2009
  # atq
  2 2009-05-14 11:35 a root

The job will not be executed before Christmas. Correct behaviour would
be to execute the job at once.

#
# Patch to fix atjobs mtime handling in at-3.1.8 and at-3.1.10.2
# Copyright (C) 2009 Astaro AG www.astaro.com
# Author: Ingo Schwarze <[email protected]> 14.05.2009
#
# This patch is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This patch is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with the at software; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA,
# or look up http://www.gnu.org/licenses/old-licenses/gpl-2.0.txt .
#
--- atd.c.dist 2009-05-13 20:03:17.000000000 +0200
+++ atd.c 2009-05-14 11:53:29.000000000 +0200
@@ -499,7 +499,7 @@
     if (stat(".", &buf) == -1)
         perr("Cannot stat " ATJOB_DIR);
 
-    if (nothing_to_do && buf.st_mtime <= last_chg)
+    if (nothing_to_do && buf.st_mtime == last_chg)
         return next_job;
     last_chg = buf.st_mtime;
 

