Package: at
Version: 3.1.10.2
Tags: patch

Problem description:
If a running /usr/sbin/atd process passes through run_loop() while
the mtime of the atjobs directory is set to a future time, that atd
process stops executing at jobs until the system time catches up
with that mtime.  Both existing and newly created at jobs are
affected and get delayed.

Discussion of workarounds:
Just resetting the mtime of the atjobs directory without restarting
the atd does not help.  Just restarting the atd without first
resetting the mtime of the atjobs directory does not help.  Just
submitting yet another new job does not help.  As a workaround,
run the following two commands in the following order:
  # touch /var/spool/atjobs
  # /etc/init.d/atd restart
After this, both existing and newly created jobs will again be
executed correctly.

Description of the bug:
The atd program uses a
  static time_t last_chg
global variable.
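Each time run_loop() is entered, last_chg is compared to the mtime of
the atjobs directory.  If last_chg is greater than or equal to the
mtime, run_loop() returns immediately, causing the atd to sleep for a
full hour.
Processing continues only when last_chg is less than the mtime; in
that case, last_chg is set to the mtime of the atjobs directory.
Consequently, once last_chg holds a future timestamp, every later
mtime up to that timestamp compares as "unchanged", and run_loop()
keeps returning early.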
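
The comparison described above can be modelled in a few lines of C.
This is a minimal sketch, not the actual atd source: the function
name scan_would_run_old() is hypothetical, and the nothing_to_do flag
that the real condition also tests is assumed to be true.

```c
#include <time.h>

/* Stand-in for the "static time_t last_chg" variable in atd. */
static time_t last_chg = 0;

/*
 * Model of the unpatched check at the top of run_loop().
 * Returns 1 if the job directory would be scanned, 0 if the scan
 * would be skipped.  Assumes nothing_to_do is true.
 */
int scan_would_run_old(time_t dir_mtime)
{
    if (dir_mtime <= last_chg)
        return 0;               /* skipped: atd goes back to sleep */
    last_chg = dir_mtime;       /* remember the mtime just seen */
    return 1;
}
```

Calling scan_would_run_old() once with a future mtime and then with
a current one returns 1 and then 0.  The second call models the
newly submitted job being ignored, and the result stays 0 until an
mtime later than the stored future value is seen.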

Description of the fix:
Bail out of run_loop() only if last_chg is equal to the mtime.
That way, we are reasonably sure that the atjobs directory will
be reread completely after submitting a new job, whatever the
directory mtime might have been before submission of the new job.

Rationale:
The mtime is only used for performance reasons, so the processing
in run_loop should only be skipped when we are really sure that
nothing changed.  In case of doubt, we should _not_ bail out.
With my fix, we rely on the following assumption:
  "If the mtime of the directory didn't change since we last looked,
   the contents of the directory didn't change, either."  
  That's more or less reasonable (even though not strictly safe).
Without my fix, we rely on the following assumption:
  "If the new mtime of the directory is less than or equal to
   the mtime when we last looked, the contents of the directory
   didn't change."
  That's clearly a bad assumption.  An mtime rarely moves backwards,
  but when it does, it clearly indicates that something did change.
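
The two assumptions can be written as predicates to show exactly
where they differ.  The names skip_scan_old() and skip_scan_new() are
made up for this illustration, and the nothing_to_do flag from the
real condition is left out.

```c
#include <time.h>

/* Unpatched condition: skip the scan unless the mtime moved forward. */
int skip_scan_old(time_t dir_mtime, time_t last_chg)
{
    return dir_mtime <= last_chg;
}

/* Patched condition: skip the scan only if the mtime is unchanged. */
int skip_scan_new(time_t dir_mtime, time_t last_chg)
{
    return dir_mtime == last_chg;
}
```

Both predicates agree when the mtime is equal to last_chg (skip) and
when it is greater (scan).  They disagree only when the mtime has
moved backwards: the old predicate skips the scan, the new one does
not.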


The following log demonstrating the problem is from a SuSE Linux
Enterprise Server 10 system running at-3.1.8 with SuSE patches.
Judging from the Debian source code, all versions of Debian are
affected by the same problem:

# /etc/init.d/atd start
Starting service at daemon          done
# touch -t 200912241630 /var/spool/atjobs
# pkill -HUP atd
# echo -n | at now
warning: commands will be executed using /bin/sh
job 2 at 2009-05-14 11:35
# ls -ald /var/spool/atjobs
drwx------ 2 at at 4096 May 14 11:35 /var/spool/atjobs
# date
Thu May 14 11:36:04 CEST 2009
# atq
2       2009-05-14 11:35 a root
# pkill -HUP atd
# date
Thu May 14 11:36:14 CEST 2009
# atq
2       2009-05-14 11:35 a root
 
The job will not be executed before Christmas.
Correct behaviour would be to execute the job at once.


#
#  Patch to fix atjobs mtime handling in at-3.1.8 and at-3.1.10.2
#  Copyright (C) 2009 Astaro AG  www.astaro.com
#  Author: Ingo Schwarze <[email protected]> 14.05.2009
#
#  This patch is free software; you can redistribute it and/or modify
#  it under the terms of the GNU General Public License as published by
#  the Free Software Foundation; either version 2 of the License, or
#  (at your option) any later version.
#
#  This patch is distributed in the hope that it will be useful,
#  but WITHOUT ANY WARRANTY; without even the implied warranty of
#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#  GNU General Public License for more details.
#
#  You should have received a copy of the GNU General Public License
#  along with the at software; if not, write to the Free Software
#  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA,
#  or look up http://www.gnu.org/licenses/old-licenses/gpl-2.0.txt .
#
--- atd.c.dist  2009-05-13 20:03:17.000000000 +0200
+++ atd.c       2009-05-14 11:53:29.000000000 +0200
@@ -499,7 +499,7 @@
     if (stat(".", &buf) == -1)
        perr("Cannot stat " ATJOB_DIR);
 
-    if (nothing_to_do && buf.st_mtime <= last_chg)
+    if (nothing_to_do && buf.st_mtime == last_chg)
        return next_job;
     last_chg = buf.st_mtime;
 


