Hi All,

It does look like there is a bug in time_str2secs():

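If we give it a time in the format days-0:min (for example 0-0:50), we exit the for loop with days set, but with min holding the hours value (zero in this case), sec holding the minutes value, and hr still set to -1. Because the existing check tests for min != 0, the shift of min into hr is then skipped and the minutes value stays in sec, where it is treated as seconds.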

I think the fix is:
Index: slurm-2.6.2/src/common/parse_time.c
===================================================================
--- slurm-2.6.2.orig/src/common/parse_time.c 2013-10-02 16:54:02.545959036 +1000
+++ slurm-2.6.2/src/common/parse_time.c 2013-10-02 17:11:55.945964039 +1000
@@ -689,7 +689,7 @@ extern int time_str2secs(const char *str
                        break;
        }

-       if ((days != -1) && (hr == -1) && (min != 0)) {
+       if ((days != -1) && (hr == -1) && (min != -1)) {
                /* format was "days-hr" or "days-hr:min" */
                hr = min;
                min = sec;


So with the fix, if we come out of the for loop with days and min set to something but hr still unset, we shift min to become hr and sec to become min. That means the check has to be that min has been set at all (that it isn't -1) rather than that it isn't 0.
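To make the difference concrete, here is a minimal sketch of just the post-loop step. It is not the Slurm source: fields_to_secs() and its use_fix flag are hypothetical names made up for illustration, the field values are passed in as they stand after the loop (-1 meaning "never set"), and I am assuming that unset fields count as zero and that seconds are zero after the shift.

```c
/*
 * Hypothetical helper, NOT the Slurm code: models only what happens after
 * the parsing loop. Each argument is a field value as the loop left it,
 * with -1 meaning "never set". use_fix selects the patched guard
 * (min != -1) instead of the original one (min != 0).
 */
static int fields_to_secs(int days, int hr, int min, int sec, int use_fix)
{
    int min_ok = use_fix ? (min != -1) : (min != 0);

    if ((days != -1) && (hr == -1) && min_ok) {
        /* format was "days-hr" or "days-hr:min": shift fields up one slot */
        hr = min;
        min = sec;
        sec = 0;    /* assumption: this form carries no seconds field */
    }

    /* assumption: any field still unset counts as zero */
    if (days == -1)
        days = 0;
    if (hr == -1)
        hr = 0;
    if (min == -1)
        min = 0;
    if (sec == -1)
        sec = 0;

    return (((days * 24 + hr) * 60) + min) * 60 + sec;
}
```

For 0-0:50 the loop leaves days=0, hr=-1, min=0, sec=50. Under those assumptions fields_to_secs(0, -1, 0, 50, 0) gives 50 (the 50 is treated as seconds), while fields_to_secs(0, -1, 0, 50, 1) gives 3000, i.e. 50 minutes. For 0-1:50 both guards give 6600, which matches Chris seeing that case work.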

How does that sound?

Thanks!
Mark


On 02/10/13 15:07, Christopher Samuel wrote:

Hi folks,

We just had a user complain that a job of theirs that requested:

--time=0-0:10

was being shown as only wanting 1 minute.

Testing with Slurm 2.6.1 (on x86) and 2.6.2 (on BG/Q) we see the same
behaviour where hours are 0, thus:

[samuel@avoca ~]$ salloc -p debug --time=0-0:50
salloc: Required node not available (down or drained)
salloc: Pending job allocation 485229
salloc: job 485229 queued and waiting for resources
^Csalloc: Job allocation 485229 has been revoked.
salloc: Job aborted due to signal

also results in a job only wanting 1 minute (as reported by scontrol
and sacct).

[samuel@avoca ~]$ sacct -j 485229 -o Timelimit
  Timelimit
----------
   00:01:00

However, if I make that 1 hour and 50 minutes:

[samuel@avoca ~]$ salloc -p debug --time=0-1:50
salloc: Required node not available (down or drained)
salloc: Pending job allocation 485230
salloc: job 485230 queued and waiting for resources
^Csalloc: Job allocation 485230 has been revoked.
salloc: Job aborted due to signal

then it appears correctly:

[samuel@avoca ~]$ sacct -j 485230 -o Timelimit
  Timelimit
----------
   01:50:00


I've had a look at the code for time_str2secs() in
src/common/parse_time.c but as a sysadmin rather than a programmer I'm
struggling to follow it, let alone see where the bug is. ;-)

I'm pretty sure this is a bug, but I'd appreciate knowing if I've
missed something!

All the best,
Chris
--
  Christopher Samuel        Senior Systems Administrator
  VLSCI - Victorian Life Sciences Computation Initiative
  Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
  http://www.vlsci.org.au/      http://twitter.com/vlsci

