Thanks for your fast reply.
I tried 299 seconds and it works, then 301 seconds and it doesn't work.
I found out that max_switch_wait is one of the SchedulerParameters options in slurm.conf and, as you said, its default is 300 seconds.
Thank you for solving my problem.
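
For anyone who finds this thread later, here is a minimal sketch of the corresponding slurm.conf change. The value of 86400 seconds is only an illustration, taken from the 24 hours Antony mentions below. Any options already set in SchedulerParameters on your cluster would have to stay in the same comma-separated list. The controller presumably needs a reconfigure or restart before the new limit applies.

```
# slurm.conf (illustrative value, not a recommendation)
# max_switch_wait: maximum number of seconds a job may delay its start
# while waiting for the requested switch count; the default is 300.
SchedulerParameters=max_switch_wait=86400
```

With a limit like this, the job_desc.wait4switch=6000 set in the job submit plugin would stay below the maximum instead of being cut down to 5 minutes.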

Kind regards,
Danny


On 26.01.2016 at 21:00, Antony Cleave wrote:
Re: [slurm-dev] lua bug wait4switch

Try setting it to less than 5 minutes.

If that works then it might be the max_switch_wait setting for the cluster. I forget exactly where to set it now but I'm having a strong sense of deja vu reading this and I remember thinking that's a very short default when I changed it to 24 hours.

Antony

On 26 Jan 2016 10:08, "Danny Rotscher" <[email protected]> wrote:


    Hello,

    there seems to be a bug with the lua job submit plugin.
    I want to change the variable wait4switch to a specific value, but
    my changes are ignored by Slurm.
    The value is always 5 minutes.

    job_submit.lua.all:

    [...]
    function slurm_job_submit(job_desc, part_list, submit_uid)
        [...]
        job_desc.wait4switch=6000
        [...]

    scontrol output of the job:

    JobId=391 JobName=sleep
       UserId=rotscher(19426) GroupId=rotscher(19426)
       Priority=25291 Nice=0 Account=rotscher QOS=normal
       JobState=RUNNING Reason=None Dependency=(null)
       Requeue=1 Restarts=0 BatchFlag=0 Reboot=0 ExitCode=0:0
       RunTime=00:00:33 TimeLimit=08:00:00 TimeMin=N/A
       SubmitTime=2016-01-26T10:38:39 EligibleTime=2016-01-26T10:38:39
       StartTime=2016-01-26T10:38:39 EndTime=2016-01-26T18:38:39
       PreemptTime=None SuspendTime=None SecsPreSuspend=0
       Partition=sandy2 AllocNode:Sid=sandbox-login:13646
       ReqNodeList=(null) ExcNodeList=(null)
       NodeList=sandbox-node1
       BatchHost=sandbox-node1
       NumNodes=1 NumCPUs=4 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
       TRES=cpu=4,mem=40,node=1
       Socks/Node=* NtasksPerN:B:S:C=0:0:*:1 CoreSpec=*
       MinCPUsNode=1 MinMemoryCPU=10M MinTmpDiskNode=0
       Features=(null) Gres=(null) Reservation=(null)
       Shared=0 Contiguous=0 Licenses=(null) Network=(null)
       Command=sleep
       WorkDir=/tmp
       Switches=1@00:05:00

       Power= SICP=0

    I created some debug output which prints out the
    job_desc.wait4switch variable at the end of the job submit script.
    The output shows me the correct value.

    Thank you for any help!

    Kind regards,
    Danny


--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Danny Rotscher
HPC-Support

Technische Universität Dresden
Zentrum für Informationsdienste und Hochleistungsrechnen (ZIH)
01062 Dresden
Tel.: +49 351 463-35853
Fax : +49 351 463-37773
E-Mail: [email protected]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
