Hi Carles,

Carles Fenoy <[email protected]> writes:

> Re: [slurm-dev] Re: NumNodes changed for requeued job? 
>
> Hi Loris,
>
> AFAIK for a job to be able to use more nodes than requested it should be:
>
> #SBATCH -N5-
>
> note the dash at the end. That specifies that the minimum number of
> nodes is 5 and there is no maximum.

Thanks for the information.  In our version of the man page for 2.2.7 it
says

,---------------------------------------------------------------------
| -N, --nodes=<minnodes[-maxnodes]>                                
|        Request that a minimum of minnodes nodes be allocated to this
|        job.  The scheduler may decide to launch the job on more than
|        minnodes nodes.
`---------------------------------------------------------------------

I see that on the website this now reads

,-------------------------------------------------------------------------
| -N, --nodes=<minnodes[-maxnodes]>                                       
|     Request that a minimum of minnodes nodes be allocated to this job. A
|     maximum node count may also be specified with maxnodes. If only one
|     number is specified, this is used as both the minimum and maximum
|     node count.
`-------------------------------------------------------------------------

So this bug in the documentation has obviously been fixed.
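
The newer wording would also explain the NumNodes=5-5 I saw for the
pending job. As a minimal sketch of that rule in shell (expand_nodes is
a made-up helper for illustration, not anything Slurm provides, and I
am only assuming it mirrors what the scheduler does internally):

```shell
#!/bin/sh
# Hypothetical helper (not part of Slurm): expand a --nodes value into
# "min-max" following the newer man page wording quoted above.
expand_nodes() {
    spec=$1
    case "$spec" in
        # A range such as 5-10: split at the first dash.
        *-*) min=${spec%%-*}; max=${spec#*-} ;;
        # A single number is used as both minimum and maximum.
        *)   min=$spec; max=$spec ;;
    esac
    printf '%s-%s\n' "$min" "$max"
}

expand_nodes 5      # single number, as in "#SBATCH -N 5"
expand_nodes 5-10   # explicit range, as in "#SBATCH -N 5-10"
```

So under the newer semantics, allowing the job to spread over more than
5 nodes would need an explicit range such as "#SBATCH -N 5-10", as
Seren suggested.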

Regards

Loris

> regards,
> Carles Fenoy
>
> On Fri, Oct 19, 2012 at 11:11 AM, Loris Bennett
> <[email protected]> wrote:
>
>     Seren Soner <[email protected]> writes:
>     
>     > You can normally request a range of node counts, e.g.
>     >
>     > #SBATCH -N 5-10
>     >
>     > Requesting an exact number of nodes is equivalent to setting the
>     > minimum and maximum number of nodes requested. After the job is
>     > scheduled, scontrol show job will convert the NumNodes field to the
>     > exact number of nodes that the job uses.
>     
>     
>     I realise that I was misinterpreting "Requeue=1" as meaning that the job
>     had been requeued once, rather than that requeueing is allowed.
>     
>     However, I still don't see why setting
>     
>     #SBATCH -N 5
>     
>     results in
>     
>     NumNodes=5-5
>     
>     for a job which is still pending. Shouldn't it still be possible to run
>     the job on more than 5 nodes, if the number of tasks can be satisfied
>     that way?
>     
>     Cheers,
>     
>     Loris
>     
>     
>     
>     
>     > On Fri, Oct 19, 2012 at 11:00 AM, Loris Bennett
>     > <[email protected]> wrote:
>     >
>     > Hi,
>     >
>     > I have noticed that for a job currently pending which was submitted with
>     >
>     > #SBATCH -N5
>     > #SBATCH -n12
>     >
>     > 'scontrol show job' gives
>     >
>     > Requeue=1 Restarts=0 BatchFlag=1 ExitCode=0:0
>     > ...
>     > NumNodes=5-5 NumCPUs=12 CPUs/Task=1 ReqS:C:T=*:*:*
>     >
>     > Is the change of the number of nodes from '5' to '5-5' a result of the
>     > requeueing and, if so, is this the desired behaviour?
>     >
>     > I'm using version 2.2.7.
>     >
>     > Cheers,
>     >
>     > Loris
>     >
>     > --
>     > Dr. Loris Bennett (Mr.)
>     > ZEDAT, Freie Universität Berlin         Email [email protected]
>     >
>     >
>     
>     --
>     Dr. Loris Bennett (Mr.)
>     ZEDAT, Freie Universität Berlin         Email [email protected]

-- 
Dr. Loris Bennett (Mr.)
ZEDAT, Freie Universität Berlin         Email [email protected]
