Re: [slurm-users] not allocating jobs even resources are free

2020-05-04 Thread navin srivastava
Thanks Daniel for the detailed description.

Regards
Navin

On Sun, May 3, 2020, 13:35 Daniel Letai  wrote:

>
> On 29/04/2020 12:00:13, navin srivastava wrote:
>
> Thanks Daniel.
>
> All jobs went into run state, so I am unable to provide the details, but I
> will definitely reach out later if we see a similar issue.
>
> I am more interested in understanding FIFO combined with Fair Tree. It would
> be good if anybody could provide some insight on this combination, and also
> how the behaviour will change if we enable backfilling here.
>
> What is the role of the Fair Tree here?
>
> Fair tree is the algorithm used to calculate the interim priority, before
> applying the weight, but I think after the half-life decay.
>
>
> To make it simple - FIFO without fairshare would assign priority based
> only on submission time. With fairshare, that naive priority is adjusted
> based on prior usage by the applicable entities (users/departments -
> accounts).
>
>
> Backfill will let you utilize your resources better, since it will allow
> "inserting" low priority jobs before higher priority jobs, provided all
> jobs have defined wall times, and any inserted job doesn't affect in any
> way the start time of a higher priority job, thus allowing utilization of
> "holes" when the scheduler waits for resources to free up, in order to
> insert some large job.
>
>
> Suppose the system is at 60% utilization of cores, and the next FIFO job
> requires 42% - it will wait until 2% more are free so it can begin, meanwhile
> not allowing any other job to start, even one that would take only 30% of the
> resources (which are currently free) and would finish before the 2% are free
> anyway.
>
> Backfill would allow such a job to start, as long as its wall time ensures
> it would finish before the 42% job would've started.
>
>
> Fair Tree in either case (FIFO or backfill) calculates the priority for
> each job the same way - if the account has used more resources recently (the
> half-life decay factor) it will get a lower priority even though it was
> submitted earlier than a job from an account that didn't use any resources
> recently.
>
>
> As can be expected, backfill has to loop over all jobs in the queue in
> order to see if any job can fit out of order. On very busy/active systems,
> that can lead to poor response times unless tuned correctly in slurm.conf
> - look at SchedulerParameters, all parameters starting with bf_, and in
> particular bf_max_job_test=, bf_max_time= and bf_continue (but bf_window=
> can also have some impact if set too high).
>
> see the man page at
> https://slurm.schedmd.com/slurm.conf.html#OPT_SchedulerParameters
>
>
> PriorityType=priority/multifactor
> PriorityDecayHalfLife=2
> PriorityUsageResetPeriod=DAILY
> PriorityWeightFairshare=50
> PriorityFlags=FAIR_TREE
>
> Regards
> Navin.
>
>
>
> On Mon, Apr 27, 2020 at 9:37 PM Daniel Letai  wrote:
>
>> Are you sure there are enough resources available? The node is in mixed
>> state, so it's configured for both partitions - it's possible that earlier
>> lower priority jobs are already running thus blocking the later jobs,
>> especially since it's fifo.
>>
>>
>> It would really help if you pasted the results of:
>>
>> squeue
>>
>> sinfo
>>
>>
>> As well as the exact sbatch line, so we can see how many resources per
>> node are requested.
>>
>>
>> On 26/04/2020 12:00:06, navin srivastava wrote:
>>
>> Thanks Brian,
>>
>> As suggested I went through the document, and what I understood is that Fair
>> Tree leads to the fairshare mechanism, and jobs should be scheduled based
>> on that.
>>
>> So it means job scheduling will be based on FIFO, but priority will be
>> decided by fairshare. I am not sure if the two conflict here. If I look at it,
>> the normal jobs' priority is lower than the GPUsmall priority, and resources
>> are available in the gpusmall partition, so those jobs should run. There is no
>> job pending due to GPU resources - the jobs do not even request GPUs.
>>
>> Is there any article where I can see how fairshare works and which settings
>> should not conflict with it?
>> According to the documentation it never says that if fair-share is applied
>> then FIFO should be disabled.
>>
>> Regards
>> Navin.
>>
>>
>>
>>
>>
>> On Sat, Apr 25, 2020 at 12:47 AM Brian W. Johanson 
>> wrote:
>>
>>>
>>> If you haven't looked at the man page for slurm.conf, it will answer
>>> most if not all your questions.
>>> https://slurm.schedmd.com/slurm.conf.html but I would depend on the
>>> manual version that was distributed with the version you have installed, as
>>> options do change.
>>>
>>> There is a ton of information that is tedious to get through but reading
>>> through it multiple times opens many doors.
>>>
>>> DefaultTime is listed in there as a Partition option.
>>> If you are scheduling gres/gpu resources, it's quite possible there are
>>> cores available with no corresponding gpus avail.
>>>
>>> -b
>>>
>>> On 4/24/20 2:49 PM, navin srivastava wrote:
>>>
>>> Thanks Brian.
>>>
>>> I need to check the jobs order.

Re: [slurm-users] not allocating jobs even resources are free

2020-05-03 Thread Daniel Letai

  
  

On 29/04/2020 12:00:13, navin srivastava wrote:

Thanks Daniel.

All jobs went into run state, so I am unable to provide the details, but I
will definitely reach out later if we see a similar issue.

I am more interested in understanding FIFO combined with Fair Tree. It would be
good if anybody could provide some insight on this combination, and also how
the behaviour will change if we enable backfilling here.

What is the role of the Fair Tree here?

  

Fair tree is the algorithm used to calculate the interim priority, before
applying the weight, but I think after the half-life decay.

To make it simple - FIFO without fairshare would assign priority based only on
submission time. With fairshare, that naive priority is adjusted based on
prior usage by the applicable entities (users/departments - accounts).

Backfill will let you utilize your resources better, since it allows
"inserting" low priority jobs before higher priority jobs, provided all jobs
have defined wall times and no inserted job affects in any way the start time
of a higher priority job, thus allowing utilization of "holes" while the
scheduler waits for resources to free up in order to start some large job.

Suppose the system is at 60% utilization of cores and the next FIFO job
requires 42% - it will wait until 2% more are free so it can begin, meanwhile
not allowing any other job to start, even one that would take only 30% of the
resources (which are currently free) and would finish before the 2% are free
anyway.
Backfill would allow such a job to start, as long as its wall time ensures it
would finish before the 42% job would have started.

Fair Tree in either case (FIFO or backfill) calculates the priority for each
job the same way - if the account has used more resources recently (the
half-life decay factor), its jobs get a lower priority even if they were
submitted earlier than a job from an account that didn't use any resources
recently.

As can be expected, backfill has to loop over all jobs in the queue to see
whether any job can fit out of order. On very busy/active systems that can
lead to poor response times unless tuned correctly in slurm.conf - look at
SchedulerParameters, all parameters starting with bf_, in particular
bf_max_job_test=, bf_max_time= and bf_continue (bf_window= can also have some
impact if set too high).

See the man page at
https://slurm.schedmd.com/slurm.conf.html#OPT_SchedulerParameters
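As a rough illustration (the parameter names are real SchedulerParameters
options, but the values below are arbitrary starting points to tune for your
own queue depth, not settings taken from this thread), the relevant slurm.conf
lines could look like:

SchedulerType=sched/backfill
SchedulerParameters=bf_continue,bf_max_job_test=1000,bf_max_time=300,bf_window=2880,bf_interval=60

As the slurm.conf man page notes, changing SchedulerType requires a restart of
slurmctld before it takes effect.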


  


PriorityType=priority/multifactor
PriorityDecayHalfLife=2
PriorityUsageResetPeriod=DAILY
PriorityWeightFairshare=50
PriorityFlags=FAIR_TREE



Regards

Navin.




  
  
  
On Mon, Apr 27, 2020 at 9:37 PM Daniel Letai wrote:

Are you sure there are enough resources available? The node is in mixed state,
so it's configured for both partitions - it's possible that earlier lower
priority jobs are already running, thus blocking the later jobs, especially
since it's FIFO.

It would really help if you pasted the results of:
squeue
sinfo

As well as the exact sbatch line, so we can see how many resources per node
are requested.

On 26/04/2020 12:00:06, navin srivastava wrote:

Thanks Brian,

As suggested I went through the document, and what I understood is that Fair
Tree leads to the fairshare mechanism, and jobs should be scheduled based on
that.

So it means job scheduling will be based on FIFO, but priority will be decided
by fairshare. I am not sure if the two conflict here. If I look at it, the
normal jobs' priority is lower than the GPUsmall priority, and resources are
available in the gpusmall partition, so those jobs should run. There is no job
pending due to GPU resources - the jobs do not even request GPUs.

Is there any article where I can see how fairshare works and which settings
should not conflict with it?

Re: [slurm-users] not allocating jobs even resources are free

2020-04-29 Thread Brian W. Johanson

Navin,
Check out 'sprio'; this will show you how the job priority changes
with the weight changes you are making.
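For example (an illustrative sketch; the exact columns depend on your Slurm
version and configured weights):

sprio -l                      # long format: per-job age, fair-share, partition and QOS factors
sprio -j 1291352,1291339      # compare the two job IDs shown elsewhere in this thread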

-b

On 4/29/20 5:00 AM, navin srivastava wrote:

Thanks Daniel.
All jobs went into run state, so I am unable to provide the details, but I
will definitely reach out later if we see a similar issue.

I am more interested in understanding FIFO combined with Fair Tree. It would be
good if anybody could provide some insight on this combination, and also how
the behaviour will change if we enable backfilling here.

What is the role of the Fair Tree here?

PriorityType=priority/multifactor
PriorityDecayHalfLife=2
PriorityUsageResetPeriod=DAILY
PriorityWeightFairshare=50
PriorityFlags=FAIR_TREE

Regards
Navin.



On Mon, Apr 27, 2020 at 9:37 PM Daniel Letai wrote:


Are you sure there are enough resources available? The node is in
mixed state, so it's configured for both partitions - it's
possible that earlier lower priority jobs are already running thus
blocking the later jobs, especially since it's fifo.


It would really help if you pasted the results of:

squeue

sinfo


As well as the exact sbatch line, so we can see how many resources
per node are requested.


On 26/04/2020 12:00:06, navin srivastava wrote:

Thanks Brian,

As suggested I went through the document, and what I understood is that
Fair Tree leads to the fairshare mechanism and jobs should be
scheduled based on that.

So it means job scheduling will be based on FIFO, but priority will
be decided by fairshare. I am not sure if the two conflict here. If I
look at it, the normal jobs' priority is lower than the GPUsmall
priority, and resources are available in the gpusmall partition, so
those jobs should run. There is no job pending due to GPU resources -
the jobs do not even request GPUs.

Is there any article where I can see how fairshare works and
which settings should not conflict with it?
According to the documentation it never says that if fair-share is applied
then FIFO should be disabled.

Regards
Navin.





On Sat, Apr 25, 2020 at 12:47 AM Brian W. Johanson wrote:


If you haven't looked at the man page for slurm.conf, it will
answer most if not all your questions.
https://slurm.schedmd.com/slurm.conf.html but I would depend
on the manual version that was distributed with the
version you have installed, as options do change.

There is a ton of information that is tedious to get through
but reading through it multiple times opens many doors.

DefaultTime is listed in there as a Partition option.
If you are scheduling gres/gpu resources, it's quite possible
there are cores available with no corresponding gpus avail.

-b

On 4/24/20 2:49 PM, navin srivastava wrote:

Thanks Brian.

I need  to check the jobs order.

Is there any way to define a default time limit for a job
if the user does not specify one?

Also, what is the meaning of fairtree in the priority settings in
the slurm.conf file?

The set of nodes is different in the partitions. FIFO does not
care about any partitioning.
Is it strict ordering, meaning the job that came first will go and
until it runs it will not allow others?

Also, priority is high for the gpusmall partition and low for
normal jobs, and the nodes of the normal partition are full
but gpusmall cores are available.

Regards
Navin

On Fri, Apr 24, 2020, 23:49 Brian W. Johanson wrote:

Without seeing the jobs in your queue, I would expect
the next job in FIFO order to be too large to fit in the
current idle resources.

Configure it to use the backfill scheduler:
SchedulerType=sched/backfill

  SchedulerType
  Identifies  the type of scheduler to be
used.  Note the slurmctld daemon must be restarted for a
change in scheduler type to become effective
(reconfiguring a running daemon has no effect for this
parameter).  The scontrol command can be used to
manually change job priorities if desired.  Acceptable
values include:

  sched/backfill
 For a backfill scheduling module to
augment the default FIFO scheduling.  Backfill
scheduling will initiate lower-priority jobs if doing so
does not delay the expected initiation time of any 
higher  priority  job. Effectiveness  of  backfill
scheduling is dependent upon users specifying job time
limits, otherwise all jobs will have the same time limit
and backfilling is impossible.

Re: [slurm-users] not allocating jobs even resources are free

2020-04-29 Thread navin srivastava
Thanks Daniel.

All jobs went into run state, so I am unable to provide the details, but I
will definitely reach out later if we see a similar issue.

I am more interested in understanding FIFO combined with Fair Tree. It would be
good if anybody could provide some insight on this combination, and also how
the behaviour will change if we enable backfilling here.

What is the role of the Fair Tree here?

PriorityType=priority/multifactor
PriorityDecayHalfLife=2
PriorityUsageResetPeriod=DAILY
PriorityWeightFairshare=50
PriorityFlags=FAIR_TREE

Regards
Navin.



On Mon, Apr 27, 2020 at 9:37 PM Daniel Letai  wrote:

> Are you sure there are enough resources available? The node is in mixed
> state, so it's configured for both partitions - it's possible that earlier
> lower priority jobs are already running thus blocking the later jobs,
> especially since it's fifo.
>
>
> It would really help if you pasted the results of:
>
> squeue
>
> sinfo
>
>
> As well as the exact sbatch line, so we can see how many resources per
> node are requested.
>
>
> On 26/04/2020 12:00:06, navin srivastava wrote:
>
> Thanks Brian,
>
> As suggested I went through the document, and what I understood is that Fair
> Tree leads to the fairshare mechanism, and jobs should be scheduled based on
> that.
>
> So it means job scheduling will be based on FIFO, but priority will be
> decided by fairshare. I am not sure if the two conflict here. If I look at it,
> the normal jobs' priority is lower than the GPUsmall priority, and resources
> are available in the gpusmall partition, so those jobs should run. There is no
> job pending due to GPU resources - the jobs do not even request GPUs.
>
> Is there any article where I can see how fairshare works and which settings
> should not conflict with it?
> According to the documentation it never says that if fair-share is applied
> then FIFO should be disabled.
>
> Regards
> Navin.
>
>
>
>
>
> On Sat, Apr 25, 2020 at 12:47 AM Brian W. Johanson 
> wrote:
>
>>
>> If you haven't looked at the man page for slurm.conf, it will answer most
>> if not all your questions.
>> https://slurm.schedmd.com/slurm.conf.html but I would depend on the
>> manual version that was distributed with the version you have installed, as
>> options do change.
>>
>> There is a ton of information that is tedious to get through but reading
>> through it multiple times opens many doors.
>>
>> DefaultTime is listed in there as a Partition option.
>> If you are scheduling gres/gpu resources, it's quite possible there are
>> cores available with no corresponding gpus avail.
>>
>> -b
>>
>> On 4/24/20 2:49 PM, navin srivastava wrote:
>>
>> Thanks Brian.
>>
>> I need  to check the jobs order.
>>
>> Is there any way to define a default time limit for a job if the user does
>> not specify one?
>>
>> Also, what is the meaning of fairtree in the priority settings in the
>> slurm.conf file?
>>
>> The set of nodes is different in the partitions. FIFO does not care about
>> any partitioning.
>> Is it strict ordering, meaning the job that came first will go and until it
>> runs it will not allow others?
>>
>> Also, priority is high for the gpusmall partition and low for normal jobs,
>> and the nodes of the normal partition are full but gpusmall cores are
>> available.
>>
>> Regards
>> Navin
>>
>> On Fri, Apr 24, 2020, 23:49 Brian W. Johanson  wrote:
>>
>>> Without seeing the jobs in your queue, I would expect the next job in
>>> FIFO order to be too large to fit in the current idle resources.
>>>
>>> Configure it to use the backfill scheduler: SchedulerType=sched/backfill
>>>
>>>   SchedulerType
>>>   Identifies  the type of scheduler to be used.  Note the
>>> slurmctld daemon must be restarted for a change in scheduler type to become
>>> effective (reconfiguring a running daemon has no effect for this
>>> parameter).  The scontrol command can be used to manually change job
>>> priorities if desired.  Acceptable values include:
>>>
>>>   sched/backfill
>>>  For a backfill scheduling module to augment the
>>> default FIFO scheduling.  Backfill scheduling will initiate lower-priority
>>> jobs if doing so does not delay the expected initiation time of any
>>> higher  priority  job.   Effectiveness  of  backfill scheduling is
>>> dependent upon users specifying job time limits, otherwise all jobs will
>>> have the same time limit and backfilling is impossible.  Note documentation
>>> for the SchedulerParameters option above.  This is the default
>>> configuration.
>>>
>>>   sched/builtin
>>>  This  is  the  FIFO scheduler which initiates jobs
>>> in priority order.  If any job in the partition can not be scheduled, no
>>> lower priority job in that partition will be scheduled.  An exception is
>>> made for jobs that can not run due to partition constraints (e.g. the time
>>> limit) or down/drained nodes.  In that case, lower priority jobs can be
>>> initiated and not impact the higher priority job.
>>>
>>>
>>>
>>> 

Re: [slurm-users] not allocating jobs even resources are free

2020-04-27 Thread Daniel Letai

  
  
Are you sure there are enough resources available? The node is in mixed state,
so it's configured for both partitions - it's possible that earlier lower
priority jobs are already running, thus blocking the later jobs, especially
since it's FIFO.

It would really help if you pasted the results of:
squeue
sinfo

As well as the exact sbatch line, so we can see how many resources per node
are requested.
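For instance, something along these lines (an illustrative sketch; the format
strings are optional and the partition name is taken from the original
report) would show the pending reasons and per-node CPU/GRES availability:

squeue -p GPUsmall -t PD -o "%.10i %.9P %.8u %.2t %.12r %S"   # pending jobs with their pending reason
sinfo -p GPUsmall -N -o "%N %C %G %t"                         # per node: CPUs (alloc/idle/other/total), GRES, state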



On 26/04/2020 12:00:06, navin srivastava wrote:

Thanks Brian,

As suggested I went through the document, and what I understood is that Fair
Tree leads to the fairshare mechanism, and jobs should be scheduled based on
that.

So it means job scheduling will be based on FIFO, but priority will be decided
by fairshare. I am not sure if the two conflict here. If I look at it, the
normal jobs' priority is lower than the GPUsmall priority, and resources are
available in the gpusmall partition, so those jobs should run. There is no job
pending due to GPU resources - the jobs do not even request GPUs.

Is there any article where I can see how fairshare works and which settings
should not conflict with it?
According to the documentation it never says that if fair-share is applied
then FIFO should be disabled.

Regards
Navin.








  
  
  
On Sat, Apr 25, 2020 at 12:47 AM Brian W. Johanson wrote:

If you haven't looked at the man page for slurm.conf, it will answer most if
not all of your questions.
https://slurm.schedmd.com/slurm.conf.html but I would depend on the manual
version that was distributed with the version you have installed, as options
do change.

There is a ton of information that is tedious to get through, but reading
through it multiple times opens many doors.

DefaultTime is listed in there as a Partition option.
If you are scheduling gres/gpu resources, it's quite possible there are cores
available with no corresponding GPUs available.

-b

On 4/24/20 2:49 PM, navin srivastava wrote:

Thanks Brian.

I need to check the jobs order.

Is there any way to define a default time limit for a job if the user does
not specify one?

Also, what is the meaning of fairtree in the priority settings in the
slurm.conf file?

The set of nodes is different in the partitions. FIFO does not care about any
partitioning.
Is it strict ordering, meaning the job that came first will go and until it
runs it will not allow others?

Also, priority is high for the gpusmall partition and low for normal jobs,
and the nodes of the normal partition are full but gpusmall cores are
available.

Regards
Navin

On Fri, Apr 24, 2020, 23:49 Brian W. Johanson wrote:

Without seeing the jobs in your queue, I would expect the next job in FIFO
order to be too large to fit in the current idle resources.

Configure it to use the backfill scheduler: SchedulerType=sched/backfill

  SchedulerType
  Identifies the type of scheduler to be used. Note the slurmctld daemon must
  be restarted for a change in scheduler type to become effective
  (reconfiguring a running daemon has no effect for this parameter). The
  scontrol command can be used to manually change job priorities if desired.
  Acceptable values include:

  sched/backfill

Re: [slurm-users] not allocating jobs even resources are free

2020-04-26 Thread navin srivastava
Thanks Brian,

As suggested I went through the document, and what I understood is that Fair
Tree leads to the fairshare mechanism, and jobs should be scheduled based on
that.

So it means job scheduling will be based on FIFO, but priority will be decided
by fairshare. I am not sure if the two conflict here. If I look at it, the
normal jobs' priority is lower than the GPUsmall priority, and resources are
available in the gpusmall partition, so those jobs should run. There is no job
pending due to GPU resources - the jobs do not even request GPUs.

Is there any article where I can see how fairshare works and which settings
should not conflict with it?
According to the documentation it never says that if fair-share is applied then
FIFO should be disabled.

Regards
Navin.
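The fairshare behaviour asked about above is documented in the SchedMD pages
on the multifactor priority plugin and Fair Tree, and the fairshare values
actually in use can be listed with sshare; a minimal sketch:

https://slurm.schedmd.com/priority_multifactor.html
https://slurm.schedmd.com/fair_tree.html
sshare -a -l    # all associations with raw usage and the resulting fairshare factor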





On Sat, Apr 25, 2020 at 12:47 AM Brian W. Johanson  wrote:

>
> If you haven't looked at the man page for slurm.conf, it will answer most
> if not all your questions.
> https://slurm.schedmd.com/slurm.conf.html but I would depend on the
> manual version that was distributed with the version you have installed, as
> options do change.
>
> There is a ton of information that is tedious to get through but reading
> through it multiple times opens many doors.
>
> DefaultTime is listed in there as a Partition option.
> If you are scheduling gres/gpu resources, it's quite possible there are
> cores available with no corresponding gpus avail.
>
> -b
>
> On 4/24/20 2:49 PM, navin srivastava wrote:
>
> Thanks Brian.
>
> I need  to check the jobs order.
>
> Is there any way to define a default time limit for a job if the user does
> not specify one?
>
> Also, what is the meaning of fairtree in the priority settings in the
> slurm.conf file?
>
> The set of nodes is different in the partitions. FIFO does not care about any
> partitioning.
> Is it strict ordering, meaning the job that came first will go and until it
> runs it will not allow others?
>
> Also, priority is high for the gpusmall partition and low for normal jobs, and
> the nodes of the normal partition are full but gpusmall cores are available.
>
> Regards
> Navin
>
> On Fri, Apr 24, 2020, 23:49 Brian W. Johanson  wrote:
>
>> Without seeing the jobs in your queue, I would expect the next job in
>> FIFO order to be too large to fit in the current idle resources.
>>
>> Configure it to use the backfill scheduler: SchedulerType=sched/backfill
>>
>>   SchedulerType
>>   Identifies  the type of scheduler to be used.  Note the
>> slurmctld daemon must be restarted for a change in scheduler type to become
>> effective (reconfiguring a running daemon has no effect for this
>> parameter).  The scontrol command can be used to manually change job
>> priorities if desired.  Acceptable values include:
>>
>>   sched/backfill
>>  For a backfill scheduling module to augment the
>> default FIFO scheduling.  Backfill scheduling will initiate lower-priority
>> jobs if doing so does not delay the expected initiation time of any
>> higher  priority  job.   Effectiveness  of  backfill scheduling is
>> dependent upon users specifying job time limits, otherwise all jobs will
>> have the same time limit and backfilling is impossible.  Note documentation
>> for the SchedulerParameters option above.  This is the default
>> configuration.
>>
>>   sched/builtin
>>  This  is  the  FIFO scheduler which initiates jobs
>> in priority order.  If any job in the partition can not be scheduled, no
>> lower priority job in that partition will be scheduled.  An exception is
>> made for jobs that can not run due to partition constraints (e.g. the time
>> limit) or down/drained nodes.  In that case, lower priority jobs can be
>> initiated and not impact the higher priority job.
>>
>>
>>
>> Your partitions are set with maxtime=INFINITE, if your users are not
>> specifying a reasonable timelimit to their jobs, this won't help either.
>>
>>
>> -b
>>
>>
>> On 4/24/20 1:52 PM, navin srivastava wrote:
>>
>> In addition to the above when i see the sprio of both the jobs it says :-
>>
>> for normal queue jobs all jobs showing the same priority
>>
>>  JOBID PARTITION   PRIORITY  FAIRSHARE
>> 1291352 normal   15789  15789
>>
>> for GPUsmall all jobs showing the same priority.
>>
>>  JOBID PARTITION   PRIORITY  FAIRSHARE
>> 1291339 GPUsmall  21052  21053
>>
>> On Fri, Apr 24, 2020 at 11:14 PM navin srivastava 
>> wrote:
>>
>>> Hi Team,
>>>
>>> we are facing some issue in our environment. The resources are free but
>>> job is going into the QUEUE state but not running.
>>>
>>> i have attached the slurm.conf file here.
>>>
>>> scenario:-
>>>
>>> There are job only in the 2 partitions:
>>>  344 jobs are in PD state in normal partition and the node belongs
>>> from the normal partitions are full and no more job can run.
>>>
>>> 1300 JOBS are in GPUsmall partition are in queue and enough CPU is
>>> avaiable to execute the jobs but i see the jobs are not 

Re: [slurm-users] not allocating jobs even resources are free

2020-04-24 Thread Brian W. Johanson


If you haven't looked at the man page for slurm.conf, it will answer 
most if not all your questions.
https://slurm.schedmd.com/slurm.conf.html but I would depend on the
manual version that was distributed with the version you have installed,
as options do change.


There is a ton of information that is tedious to get through but reading 
through it multiple times opens many doors.


DefaultTime is listed in there as a Partition option.
If you are scheduling gres/gpu resources, it's quite possible there are 
cores available with no corresponding gpus avail.
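For example, a partition definition along these lines (the node list here is
only a guess based on the nodes shown elsewhere in the thread) gives jobs that
omit --time a 2-hour default while capping them at 3 days, which also gives
backfill something to plan with:

PartitionName=GPUsmall Nodes=node[18-19] MaxTime=3-00:00:00 DefaultTime=02:00:00 State=UP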


-b

On 4/24/20 2:49 PM, navin srivastava wrote:

Thanks Brian.

I need to check the jobs order.

Is there any way to define a default time limit for a job if the user
does not specify one?

Also, what is the meaning of fairtree in the priority settings in the
slurm.conf file?

The set of nodes is different in the partitions. FIFO does not care about
any partitioning.
Is it strict ordering, meaning the job that came first will go and until it
runs it will not allow others?

Also, priority is high for the gpusmall partition and low for normal jobs,
and the nodes of the normal partition are full but gpusmall cores are
available.


Regards
Navin

On Fri, Apr 24, 2020, 23:49 Brian W. Johanson wrote:


Without seeing the jobs in your queue, I would expect the next job
in FIFO order to be too large to fit in the current idle resources.

Configure it to use the backfill scheduler:
SchedulerType=sched/backfill

  SchedulerType
  Identifies  the type of scheduler to be used.  Note
the slurmctld daemon must be restarted for a change in scheduler
type to become effective (reconfiguring a running daemon has no
effect for this parameter).  The scontrol command can be used to
manually change job priorities if desired.  Acceptable values include:

  sched/backfill
 For a backfill scheduling module to augment
the default FIFO scheduling.  Backfill scheduling will initiate
lower-priority jobs if doing so does not delay the expected
initiation time of any  higher priority  job.   Effectiveness  of 
backfill scheduling is dependent upon users specifying job time
limits, otherwise all jobs will have the same time limit and
backfilling is impossible.  Note documentation for the
SchedulerParameters option above.  This is the default configuration.

  sched/builtin
 This  is  the  FIFO scheduler which initiates
jobs in priority order.  If any job in the partition can not be
scheduled, no lower priority job in that partition will be
scheduled.  An exception is made for jobs that can not run due to
partition constraints (e.g. the time limit) or down/drained
nodes.  In that case, lower priority jobs can be initiated and not
impact the higher priority job.



Your partitions are set with maxtime=INFINITE, if your users are
not specifying a reasonable timelimit to their jobs, this won't
help either.


-b


On 4/24/20 1:52 PM, navin srivastava wrote:

In addition to the above when i see the sprio of both the jobs it
says :-

for normal queue jobs all jobs showing the same priority

 JOBID PARTITION   PRIORITY  FAIRSHARE
        1291352 normal           15789      15789

for GPUsmall all jobs showing the same priority.

 JOBID PARTITION   PRIORITY  FAIRSHARE
        1291339 GPUsmall      21052      21053

On Fri, Apr 24, 2020 at 11:14 PM navin srivastava wrote:

Hi Team,

we are facing some issue in our environment. The resources
are free but job is going into the QUEUE state but not running.

i have attached the slurm.conf file here.

scenario:-

There are job only in the 2 partitions:
 344 jobs are in PD state in normal partition and the node
belongs from the normal partitions are full and no more job
can run.

1300 JOBS are in GPUsmall partition are in queue and enough
CPU is avaiable to execute the jobs but i see the jobs are
not scheduling on free nodes.

Rest there are no pend jobs in any other partition .
eg:-
node status:- node18

NodeName=node18 Arch=x86_64 CoresPerSocket=18
   CPUAlloc=6 CPUErr=0 CPUTot=36 CPULoad=4.07
   AvailableFeatures=K2200
   ActiveFeatures=K2200
   Gres=gpu:2
   NodeAddr=node18 NodeHostName=node18 Version=17.11
   OS=Linux 4.4.140-94.42-default #1 SMP Tue Jul 17 07:44:50
UTC 2018 (0b375e4)
   RealMemory=1 AllocMem=0 FreeMem=79532 Sockets=2 Boards=1
   State=MIXED ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A
MCS_label=N/A
   Partitions=GPUsmall,pm_shared
   BootTime=2019-12-10T14:16:37
SlurmdStartTime=2019-12-10T14:24:08
  

Re: [slurm-users] not allocating jobs even resources are free

2020-04-24 Thread navin srivastava
Thanks Brian.

I need to check the jobs order.

Is there any way to define a default time limit for a job if the user does not
specify one?
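For context, a limit can always be passed explicitly per job, e.g.

sbatch --time=01:00:00 job.sh    # job.sh is just a placeholder script name

and when --time is omitted, Slurm falls back to the partition's DefaultTime
(or to its MaxTime if DefaultTime is not set).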

Also, what is the meaning of fairtree in the priority settings in the
slurm.conf file?

The set of nodes is different in the partitions. FIFO does not care about any
partitioning.
Is it strict ordering, meaning the job that came first will go and until it
runs it will not allow others?

Also, priority is high for the gpusmall partition and low for normal jobs, and
the nodes of the normal partition are full but gpusmall cores are available.

Regards
Navin

On Fri, Apr 24, 2020, 23:49 Brian W. Johanson  wrote:

> Without seeing the jobs in your queue, I would expect the next job in FIFO
> order to be too large to fit in the current idle resources.
>
> Configure it to use the backfill scheduler: SchedulerType=sched/backfill
>
>   SchedulerType
>   Identifies  the type of scheduler to be used.  Note the
> slurmctld daemon must be restarted for a change in scheduler type to become
> effective (reconfiguring a running daemon has no effect for this
> parameter).  The scontrol command can be used to manually change job
> priorities if desired.  Acceptable values include:
>
>   sched/backfill
>  For a backfill scheduling module to augment the
> default FIFO scheduling.  Backfill scheduling will initiate lower-priority
> jobs if doing so does not delay the expected initiation time of any
> higher  priority  job.   Effectiveness  of  backfill scheduling is
> dependent upon users specifying job time limits, otherwise all jobs will
> have the same time limit and backfilling is impossible.  Note documentation
> for the SchedulerParameters option above.  This is the default
> configuration.
>
>   sched/builtin
>  This  is  the  FIFO scheduler which initiates jobs in
> priority order.  If any job in the partition can not be scheduled, no lower
> priority job in that partition will be scheduled.  An exception is made for
> jobs that can not run due to partition constraints (e.g. the time limit) or
> down/drained nodes.  In that case, lower priority jobs can be initiated and
> not impact the higher priority job.
>
>
>
> Your partitions are set with maxtime=INFINITE, if your users are not
> specifying a reasonable timelimit to their jobs, this won't help either.
>
>
> -b
>
>
> On 4/24/20 1:52 PM, navin srivastava wrote:
>
> In addition to the above when i see the sprio of both the jobs it says :-
>
> for normal queue jobs all jobs showing the same priority
>
>  JOBID PARTITION   PRIORITY  FAIRSHARE
> 1291352 normal   15789  15789
>
> for GPUsmall all jobs showing the same priority.
>
>  JOBID PARTITION   PRIORITY  FAIRSHARE
> 1291339 GPUsmall  21052  21053
>
> On Fri, Apr 24, 2020 at 11:14 PM navin srivastava 
> wrote:
>
>> Hi Team,
>>
>> we are facing some issue in our environment. The resources are free but
>> job is going into the QUEUE state but not running.
>>
>> i have attached the slurm.conf file here.
>>
>> scenario:-
>>
>> There are job only in the 2 partitions:
>>  344 jobs are in PD state in normal partition and the node belongs
>> from the normal partitions are full and no more job can run.
>>
>> 1300 JOBS are in GPUsmall partition are in queue and enough CPU is
>> avaiable to execute the jobs but i see the jobs are not scheduling on free
>> nodes.
>>
>> Rest there are no pend jobs in any other partition .
>> eg:-
>> node status:- node18
>>
>> NodeName=node18 Arch=x86_64 CoresPerSocket=18
>>CPUAlloc=6 CPUErr=0 CPUTot=36 CPULoad=4.07
>>AvailableFeatures=K2200
>>ActiveFeatures=K2200
>>Gres=gpu:2
>>NodeAddr=node18 NodeHostName=node18 Version=17.11
>>OS=Linux 4.4.140-94.42-default #1 SMP Tue Jul 17 07:44:50 UTC 2018
>> (0b375e4)
>>RealMemory=1 AllocMem=0 FreeMem=79532 Sockets=2 Boards=1
>>State=MIXED ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
>>Partitions=GPUsmall,pm_shared
>>BootTime=2019-12-10T14:16:37 SlurmdStartTime=2019-12-10T14:24:08
>>CfgTRES=cpu=36,mem=1M,billing=36
>>AllocTRES=cpu=6
>>CapWatts=n/a
>>CurrentWatts=0 LowestJoules=0 ConsumedJoules=0
>>ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
>>
>> node19:-
>>
>> NodeName=node19 Arch=x86_64 CoresPerSocket=18
>>CPUAlloc=16 CPUErr=0 CPUTot=36 CPULoad=15.43
>>AvailableFeatures=K2200
>>ActiveFeatures=K2200
>>Gres=gpu:2
>>NodeAddr=node19 NodeHostName=node19 Version=17.11
>>OS=Linux 4.12.14-94.41-default #1 SMP Wed Oct 31 12:25:04 UTC 2018
>> (3090901)
>>RealMemory=1 AllocMem=0 FreeMem=63998 Sockets=2 Boards=1
>>State=MIXED ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
>>Partitions=GPUsmall,pm_shared
>>BootTime=2020-03-12T06:51:54 SlurmdStartTime=2020-03-12T06:53:14
>>CfgTRES=cpu=36,mem=1M,billing=36
>>AllocTRES=cpu=16
>>CapWatts=n/a
>>CurrentWatts=0 LowestJoules=0 

Re: [slurm-users] not allocating jobs even resources are free

2020-04-24 Thread Brian W. Johanson
Without seeing the jobs in your queue, I would expect the next job in 
FIFO order to be too large to fit in the current idle resources.


Configure it to use the backfill scheduler: SchedulerType=sched/backfill

  SchedulerType
  Identifies  the type of scheduler to be used.  Note the 
slurmctld daemon must be restarted for a change in scheduler type to 
become effective (reconfiguring a running daemon has no effect for this 
parameter).  The scontrol command can be used to manually change job 
priorities if desired.  Acceptable values include:


  sched/backfill
 For a backfill scheduling module to augment the 
default FIFO scheduling.  Backfill scheduling will initiate 
lower-priority jobs if doing so does not delay the expected initiation 
time of any  higher  priority  job.   Effectiveness of  backfill 
scheduling is dependent upon users specifying job time limits, otherwise 
all jobs will have the same time limit and backfilling is impossible.  
Note documentation for the SchedulerParameters option above.  This is 
the default configuration.


  sched/builtin
 This  is  the  FIFO scheduler which initiates jobs 
in priority order.  If any job in the partition can not be scheduled, no 
lower priority job in that partition will be scheduled.  An exception is 
made for jobs that can not run due to partition constraints (e.g. the 
time limit) or down/drained nodes.  In that case, lower priority jobs 
can be initiated and not impact the higher priority job.




Your partitions are set with maxtime=INFINITE; if your users are not
specifying a reasonable time limit for their jobs, this won't help either.
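Concretely, relative to the slurm.conf attached at the end of this thread,
that would mean changing

SchedulerType=sched/builtin

to

SchedulerType=sched/backfill

followed by a restart of slurmctld, since (as noted above) this parameter is
not picked up by a simple reconfigure.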



-b


On 4/24/20 1:52 PM, navin srivastava wrote:

In addition to the above when i see the sprio of both the jobs it says :-

for normal queue jobs all jobs showing the same priority

 JOBID PARTITION   PRIORITY  FAIRSHARE
        1291352 normal           15789      15789

for GPUsmall all jobs showing the same priority.

 JOBID PARTITION   PRIORITY  FAIRSHARE
        1291339 GPUsmall      21052      21053

On Fri, Apr 24, 2020 at 11:14 PM navin srivastava wrote:


Hi Team,

we are facing some issue in our environment. The resources are
free but job is going into the QUEUE state but not running.

i have attached the slurm.conf file here.

scenario:-

There are job only in the 2 partitions:
 344 jobs are in PD state in normal partition and the node belongs
from the normal partitions are full and no more job can run.

1300 JOBS are in GPUsmall partition are in queue and enough CPU is
avaiable to execute the jobs but i see the jobs are not
scheduling on free nodes.

Rest there are no pend jobs in any other partition .
eg:-
node status:- node18

NodeName=node18 Arch=x86_64 CoresPerSocket=18
   CPUAlloc=6 CPUErr=0 CPUTot=36 CPULoad=4.07
   AvailableFeatures=K2200
   ActiveFeatures=K2200
   Gres=gpu:2
   NodeAddr=node18 NodeHostName=node18 Version=17.11
   OS=Linux 4.4.140-94.42-default #1 SMP Tue Jul 17 07:44:50 UTC
2018 (0b375e4)
   RealMemory=1 AllocMem=0 FreeMem=79532 Sockets=2 Boards=1
   State=MIXED ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A
MCS_label=N/A
   Partitions=GPUsmall,pm_shared
   BootTime=2019-12-10T14:16:37 SlurmdStartTime=2019-12-10T14:24:08
   CfgTRES=cpu=36,mem=1M,billing=36
   AllocTRES=cpu=6
   CapWatts=n/a
   CurrentWatts=0 LowestJoules=0 ConsumedJoules=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s

node19:-

NodeName=node19 Arch=x86_64 CoresPerSocket=18
   CPUAlloc=16 CPUErr=0 CPUTot=36 CPULoad=15.43
   AvailableFeatures=K2200
   ActiveFeatures=K2200
   Gres=gpu:2
   NodeAddr=node19 NodeHostName=node19 Version=17.11
   OS=Linux 4.12.14-94.41-default #1 SMP Wed Oct 31 12:25:04 UTC
2018 (3090901)
   RealMemory=1 AllocMem=0 FreeMem=63998 Sockets=2 Boards=1
   State=MIXED ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A
MCS_label=N/A
   Partitions=GPUsmall,pm_shared
   BootTime=2020-03-12T06:51:54 SlurmdStartTime=2020-03-12T06:53:14
   CfgTRES=cpu=36,mem=1M,billing=36
   AllocTRES=cpu=16
   CapWatts=n/a
   CurrentWatts=0 LowestJoules=0 ConsumedJoules=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s

could you please help me to understand what could be the reason?













Re: [slurm-users] not allocating jobs even resources are free

2020-04-24 Thread navin srivastava
In addition to the above, when I look at the sprio output for both partitions it shows:

for normal queue jobs, all jobs show the same priority

 JOBID PARTITION   PRIORITY  FAIRSHARE
1291352 normal   15789  15789

for GPUsmall, all jobs show the same priority.

 JOBID PARTITION   PRIORITY  FAIRSHARE
1291339 GPUsmall  21052  21053
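A quick way to see which weights produce these numbers (output format varies
by Slurm version) is:

sprio -w    # print the priority weights currently configured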

On Fri, Apr 24, 2020 at 11:14 PM navin srivastava 
wrote:

> Hi Team,
>
> we are facing some issue in our environment. The resources are free but
> job is going into the QUEUE state but not running.
>
> i have attached the slurm.conf file here.
>
> scenario:-
>
> There are job only in the 2 partitions:
>  344 jobs are in PD state in normal partition and the node belongs
> from the normal partitions are full and no more job can run.
>
> 1300 JOBS are in GPUsmall partition are in queue and enough CPU is
> avaiable to execute the jobs but i see the jobs are not scheduling on free
> nodes.
>
> Rest there are no pend jobs in any other partition .
> eg:-
> node status:- node18
>
> NodeName=node18 Arch=x86_64 CoresPerSocket=18
>CPUAlloc=6 CPUErr=0 CPUTot=36 CPULoad=4.07
>AvailableFeatures=K2200
>ActiveFeatures=K2200
>Gres=gpu:2
>NodeAddr=node18 NodeHostName=node18 Version=17.11
>OS=Linux 4.4.140-94.42-default #1 SMP Tue Jul 17 07:44:50 UTC 2018
> (0b375e4)
>RealMemory=1 AllocMem=0 FreeMem=79532 Sockets=2 Boards=1
>State=MIXED ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
>Partitions=GPUsmall,pm_shared
>BootTime=2019-12-10T14:16:37 SlurmdStartTime=2019-12-10T14:24:08
>CfgTRES=cpu=36,mem=1M,billing=36
>AllocTRES=cpu=6
>CapWatts=n/a
>CurrentWatts=0 LowestJoules=0 ConsumedJoules=0
>ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
>
> node19:-
>
> NodeName=node19 Arch=x86_64 CoresPerSocket=18
>CPUAlloc=16 CPUErr=0 CPUTot=36 CPULoad=15.43
>AvailableFeatures=K2200
>ActiveFeatures=K2200
>Gres=gpu:2
>NodeAddr=node19 NodeHostName=node19 Version=17.11
>OS=Linux 4.12.14-94.41-default #1 SMP Wed Oct 31 12:25:04 UTC 2018
> (3090901)
>RealMemory=1 AllocMem=0 FreeMem=63998 Sockets=2 Boards=1
>State=MIXED ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
>Partitions=GPUsmall,pm_shared
>BootTime=2020-03-12T06:51:54 SlurmdStartTime=2020-03-12T06:53:14
>CfgTRES=cpu=36,mem=1M,billing=36
>AllocTRES=cpu=16
>CapWatts=n/a
>CurrentWatts=0 LowestJoules=0 ConsumedJoules=0
>ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
>
> could you please help me to understand what could be the reason?
>
>
>
>
>
>
>
>
>
>


[slurm-users] not allocating jobs even resources are free

2020-04-24 Thread navin srivastava
Hi Team,

We are facing an issue in our environment: resources are free, but jobs
are going into the queued (pending) state and not running.

I have attached the slurm.conf file here.

Scenario:

There are jobs only in 2 partitions:
344 jobs are in PD state in the normal partition; the nodes belonging to the
normal partition are full and no more jobs can run there.

1300 jobs in the GPUsmall partition are in the queue, and enough CPU is
available to execute them, but I see the jobs are not being scheduled on free
nodes.

There are no pending jobs in any other partition.
eg:-
node status:- node18

NodeName=node18 Arch=x86_64 CoresPerSocket=18
   CPUAlloc=6 CPUErr=0 CPUTot=36 CPULoad=4.07
   AvailableFeatures=K2200
   ActiveFeatures=K2200
   Gres=gpu:2
   NodeAddr=node18 NodeHostName=node18 Version=17.11
   OS=Linux 4.4.140-94.42-default #1 SMP Tue Jul 17 07:44:50 UTC 2018
(0b375e4)
   RealMemory=1 AllocMem=0 FreeMem=79532 Sockets=2 Boards=1
   State=MIXED ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=GPUsmall,pm_shared
   BootTime=2019-12-10T14:16:37 SlurmdStartTime=2019-12-10T14:24:08
   CfgTRES=cpu=36,mem=1M,billing=36
   AllocTRES=cpu=6
   CapWatts=n/a
   CurrentWatts=0 LowestJoules=0 ConsumedJoules=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s

node19:-

NodeName=node19 Arch=x86_64 CoresPerSocket=18
   CPUAlloc=16 CPUErr=0 CPUTot=36 CPULoad=15.43
   AvailableFeatures=K2200
   ActiveFeatures=K2200
   Gres=gpu:2
   NodeAddr=node19 NodeHostName=node19 Version=17.11
   OS=Linux 4.12.14-94.41-default #1 SMP Wed Oct 31 12:25:04 UTC 2018
(3090901)
   RealMemory=1 AllocMem=0 FreeMem=63998 Sockets=2 Boards=1
   State=MIXED ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=GPUsmall,pm_shared
   BootTime=2020-03-12T06:51:54 SlurmdStartTime=2020-03-12T06:53:14
   CfgTRES=cpu=36,mem=1M,billing=36
   AllocTRES=cpu=16
   CapWatts=n/a
   CurrentWatts=0 LowestJoules=0 ConsumedJoules=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s

could you please help me to understand what could be the reason?
 cat /etc/slurm/slurm.conf
# slurm.conf file generated by configurator.html.
# Put this file on all nodes of your cluster.
# See the slurm.conf man page for more information.
#
#Running_config_start
#ControlMachine=node0
ControlMachine=slurmmaster
ControlAddr=192.168.150.21
AuthType=auth/munge
CryptoType=crypto/munge
CacheGroups=1
ReturnToService=0
ProctrackType=proctrack/linuxproc
SlurmctldPort=6817
SlurmdPort=6818
SchedulerPort=7321
SlurmctldPidFile=/var/slurm/slurmctld.pid
SlurmdPidFile=/var/slurm/slurmd.pid
SlurmdSpoolDir=/var/slurm/spool/slurmd.%n.spool
StateSaveLocation=/var/slurm/state
SlurmctldLogFile=/var/slurm/log/slurmctld.log
SlurmdLogFile=/var/slurm/log/slurmd.%n.log.%h
SlurmUser=hpcadmin
MpiDefault=none
SwitchType=switch/none
TaskPlugin=task/affinity
TaskPluginParam=Sched
SlurmctldTimeout=120
SlurmdTimeout=300
InactiveLimit=0
KillWait=30
MinJobAge=3600
FastSchedule=1
SchedulerType=sched/builtin
#SchedulerParameters=enable_user_top
SelectType=select/cons_res
#SelectTypeParameters=CR_Core_Memory
SelectTypeParameters=CR_Core
AccountingStorageEnforce=associations
AccountingStorageHost=155.250.126.30
AccountingStorageType=accounting_storage/slurmdbd
#AccountingStoreJobComment=YES
ClusterName=merckhpc
JobCompType=jobcomp/slurmdbd
JobAcctGatherFrequency=30
JobAcctGatherType=jobacct_gather/linux
SlurmctldDebug=5
SlurmdDebug=5
Waittime=0
#Running_config_end
#ControlAddr=
#BackupController=
#BackupAddr=
#
#CheckpointType=checkpoint/none
#DisableRootJobs=NO
#EnforcePartLimits=NO
Epilog=/etc/slurm/slurm.epilog.clean
#EpilogSlurmctld=
#FirstJobId=1
#MaxJobId=99
GresTypes=gpu
#GroupUpdateForce=0
#GroupUpdateTime=600
#JobCheckpointDir=/var/slurm/checkpoint
#JobCredentialPrivateKey=
#JobCredentialPublicCertificate=
#JobFileAppend=0
#JobRequeue=1
#JobSubmitPlugins=1
#KillOnBadExit=0
#Licenses=foo*4,bar
#MailProg=/bin/mail
#MaxJobCount=5000
MaxJobCount=500
#MaxStepCount=4
#MaxTasksPerNode=128
#MpiParams=ports=#-#
#PluginDir=
#PlugStackConfig=
#PrivateData=jobs
#Prolog=
#PrologSlurmctld=
#PropagatePrioProcess=0
#PropagateResourceLimits=
#PropagateResourceLimitsExcept=
#SallocDefaultCommand=
#SrunEpilog=
#SrunProlog=
#TaskEpilog=
#TaskProlog=
#TopologyPlugin=topology/tree
#TmpFs=/tmp
#TrackWCKey=no
#TreeWidth=
#UnkillableStepProgram=
#UsePAM=0
#UsePAM=0
#
#
# TIMERS
#BatchStartTimeout=10
#CompleteWait=0
#EpilogMsgTime=2000
#GetEnvTimeout=2
#HealthCheckInterval=0
#HealthCheckProgram=
MessageTimeout=100
#ResvOverRun=0
#OverTimeLimit=0
#UnkillableStepTimeout=60
#VSizeFactor=0
SchedulerParameters=enable_user_top,default_queue_depth=100
#
#
# SCHEDULING
#DefMemPerCPU=0
#MaxMemPerCPU=0
#SchedulerRootFilter=1
#SchedulerTimeSlice=30
#
#
# JOB PRIORITY
PriorityType=priority/multifactor
#PriortyFlags=Ticket_Based
#PriorityDecayHalfLife=1-0
PriorityDecayHalfLife=2
#PriorityCalcPeriod=
#PriorityFavorSmall=YES
#PriorityMaxAge=7-0
PriorityUsageResetPeriod=DAILY