Re: [slurm-users] ignore gpu resources to scheduled the cpu based jobs

2020-06-15 Thread navin srivastava
Thanks Renfro.

I will set up a similar configuration and see how it goes.

Regards

On Mon, Jun 15, 2020, 23:02 Renfro, Michael  wrote:

> So if a GPU job is submitted to a partition containing only GPU nodes, and
> a non-GPU job is submitted to a partition containing at least some nodes
> without GPUs, both jobs should be able to run. Priorities should be
> evaluated on a per-partition basis. I can 100% guarantee that in our HPC,
> pending GPU jobs don't block non-GPU jobs, and vice versa.
>
> I could see a problem if the GPU job was submitted to a partition
> containing both types of nodes: if that job was assigned the highest
> priority for whatever reason (fair share, age, etc.), other jobs in the
> same partition would have to wait until that job started.
>
> A simple solution would be to make a GPU partition containing only GPU
> nodes, and a non-GPU partition containing only non-GPU nodes. Submit GPU
> jobs to the GPU partition, and non-GPU jobs to the non-GPU partition.
>
> Once that works, you could make a partition that includes both types of
> nodes to reduce idle resources, but jobs submitted to that partition would
> have to (a) not require a GPU, (b) require a limited number of CPUs per
> node, so that you'd have some CPUs available for GPU jobs on the nodes
> containing GPUs.
>
> --
> *From:* slurm-users  on behalf of
> navin srivastava 
> *Sent:* Saturday, June 13, 2020 10:47 AM
> *To:* Slurm User Community List 
> *Subject:* Re: [slurm-users] ignore gpu resources to scheduled the cpu
> based jobs
>
>
> Yes, we have separate partitions. Some are GPU-specific, with 2 nodes of 8
> GPUs each; the other partitions are a mix, with nodes that have 2 GPUs and
> very few nodes without any GPU.
>
> Regards
> Navin
>
>
> On Sat, Jun 13, 2020, 21:11 navin srivastava 
> wrote:
>
> Thanks Renfro.
>
> Yes, we have both types of nodes, GPU and non-GPU.
> Some users' jobs require a GPU, and some applications use only CPUs.
>
> The issue happens when a high-priority job is waiting for GPU resources
> that are not available: a lower-priority job that needs only CPU resources
> keeps waiting even though enough CPUs are available.
>
> When I hold the GPU jobs, the CPU jobs go through.
>
> Regards
> Navin
>
> On Sat, Jun 13, 2020, 20:37 Renfro, Michael  wrote:
>
> Will probably need more information to find a solution.
>
> To start, do you have separate partitions for GPU and non-GPU jobs? Do you
> have nodes without GPUs?
>
> On Jun 13, 2020, at 12:28 AM, navin srivastava 
> wrote:
>
> Hi All,
>
> In our environment we have GPU nodes. What I found is that when a
> high-priority user's job is queued waiting for GPU resources, which are
> almost fully used and not available, jobs submitted by other users that do
> not require GPU resources also stay queued, even though lots of CPU
> resources are available.
>
> Our scheduling mechanism is FIFO with Fair Tree enabled. Is there any way
> we can make some changes so that CPU-based jobs go through while GPU-based
> jobs wait until GPU resources are free?
>
> Regards
> Navin.
>
>
>
>
>


[slurm-users] The Slurm User Group Meeting (SLUG'20) goes virtual this fall

2020-06-15 Thread Tim Wickberg
The Slurm User Group Meeting (SLUG'20) this fall will be moving online, 
and will not be hosted in-person at Harvard as scheduled, due to 
uncertainty around scheduling events caused by COVID-19.[1]


In lieu of an in-person meeting, SchedMD will broadcast a select set of 
presentations on Tuesday, September 15th, 2020. The agenda will be 
announced later this summer, but we ask you to hold the date for this 
virtual SLUG meeting.


Due to this change in format and compressed schedule, we will not be 
issuing a call for presentations this year. We expect to resume a normal 
program for SLUG'21.


- Tim

[1] For those who previously registered, all registrations were 
cancelled and refunded as of this afternoon. Please check your inbox for 
receipts from Eventbrite.


--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support



Re: [slurm-users] ignore gpu resources to scheduled the cpu based jobs

2020-06-15 Thread Renfro, Michael
So if a GPU job is submitted to a partition containing only GPU nodes, and a 
non-GPU job is submitted to a partition containing at least some nodes without 
GPUs, both jobs should be able to run. Priorities should be evaluated on a 
per-partition basis. I can 100% guarantee that in our HPC, pending GPU jobs 
don't block non-GPU jobs, and vice versa.

I could see a problem if the GPU job was submitted to a partition containing 
both types of nodes: if that job was assigned the highest priority for whatever 
reason (fair share, age, etc.), other jobs in the same partition would have to 
wait until that job started.

A simple solution would be to make a GPU partition containing only GPU nodes, 
and a non-GPU partition containing only non-GPU nodes. Submit GPU jobs to the 
GPU partition, and non-GPU jobs to the non-GPU partition.
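
As a rough sketch, a slurm.conf layout for that split could look like the
following (node names, CPU counts, and GPU counts here are placeholders, not
taken from the actual cluster):

    # two GPU nodes with 8 GPUs each, plus CPU-only nodes (hypothetical names)
    NodeName=gpu[01-02] CPUs=32 Gres=gpu:8 State=UNKNOWN
    NodeName=cpu[01-10] CPUs=32 State=UNKNOWN

    # GPU jobs go to "gpu", everything else defaults to "cpu"
    PartitionName=gpu Nodes=gpu[01-02] State=UP
    PartitionName=cpu Nodes=cpu[01-10] Default=YES State=UP

(GresTypes=gpu in slurm.conf and a matching gres.conf entry on the GPU nodes
would also be needed.)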

Once that works, you could make a partition that includes both types of nodes 
to reduce idle resources, but jobs submitted to that partition would have to 
(a) not require a GPU, (b) require a limited number of CPUs per node, so that 
you'd have some CPUs available for GPU jobs on the nodes containing GPUs.
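
One way to sketch that shared partition, continuing the hypothetical names
above: MaxCPUsPerNode is the partition option that caps how many cores jobs in
this partition may take on any one node, leaving the remaining cores for the
GPU partition.

    # both node types; jobs here may use at most 24 of the 32 cores per node
    PartitionName=any Nodes=gpu[01-02],cpu[01-10] MaxCPUsPerNode=24 State=UP

Routing is then just a matter of the partition chosen at submit time, for
example (the job scripts are placeholders):

    sbatch -p gpu --gres=gpu:1 gpu_job.sh    # needs a GPU
    sbatch -p cpu cpu_job.sh                 # CPU-only work
    sbatch -p any -n 8 cpu_job.sh            # CPU-only, may land on a GPU node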


From: slurm-users  on behalf of navin 
srivastava 
Sent: Saturday, June 13, 2020 10:47 AM
To: Slurm User Community List 
Subject: Re: [slurm-users] ignore gpu resources to scheduled the cpu based jobs


Yes, we have separate partitions. Some are GPU-specific, with 2 nodes of 8 GPUs 
each; the other partitions are a mix, with nodes that have 2 GPUs and very few 
nodes without any GPU.

Regards
Navin


On Sat, Jun 13, 2020, 21:11 navin srivastava <navin.alt...@gmail.com> wrote:
Thanks Renfro.

Yes, we have both types of nodes, GPU and non-GPU.
Some users' jobs require a GPU, and some applications use only CPUs.

The issue happens when a high-priority job is waiting for GPU resources that 
are not available: a lower-priority job that needs only CPU resources keeps 
waiting even though enough CPUs are available.

When I hold the GPU jobs, the CPU jobs go through.

Regards
Navin

On Sat, Jun 13, 2020, 20:37 Renfro, Michael <ren...@tntech.edu> wrote:
Will probably need more information to find a solution.

To start, do you have separate partitions for GPU and non-GPU jobs? Do you have 
nodes without GPUs?

On Jun 13, 2020, at 12:28 AM, navin srivastava <navin.alt...@gmail.com> wrote:

Hi All,

In our environment we have GPU nodes. What I found is that when a high-priority 
user's job is queued waiting for GPU resources, which are almost fully used and 
not available, jobs submitted by other users that do not require GPU resources 
also stay queued, even though lots of CPU resources are available.

Our scheduling mechanism is FIFO with Fair Tree enabled. Is there any way we 
can make some changes so that CPU-based jobs go through while GPU-based jobs 
wait until GPU resources are free?
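
One way to confirm that picture is to list pending jobs with their priority
and pending reason (the format fields below are standard squeue ones; the
column widths are arbitrary):

    squeue -t PD -o "%.10i %.9P %.8u %.10Q %.20R"

Typically the stuck GPU job shows Reason=Resources, while the CPU-only jobs
behind it show Reason=Priority.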

Regards
Navin.






Re: [slurm-users] How Nice value is decided by slurm.

2020-06-15 Thread Rodrigo Santibáñez
Hi,

As far as I understand, the nice value is set by the user; the scheduler
directly modifies the priority value. You could use the 'sprio' command to see
the contribution of the different factors to the final priority.
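
For example (the job ID below is a placeholder):

    # per-factor priority breakdown for one pending job
    sprio -l -j 123456

    # the job's nice value and resulting priority
    squeue -O jobid,username,nice,priority -j 123456

A nice value is normally only non-zero if someone set it, e.g. with
'sbatch --nice=2' at submission or 'scontrol update jobid=123456 nice=2'
afterwards.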

Best regards

On Mon, June 15, 2020, 12:50, navin srivastava 
wrote:

> Hi,
>
> One query about how the nice value is decided by the scheduler.
>
> Our scheduling policy is FIFO + Fair Tree.
>
> One user submitted 100 jobs on different dates. What I see is that the old
> jobs are still in the queue, but a few of the latest jobs went into
> execution. The nice value of the latest running job is 0, but the rest have
> a nice value of 2.
>
> The user is not passing any nice value at job submission, so how did Slurm
> decide to pick this job for execution?
>
> Regards
> Navin.
>
>