I attempted it again and it succeeded.
Thanks for your help.
On Thu, Apr 16, 2020 at 9:45 PM Ellestad, Erik wrote:
That all seems fine to me.
I would check your Slurm logs to try to determine why Slurm put your
nodes into drain state.
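For example, something along these lines should show what was recorded
(node01 standing in for whichever node drained):

    sinfo -R                                        # list drained/down nodes with the recorded Reason
    scontrol show node node01 | grep -i reason      # full node record, including Reason
    scontrol update NodeName=node01 State=RESUME    # clear the drain once the cause is fixed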
Erik
---
Erik Ellestad
Wynton Cluster SysAdmin
UCSF
From: slurm-users on behalf of navin srivastava
Sent: Wednesday, April 15, 2020
Thanks Erik.
Last night I made the changes.
I defined the following in slurm.conf on all the nodes as well as on the
slurm server:
TmpFS=/lscratch
NodeName=node[01-10] CPUs=44 RealMemory=257380 Sockets=2 CoresPerSocket=22 ThreadsPerCore=1 TmpDisk=160 State=UNKNOWN Feature=P4000 Gres=gpu:2
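A quick way to confirm the nodes picked up the new values, as a sketch
(scontrol reconfigure re-reads slurm.conf, though some node-definition
changes may need a daemon restart):

    scontrol reconfigure                                  # re-read slurm.conf on controller and nodes
    scontrol show node node01 | grep -E 'TmpDisk|Gres'    # should now report TmpDisk=160 and gpu:2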
The default value for TmpDisk is 0, so if you want local scratch available on a
node, the amount of TmpDisk space must be defined in the node configuration in
slurm.conf.
Example:
NodeName=TestNode01 CPUs=8 Boards=1 SocketsPerBoard=2 CoresPerSocket=4 ThreadsPerCore=1 RealMemory=24099
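The example appears to have been cut off before its TmpDisk entry; a
minimal sketch with an illustrative value (TmpDisk is expressed in MB):

    NodeName=TestNode01 CPUs=8 Boards=1 SocketsPerBoard=2 \
        CoresPerSocket=4 ThreadsPerCore=1 RealMemory=24099 \
        TmpDisk=150000    # ~150 GB of local scratch; illustrative value, not from the original mail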
Thank you Erik.
Is it not mandatory to define the local scratch on all the compute nodes?
Is defining it on the slurm server alone enough?
Also, should TmpDisk be defined in MB, or can it be defined in GB as well?
While requesting --tmp, can we use the value in GB?
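For reference, the sbatch man page says --tmp takes an optional K|M|G|T
suffix, with a bare number read as MB; a small sketch of a job script
using it:

    #!/bin/bash
    #SBATCH --tmp=500G     # 500 GB of local scratch; a plain "500" would mean 500 MB
    #SBATCH --ntasks=1
    df -h /scratch         # sanity-check the scratch file system on the allocated node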
Regards
Navin.
On Tue, Apr 14, 2020, Ellestad, Erik wrote:
Have you defined the TmpDisk value for each node?
As far as I know, local disk space is not a valid type for GRES.
https://slurm.schedmd.com/gres.html
"Generic resource (GRES) scheduling is supported through a flexible plugin
mechanism. Support is currently provided for Graphics Processing
Any suggestions on the above query? I need help understanding it.
If TmpFS=/scratch is set and the request is #SBATCH --tmp=500GB, will it
reserve the 500 GB from scratch?
Let me know if my assumption is correct.
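As far as I can tell, --tmp acts as a scheduling filter against each
node's configured TmpDisk rather than a hard reservation of disk blocks;
one way to see the filter in action (a sketch):

    sbatch --tmp=2000G --wrap="hostname"    # 2 TB, more than the 1.2 TB scratch, so it should stay pending
    squeue -u $USER -o "%.10i %.9T %R"      # the %R column shows the scheduler's reason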
Regards
Navin.
On Mon, Apr 13, 2020 at 11:10 AM navin srivastava wrote:
Hi Team,
I wanted to define a mechanism to request local disk space while
submitting a job.
Each compute node has a dedicated /scratch file system of 1.2 TB for job
execution, separate from / and the other file systems.
I have defined TmpFS=/scratch in slurm.conf and