Hi all,

My name is Arnau Bria and I work as a sysadmin at PIC (a data center in
Barcelona). We have a cluster of ~300 nodes and 3300 job slots under
torque/maui. Our current load, more than 6k jobs, causes serious
problems for torque/maui, so we're studying alternatives, and it seems
that slurm has the same (or more) features as torque/maui and scales
much better.

So, I'm starting to read some slurm docs and I've been able to install
a server, configure some partitions and a couple of nodes, and submit
some jobs. I've learned some basic commands to manage
partitions/nodes/queues/jobs.
 

Now I'd like to dig deeper, and I'm trying to "import" torque's
configuration into slurm and see what still makes sense and what
doesn't:

1.-) from: https://computing.llnl.gov/linux/slurm/faq.html#fast_schedule
How can I configure SLURM to use the resources actually found
on a node rather than what is defined in slurm.conf? 

All my nodes (which have 4 cpus) show only 1 cpu. I can't get slurm to
detect node resources automatically. This is my conf:
[...]
SelectType=select/cons_res
SelectTypeParameters=CR_CPU
FastSchedule=0
[...]
NodeName=DEFAULT State=UNKNOWN
NodeName=tditaller002.pic.es,tditaller005.pic.es

node log:
[...]
Nov  3 14:40:20 tditaller002 slurmd[8245]: slurmd version 2.3.1 started
Nov  3 14:40:20 tditaller002 slurmd[8245]: slurmd started on Thu 03 Nov 2011 14:40:20 +0100
Nov  3 14:40:20 tditaller002 slurmd[8245]: Procs=1 Sockets=1 Cores=1 Threads=1 Memory=7985 TmpDisk=1990 Uptime=98838


2.-) CPU_factor
In torque we define cpu_factor, a way to normalize cpu_time between two
different hosts (host A is fast, host B is slow, so 1 second on host A
equals 2 seconds on host B).
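
To make the idea concrete, here is a minimal Python sketch of the
normalization I mean. The host names, the factor values and the
function name are made up for illustration; this is not torque's or
slurm's actual code.

```python
# Hypothetical per-host factors, relative to the slowest host (hostB = 1.0).
CPU_FACTOR = {"hostA": 2.0, "hostB": 1.0}


def normalized_cpu_time(host, raw_seconds):
    """Scale raw CPU seconds by the host's factor so hosts are comparable."""
    return raw_seconds * CPU_FACTOR[host]


# 1 second on the fast host counts the same as 2 seconds on the slow one.
print(normalized_cpu_time("hostA", 1.0))
print(normalized_cpu_time("hostB", 2.0))
```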

Is this configurable in slurm? What name do you use for it?

3.-) Max node load
Can I configure a maximum load for a node? For example, a node with 4
cpus will run 4 jobs, but if it reaches a certain load while running 3,
I'd like slurm NOT to send more jobs to that node.
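
The rule I have in mind could be sketched in Python like this. The
function name, its parameters and the threshold values are only
assumptions for illustration, not an existing slurm option.

```python
def accepts_more_jobs(cpus, running_jobs, load_avg, max_load):
    """Return True only if a slot is free and the load is under the limit."""
    has_free_slot = running_jobs < cpus
    load_ok = load_avg < max_load
    return has_free_slot and load_ok


# 4-cpu node running 3 jobs: a slot is free, but the load is too high.
print(accepts_more_jobs(cpus=4, running_jobs=3, load_avg=5.2, max_load=4.0))
# Same node with a moderate load: one more job may be sent.
print(accepts_more_jobs(cpus=4, running_jobs=3, load_avg=2.1, max_load=4.0))
```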


4.-) How is the file copy between client and server done (input/output)?
ssh? NFS? Is it configurable?


Well, I think I've asked enough questions for my first mail :-)
Could anyone answer some (or all) of these questions? Could anyone send
me a link to presentations/wiki/extended docs?

Many thanks in advance,
Cheers,
Arnau
