This should help:
http://slurm.schedmd.com/big_sys.html
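In short, the "Too many open files" errors mean the slurmd process on node0 is hitting its open file descriptor limit: each task launch needs pipes, so pushing a 584-node launch through a single slurmd needs far more descriptors than the usual default of 1024. A rough sketch of how to check and raise the limit, assuming a single slurmd started from an init script on node0 (the value 65536 is only illustrative):

$ grep 'open files' /proc/$(pidof slurmd)/limits   # limit of the running slurmd
$ ulimit -n                                        # limit of the current shell
# in the slurmd init/start script, before the daemon is launched:
ulimit -n 65536
# then restart slurmd so the new limit takes effect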
Quoting Sergio Iserte Agut <[email protected]>:
Thank you! It was very helpful!
Now I've continued testing and I've found another limitation:
$ sudo srun -N584 printenv | egrep 'SLURM_NODELIST'
slurmd[node0]: exec_wait_info_create: pipe: Too many open files
slurmd[node0]: child fork: Too many open files
srun: error: task 0 launch failed: Slurmd could not execve job
Maybe it's a machine limitation, but I'm not sure. Any idea?
Regards!
2013/10/22 Moe Jette <[email protected]>
See MaxTasksPerNode in slurm.conf man page
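For context, MaxTasksPerNode defaults to 128 in this SLURM release, which matches the cutoff you are seeing: all 1200 "dummy" nodes map to the single host node0, so a larger -N count asks that one slurmd for more tasks than the default allows. A minimal sketch of the change, assuming the 1200-node test configuration quoted below (the value is only illustrative):

# slurm.conf on node0
MaxTasksPerNode=1200
# restart slurmctld and slurmd afterwards; not every parameter is
# picked up by "scontrol reconfigure"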
Quoting Sergio Iserte Agut <[email protected]>:
Hello everybody,
I've been trying the front-end mode in order to simulate more resources
than I really have.
I've configured my SLURM 2.6.2 with these lines:
NodeName=dummy[1-1200] NodeHostName=node0 NodeAddr=10.0.0.1
PartitionName=debug Nodes=dummy[1-1200] Default=YES MaxTime=INFINITE State=UP
Notice that node0 is the node where slurmctld and slurmd are running.
When I try to execute:
sudo srun -Nx hostname
where x is greater than 128, I get:
srun: error: Unable to create job step: Task count specification invalid
srun: Force Terminated job 688
When x is less than or equal to 128, the execution is OK.
Why can't I use more nodes?
Regards.
--
*Sergio Iserte Agut, research assistant,*
*High Performance Computing & Architecture*
*University Jaume I (Castellón, Spain)*