Hi all,

I'm having issues with slurm 2.2.7 when specifying the nodes' CPU information.
If I set the number of sockets, cores per socket and threads per core like this:

    NodeName=node[2-4] RealMemory=23000 Sockets=2 CoresPerSocket=4 ThreadsPerCore=2 State=UNKNOWN

and submit a job, slurmctld crashes. The last section of the slurmctld log is:

    [2011-08-02T17:58:50] debug2: initial priority for job 49852 is 98
    [2011-08-02T17:58:50] debug2: found 3 usable nodes from config containing node[2-4]
    [2011-08-02T17:58:50] debug3: _pick_best_nodes: job 49852 idle_nodes 65 share_nodes 76
    [2011-08-02T17:58:50] debug2: sched: JobId=49852 allocated resources: NodeList=(null)
    [2011-08-02T17:58:50] _slurm_rpc_submit_batch_job JobId=49852 usec=1540
    [2011-08-02T17:58:50] debug: sched: Running job scheduler
    [2011-08-02T17:58:50] debug2: found 3 usable nodes from config containing node[2-4]
    [2011-08-02T17:58:50] debug3: _pick_best_nodes: job 49852 idle_nodes 65 share_nodes 76
    [2011-08-02T17:58:50] fatal: cons_res: sync loop not progressing

I've also seen the error "cons_res: cpus computation error".

There might be something wrong with my configuration, but slurm should tell me so, not crash when a job is submitted...

I'm playing with these options because a user reported that just using Procs=16 would not spread his MPI processes across the allocated nodes. I've fixed that by using --nodes=*-* and --ntasks-per-node=*, but the crash is still relevant I guess... Could it be a bug?

Thanks
Nicolas
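
For reference, the node definition above can be sanity-checked with a minimal sketch that multiplies out the topology values. This is not Slurm code, and the function name `logical_cpus` is hypothetical. It assumes that Procs counts logical processors (hardware threads), in which case Sockets=2, CoresPerSocket=4 and ThreadsPerCore=2 describe the same 16 CPUs per node as Procs=16.

```python
def logical_cpus(sockets: int, cores_per_socket: int, threads_per_core: int) -> int:
    """Return the logical CPU count implied by a socket/core/thread topology.

    Hypothetical helper for illustration only; it is not part of Slurm.
    """
    for name, value in (
        ("sockets", sockets),
        ("cores_per_socket", cores_per_socket),
        ("threads_per_core", threads_per_core),
    ):
        if value < 1:
            raise ValueError(f"{name} must be at least 1, got {value}")
    return sockets * cores_per_socket * threads_per_core


if __name__ == "__main__":
    # Values taken from the NodeName=node[2-4] line in the post.
    cpus = logical_cpus(sockets=2, cores_per_socket=4, threads_per_core=2)
    # Assumption: Procs is meant to equal this product.
    print(f"logical CPUs per node: {cpus}")
```

If the product did not match the Procs value used elsewhere in the configuration, that mismatch would be one thing to rule out before treating the crash as a scheduler bug.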
