Our nodes have:
Sockets=2 CoresPerSocket=10 ThreadsPerCore=2
with CPUs=40 and SelectTypeParameters=CR_CPU.
According to this FAQ, this is "not a typical configuration".
That is fine; I am aware that this is the setup, since I did the configuration.
One of our users is complaining that we don't have the regular setup with
"cpu=core" rather than "cpu=thread".
In reality, a large majority of our users run non-thread-aware
bioinformatics software, so I am OK with our current setup; most of our
jobs are largely parallel.
Changing to a cpu=core setup would halve our resources, unless I could
convince or remind people to use srun --shared (according to
Alternatively, OverSubscribe=FORCE would solve the problem of lazy/greedy
users, but doesn't that then override any advantage that hyperthreaded
software might have?
What are the nuances between the setup we have and using OverSubscribe=X
with cpu=core (given that one core = 2 threads)?
Would there be a performance benefit from making cpu=core and halving the
available processes - would the halved number of processes run/finish
faster making up for the loss of flexibility?
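
To put numbers on the question, here is a rough sketch of the counting behind it, using only the node definition given at the top of this message. It assumes that in a cpu=core setup each schedulable CPU is one physical core (two hardware threads), and it looks only at slot counts; memory, I/O and real hyperthreading behaviour are not modelled, so the break-even figure is arithmetic, not a measurement.

```python
# Node topology as given in this message.
sockets = 2
cores_per_socket = 10
threads_per_core = 2

cores = sockets * cores_per_socket      # physical cores per node
threads = cores * threads_per_core      # hardware threads per node

# Current setup (cpu=thread): one schedulable CPU per hardware thread.
cpus_thread_mode = threads
# Assumed alternative (cpu=core): one schedulable CPU per physical core.
cpus_core_mode = cores

# How much faster each single-CPU job would need to run in cpu=core mode
# for the node to complete the same number of jobs per unit time.
break_even_speedup = cpus_thread_mode / cpus_core_mode

print(f"cpu=thread: {cpus_thread_mode} schedulable CPUs per node")
print(f"cpu=core:   {cpus_core_mode} schedulable CPUs per node")
print(f"break-even per-job speedup: {break_even_speedup:.1f}x")
```

In other words, under these assumptions a single-threaded job with a whole core to itself would have to finish about twice as fast as it does when sharing the core for the two setups to break even on throughput.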
The most dangerous phrase in the language is, "We've always done it this
way."
- Grace Hopper