Hi SLURM users,

Software linked against MKL, such as R, or Python with NumPy/SciPy built 
against MKL (and probably many other examples), presents a problem: the user 
makes resource choices via the scheduler, and the software then doesn't 
respect them. Our most recent example: someone is running 24 R tasks whose 
operations ultimately use OpenMP via MKL, which causes each of those 24 tasks 
to think it should run 24 threads.

I see that Berkeley recommends that users put "export 
MKL_NUM_THREADS=$SLURM_CPUS_PER_TASK" in their job scripts to prevent this 
unexpected behavior; it's a pretty good idea. Do any other sites have clever 
solutions to this that they can share (maybe even setting some of these 
variables on every job to match the $SLURM_CPUS_PER_TASK variable via a plugin 
or something)?
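On the "set it on every job" angle, one mechanism (a sketch of one possible approach, not something from this thread) is slurmd's TaskProlog: the script named by TaskProlog in slurm.conf runs before each task, and any stdout line of the form "export NAME=value" is injected into that task's environment. The choice of variables here is ours:

```shell
#!/bin/bash
# TaskProlog sketch: slurmd runs this before each task; stdout lines
# of the form "export NAME=value" are added to the task environment.
# Fall back to 1 thread when --cpus-per-task was not requested.
cpus="${SLURM_CPUS_PER_TASK:-1}"
echo "export MKL_NUM_THREADS=${cpus}"
echo "export OMP_NUM_THREADS=${cpus}"
```

Users can still override these in their own job scripts, since the task environment is applied before the user's commands run.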

I'm also not quite sure of the best way to handle jobs that are serial in some 
phases and parallel in others without wasting CPUs, but maybe some of you have 
figured that one out too.
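For the mixed serial/parallel case, one pattern (a sketch only; the script names and #SBATCH values are hypothetical placeholders) is to request the full allocation and vary the thread caps per phase in the job script, so the serial phase idles the extra cores rather than oversubscribing:

```shell
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8   # hypothetical allocation sized for the parallel phase

# Serial phase: pin MKL/OpenMP to a single thread.
OMP_NUM_THREADS=1 MKL_NUM_THREADS=1 Rscript prep.R    # hypothetical script

# Parallel phase: widen to the full per-task allocation.
OMP_NUM_THREADS="${SLURM_CPUS_PER_TASK:-1}" \
MKL_NUM_THREADS="${SLURM_CPUS_PER_TASK:-1}" Rscript solve.R
```

This trades idle cores during the serial phase for correctness during the parallel one; it doesn't solve the waste problem, which is exactly the open question above.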

Thanks in advance for any hints.

|| \\UTGERS,     |---------------------------*O*---------------------------
||_// the State  |         Ryan Novosielski - novos...@rutgers.edu
|| \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
||  \\    of NJ  | Office of Advanced Research Computing - MSB C630, Newark
