Hi SLURM users,

Software compiled against MKL (R, for example, or Python with NumPy/SciPy built against MKL, and probably many other cases) presents a problem: the user allocates resources via the scheduler, but the software does not respect that allocation. Our most recent example is a user running 24 R tasks whose operations ultimately use OpenMP via MKL, which causes each of those 24 tasks to think it should run 24 threads.
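For what it's worth, here is a minimal sketch of the kind of job script we're considering, capping the common threading knobs at the per-task allocation (the OMP_NUM_THREADS and OPENBLAS_NUM_THREADS lines and the phase commands are my own additions/hypotheticals, not something any site has confirmed):

```shell
#!/bin/bash
#SBATCH --ntasks=24
#SBATCH --cpus-per-task=1

# Cap the usual threading environment variables at the per-task CPU
# count so MKL/OpenMP/OpenBLAS code cannot oversubscribe the node.
# The :-1 fallback keeps things sane when run outside SLURM.
export MKL_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1}
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1}
export OPENBLAS_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1}

# For jobs with a serial phase and a parallel phase, the variable can
# also be overridden per command rather than per job (hypothetical
# commands shown, commented out):
# MKL_NUM_THREADS=1 ./serial_phase
# MKL_NUM_THREADS=$SLURM_CPUS_PER_TASK srun ./parallel_phase

echo "MKL threads per task: $MKL_NUM_THREADS"
```

This doesn't solve the serial-then-parallel waste problem by itself, but per-command overrides at least let one job script run each phase at an appropriate width.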
I see that Berkeley recommends that users put "export MKL_NUM_THREADS=$SLURM_CPUS_PER_TASK" in their job scripts to prevent this unexpected behavior, which seems like a pretty good idea. Do any other sites have clever solutions they can share (maybe even setting some of these variables on every job to match $SLURM_CPUS_PER_TASK via a plugin or something)? I'm also not quite sure of the best way to handle jobs that are serial in some phases and parallel in others without wasting CPUs, but maybe some of you have figured that one out too.

Thanks in advance for any hints.

--
 ____
|| \\UTGERS,     |---------------------------*O*---------------------------
||_// the State  |         Ryan Novosielski - novos...@rutgers.edu
|| \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
||  \\    of NJ  | Office of Advanced Research Computing - MSB C630, Newark
     `'