On 2016-06-03 21:25, Jason Bacon wrote:
> It might be worth mentioning that the calcpi-parallel jobs are run with
> --array (no srun).
>
> Disabling the task/affinity plugin and using "mpirun --bind-to core"
> works around the issue. The MPI processes bind to specific cores and
> the embarrassingly parallel jobs kindly move over and stay out of the way.

Are the "mpirun --bind-to core" child processes the same as a Slurm task?
I have no experience at all with MPI jobs -- just trying to understand
task/affinity and its parameters.

As far as I understand, when you let mpirun do the binding it handles the
binding differently:
https://www.open-mpi.org/doc/v1.8/man1/mpirun.1.php
That is, if I grok the
  % mpirun ... --map-by core --bind-to core
example in the "Mapping, Ranking, and Binding: Oh My!" section right.

> On 06/03/16 10:18, Jason Bacon wrote:
>>
>> We're having an issue with CPU binding when two jobs land on the same
>> node.
>>
>> Some cores are shared by the 2 jobs while others are left idle. Below
[...]
>> TaskPluginParam=cores,verbose

Don't you bind each _job_ to a single core, because you override the
automatic binding and thus prevent each child process from being bound to
a different core?

Regards,
Benjamin
-- 
FSU Jena | JULIELab.de/Staff/Benjamin+Redling.html
vox: +49 3641 9 44323 | fax: +49 3641 9 44321
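
P.S. One way to check the binding question above empirically would be to
have every process report the cores it is allowed to run on. The following
is only a minimal sketch, not something from the original report: it assumes
Linux (os.sched_getaffinity is not available everywhere), and the helper
names allowed_cpus and format_report are made up for illustration. The
environment variables are the ones srun, Slurm job arrays and Open MPI's
mpirun normally export, but which of them is set depends on how the process
was launched.

```python
import os


def allowed_cpus(pid=0):
    """Return the sorted CPU ids the process may run on (0 = this process)."""
    # Linux-only call; raises AttributeError on platforms without it.
    return sorted(os.sched_getaffinity(pid))


def format_report(cpus, label):
    """Build a one-line report such as 'task 3: cpus 0,1'."""
    return "task {}: cpus {}".format(label, ",".join(str(c) for c in cpus))


if __name__ == "__main__":
    # Pick whichever launcher-provided id is present (assumption: at most
    # one of these is relevant for a given launch method).
    label = (
        os.environ.get("SLURM_PROCID")
        or os.environ.get("OMPI_COMM_WORLD_RANK")
        or os.environ.get("SLURM_ARRAY_TASK_ID")
        or "?"
    )
    print(format_report(allowed_cpus(), label))
```

Running this in place of the real program in both kinds of jobs should show
whether two processes report the same single core (each job pinned to one
core) or distinct cores.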
