Christopher Benjamin Coffey <chris.cof...@nau.edu> writes:
> Hi Loris,
>> But that's only the case if the program is started with srun or some
>> form of mpirun. Otherwise the program just gets started once on one
>> core and the other cores just idle.
> Yes, maybe that’s true about what you say when not using srun. I'm not
> sure, as we tell everyone to use srun to launch every type of task.
OK, I'm confused now. Our main culprit for producing processes with
incorrect affinity is ORCA. It uses OpenMPI but also likes to start
processes asynchronously via SSH within the node set. Our users run
their jobs via batch files containing, say

However, if I run an ORCA job with 'srun', i.e.

    srun $ORCA_PATH/orca ...

this results in the program being run 8 times with all of them writing
to the same log and output files.
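
My understanding of why this happens is that srun starts one copy of the given command for every task in the allocation. Below is a minimal sketch of that behaviour, assuming the job was submitted with 8 tasks (a guess that merely matches the 8 copies seen above). The function name copies_started_by_srun is made up for illustration, and hostname stands in for the real program.

```shell
#!/bin/bash
#SBATCH --ntasks=8   # assumed value, chosen only to match the 8 copies above

# Hypothetical helper: report how many copies a plain 'srun <cmd>' would
# start. Slurm sets SLURM_NTASKS inside a job when --ntasks is given;
# fall back to 1 when the variable is unset (e.g. outside a job).
copies_started_by_srun() {
    echo "${SLURM_NTASKS:-1}"
}

echo "plain srun would start $(copies_started_by_srun) copies"

# Only call srun when actually running inside a Slurm job.
if command -v srun >/dev/null 2>&1 && [ -n "${SLURM_JOB_ID:-}" ]; then
    # hostname prints one line per task, so wc -l counts the copies started.
    srun hostname | wc -l
    # Restricting the step to a single task starts a single copy.
    srun --ntasks=1 hostname | wc -l
fi
```

Whether launching a single task in this way is appropriate for a program that starts its own MPI processes is a separate question, and one I have not tested.
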
Is ORCA just a pathological exception to the idea that it's always good
to use 'srun'? (As it causes well over 95% of our affinity problems, it
is already pathological in that sense.)
Dr. Loris Bennett (Mr.)
ZEDAT, Freie Universität Berlin Email loris.benn...@fu-berlin.de