I'm hoping this is just user error...

I'm running a single-node job on a node that has two dual-core Opterons (Open MPI 1.0.2).
compiler=gcc 4.1.0
arch=x86_64 (64-bit)
OS=Linux 2.6.16

****
My machine file looked like this:

node1 slots=4

I have an HPL configuration for 4 processes (PxQ=2x2).

I started it with 'mpirun -np 4 -machinefile foo ./xhpl'.

With that machinefile, the problem takes 15 seconds to complete.

I then changed the machinefile to read:

node1 slots=2
-or, simply-
node1

It doesn't matter which of these two machinefiles I use; I still run it with:
'mpirun -np 4 -machinefile foo ./xhpl'

Except now the problem takes 0.1 seconds to complete.

It's perfectly repeatable...
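
For anyone who wants to reproduce the comparison, here is a minimal sketch of how the three runs could be scripted, assuming Python 3 is available on the node. Only the machinefile contents and the mpirun command line come from the runs described above; the helper names (MACHINEFILES, mpirun_command, time_run) are hypothetical. It measures wall-clock time around mpirun, which includes launch overhead, so the numbers will not match HPL's own reported time exactly.

```python
import subprocess
import time
from pathlib import Path

# The three machinefile variants described above (host name "node1" as in the post).
MACHINEFILES = {
    "slots=4": "node1 slots=4\n",
    "slots=2": "node1 slots=2\n",
    "no slots": "node1\n",
}


def mpirun_command(machinefile, np=4, binary="./xhpl"):
    """Build the same mpirun command line used in the post."""
    return ["mpirun", "-np", str(np), "-machinefile", str(machinefile), binary]


def time_run(label, path=Path("foo")):
    """Write one machinefile variant, run xhpl once, return wall-clock seconds.

    Assumes mpirun is on PATH and ./xhpl (with its HPL.dat) is in the
    current directory; the measurement includes mpirun startup time.
    """
    path.write_text(MACHINEFILES[label])
    start = time.perf_counter()
    subprocess.run(mpirun_command(path), check=True)
    return time.perf_counter() - start
```

Calling time_run(label) for each key of MACHINEFILES in turn, from the directory that holds xhpl, gives one timing per machinefile variant to compare.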

Is there something about the machine file format I'm not aware of (with respect to dual-core CPUs)? IIRC, slots is the number of processes to run per node, so two dual-core CPUs should be slots=4. Except 'slots=4' makes it run about 150 times slower (15 seconds versus 0.1 seconds).

Thoughts?
--
Troy Telford
