I could also send you my MPI/numactl command lines for the GPU and CPU runs when I am back in the office.
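Roughly, the idea for the CPU runs is along the lines of the sketch below. This is a from-memory sketch rather than the exact command lines: the wrapper script, the two-NUMA-node assumption, the rank count, and the executable name are all placeholders until I can check the real invocations in the office.

  #!/bin/sh
  # numa_bind.sh: bind this rank's cores and memory to one NUMA node,
  # chosen from Open MPI's local rank (assumes 2 NUMA nodes per host;
  # adjust the modulus for the actual node layout)
  node=$((OMPI_COMM_WORLD_LOCAL_RANK % 2))
  exec numactl --cpunodebind=$node --membind=$node "$@"

and then something like

  mpirun -np 16 ./numa_bind.sh ./my_app -log_summary

For the GPU runs the idea is the same, except the rank driving the GPU gets pinned to the NUMA node closest to that GPU.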
Dave Nystrom writes:
> Jed Brown writes:
> > On Thu, Feb 23, 2012 at 18:53, Nystrom, William D <wdn at lanl.gov> wrote:
> >
> > > Rerunning the CPU case with numactl results in a 25x speedup and
> > > log_summary results that look reasonable to me now.
> >
> > What command are you using for this? We usually use the affinity options to
> > mpiexec instead of using numactl/taskset manually.
>
> I was using openmpi-1.5.4 as installed by the system admins on our testbed
> cluster. I talked to a couple of our openmpi developers and they indicated
> that the affinity stuff was broken in that version but should be fixed when
> 1.5.5 and 1.6 come out - which should be within the next month.
>
> I also tried mvapich2-1.7 built with slurm and tried using the affinity stuff
> with srun. That also did not seem to work. But I should probably revisit
> that and try to make sure that I really understand how to use srun.
>
> I was pretty surprised that getting the numa stuff right made such a huge
> difference. I'm also wondering if getting the affinity right will make much
> of a difference for the gpu case.
>
> > Did you also set a specific memory policy?
>
> I'm not sure what you mean by the above question but I'm kind of new to all
> this numa stuff.
>
> > Which Linux kernel is this?
>
> The OS was the latest beta of TOSS2. If I remember, I can check next time I
> am in my office. It is probably RHEL6.
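P.S. When I revisit the srun route mentioned above, my understanding is that the binding would be requested at launch time with something like the line below. The option spellings are from the slurm documentation as I remember them, and the rank count and executable name are placeholders, so treat it as a guess until I actually test it with our mvapich2-1.7 build:

  # ask slurm to bind each task's cpus to cores and its memory to the local NUMA node
  srun -n 16 --cpu_bind=cores --mem_bind=local ./my_app -log_summary

If that works, it should accomplish the same thing as the manual numactl wrapper.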
