On Tue, 16 Jun 2009, Matthew Knepley wrote:

> On Tue, Jun 16, 2009 at 12:38 PM, xiaoyin ji <sapphire.jxy at gmail.com> wrote:
>
> > Hi there,
> >
> > I'm using PETSc MATMPIAIJ and ksp solver. It seems that PETSc will run
> > obviously faster if I set the number of CPUs close to the number of
> > computer nodes in the job file. By default MPIAIJ matrix is stored in
> > different processors and ksp solver will communicate for each step,
> > however since on each node several CPUs share the same memory while
> > ksp may still try to communicate through network card, this may mess
> > up a bit. Is there any way to detect which CPUs are sharing the same
> > memory? Thanks a lot.
>
> The interface for this is mpirun or the job submission mechanism.
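On the detection question itself: a program can usually work out which ranks share a node by comparing the host name each rank reports (in MPI, MPI_Get_processor_name typically returns such a name). Below is a minimal sketch of just the grouping step, in Python, assuming the names have already been gathered into a list indexed by rank. The function name and the host names are made up for illustration; this is not something PETSc provides.

```python
from collections import defaultdict


def ranks_by_node(hostnames):
    """Group rank numbers by the host name each rank reported.

    hostnames[i] is assumed to be the name reported by rank i.
    """
    groups = defaultdict(list)
    for rank, host in enumerate(hostnames):
        groups[host].append(rank)
    return dict(groups)


# Hypothetical layout: 4 ranks spread over 2 nodes.
example = ["node01", "node01", "node02", "node02"]
for host, ranks in ranks_by_node(example).items():
    print(host, ranks)
```

Ranks that land in the same group are running on the same node, so they share its memory.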
One additional note: if you are scheduling multiple MPI jobs on the same machine [because it has multiple cores], the reduced performance you notice could be due to 2 issues:

* MPI not communicating optimally between the cores within the same node. For ex: the mpich2-1 default install - i.e. device=nemesis - tries to be efficient for communication between multiple cores within the node, as well as between nodes. [There could be similar configs for other MPI impls]

* Within multi-core machines, the FPUs scale up with the number of cores, but the memory bandwidth does not scale up in the same linear way. Since achieved performance is a function of both, one should not expect linear speedup on multi-core machines. [What matters is the peak performance all the cores can collectively deliver]

Satish
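
To put rough numbers on the memory bandwidth point, here is a crude back-of-the-envelope model in Python. It is only a sketch: achieved performance is taken as the smaller of a compute limit (which grows with the core count) and a memory limit (which stops growing once the node's bandwidth is saturated). All parameter names and values are hypothetical and not measurements of any real machine.

```python
def achievable_gflops(cores, gflops_per_core, core_bw_gbs, node_bw_gbs,
                      flops_per_byte):
    """Crude bound on the rate a node can sustain for a given kernel."""
    # Compute limit: FPUs scale linearly with the number of cores.
    compute_limit = cores * gflops_per_core
    # Memory limit: bandwidth grows per core only until the node saturates.
    bandwidth = min(cores * core_bw_gbs, node_bw_gbs)
    memory_limit = bandwidth * flops_per_byte
    return min(compute_limit, memory_limit)


def speedup(cores, **machine):
    """Speedup of `cores` cores over one core under the model above."""
    return achievable_gflops(cores, **machine) / achievable_gflops(1, **machine)


# Hypothetical node: 4 GFlop/s per core, 4 GB/s per core, 10 GB/s per node,
# and a kernel doing 0.25 flops per byte moved (assumed, low intensity).
machine = dict(gflops_per_core=4.0, core_bw_gbs=4.0, node_bw_gbs=10.0,
               flops_per_byte=0.25)

for n in (1, 2, 4, 8):
    print(n, "cores ->", speedup(n, **machine), "x")
```

With these made-up numbers the speedup is 2x on 2 cores but flattens at 2.5x from 4 cores onward, because the memory limit, not the FPU count, sets the ceiling.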
