I assume your Intel processors are not the new Nehalem/i7 ones? The older 
quad-core ones are seriously memory-bandwidth limited when running a 
memory-intensive application. That might explain why using all 8 cores per 
node slows down your calculation.
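
One rough way to check this on a single node is to see whether the combined 
memory copy rate keeps growing as more processes run at once. Below is a 
minimal Python sketch, not a proper benchmark such as STREAM. The names 
`BUF_BYTES`, `copy_rate` and `aggregate_rate` and the buffer size are my own 
assumptions for illustration, and interpreter overhead makes the absolute 
numbers only indicative.

```python
import time
from multiprocessing import Pool

# Assumed buffer size: chosen to be larger than typical CPU caches so the
# copy has to go to main memory. Adjust for the machine being tested.
BUF_BYTES = 32 * 1024 * 1024


def copy_rate(repeats):
    """Copy a large buffer `repeats` times; return bytes copied per second."""
    src = bytearray(BUF_BYTES)
    start = time.perf_counter()
    for _ in range(repeats):
        dst = bytes(src)  # full read of src plus full write of a new buffer
    elapsed = time.perf_counter() - start
    return repeats * BUF_BYTES / elapsed


def aggregate_rate(nprocs, repeats=5):
    """Run copy_rate in nprocs worker processes and sum their rates."""
    with Pool(nprocs) as pool:
        return sum(pool.map(copy_rate, [repeats] * nprocs))


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(f"{n} processes: {aggregate_rate(n) / 1e9:.2f} GB/s combined")
```

If the combined rate flattens well before 8 processes, the cores are 
competing for memory bandwidth, which would fit the slowdown with 8 cores 
per node.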

Why do you get such a difference between CPU time and elapsed time? Is your 
code doing any file I/O, or maybe waiting for one of the processors? Do you 
use non-blocking communication wherever possible?
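
To put numbers on that gap, the quoted timings can be turned into the 
fraction of elapsed time spent on the CPU. This is a small Python sketch 
using only the figures from the messages below. It assumes the reported CPU 
TIME is per process and not summed over all processes, which the output does 
not state. The names `RUNS`, `to_seconds` and `cpu_fraction` are mine.

```python
def to_seconds(hours, minutes, seconds):
    """Convert an (hours, minutes, seconds) reading to seconds."""
    return hours * 3600 + minutes * 60 + seconds


# (label, CPU time, elapsed time), copied from the quoted messages.
RUNS = [
    ("Open MPI, 6 nodes x 4 cores", (1, 19, 14.39), (2, 41, 8.55)),
    ("Open MPI, 6 nodes x 8 cores", (4, 19, 19.29), (9, 15, 46.39)),
    ("Open MPI, 3 nodes x 8 cores", (2, 41, 27.98), (4, 21, 0.24)),
    ("LAM-MPI, 3 nodes x 8 cores", (1, 51, 23.49), (7, 28, 2.23)),
    ("LAM-MPI, 6 nodes x 4 cores", (0, 51, 50.41), (6, 6, 38.67)),
    ("LAM-MPI 7.1.3, other cluster, 6 x 4", (1, 30, 37.25), (1, 51, 10.00)),
]


def cpu_fraction(cpu, elapsed):
    """Fraction of elapsed (wall-clock) time accounted for by CPU time."""
    return to_seconds(*cpu) / to_seconds(*elapsed)


for label, cpu, elapsed in RUNS:
    print(f"{label:38s} CPU/elapsed = {cpu_fraction(cpu, elapsed):.2f}")
```

Under that assumption the LAM-MPI runs on this cluster spend roughly a 
quarter or less of the elapsed time computing, against roughly 0.8 on the 
other cluster, so most of the time goes to waiting, not calculation.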

Regards,

Mattijs

On Wednesday 04 March 2009 05:46, Sangamesh B wrote:
> Hi all,
>
> LAM-MPI is now also installed, and I tested the Fortran application by
> running it with LAM-MPI.
>
> But LAM-MPI performs even worse than Open MPI.
>
> No of nodes:3 cores per node:8  total core: 3*8=24
>
>        CPU TIME :    1 HOURS 51 MINUTES 23.49 SECONDS
>    ELAPSED TIME :    7 HOURS 28 MINUTES  2.23 SECONDS
>
> No of nodes:6  cores used per node:4  total core: 6*4=24
>
>        CPU TIME :    0 HOURS 51 MINUTES 50.41 SECONDS
>    ELAPSED TIME :    6 HOURS  6 MINUTES 38.67 SECONDS
>
> Any help/suggestions to diagnose this problem?
>
> Thanks,
> Sangamesh
>
> On Wed, Feb 25, 2009 at 12:51 PM, Sangamesh B <forum....@gmail.com> wrote:
> > Dear All,
> >
> >    A Fortran application is installed with Open MPI-1.3 + Intel
> > compilers on a Rocks-4.3 cluster with Intel Xeon dual-socket quad-core
> > processors @ 3GHz (8 cores/node).
> >
> >    The times consumed for different tests over Gigabit-connected
> > nodes are as follows (each node has 8 GB memory):
> >
> > No of Nodes used:6  No of cores used/node:4 total mpi processes:24
> >       CPU TIME :    1 HOURS 19 MINUTES 14.39 SECONDS
> >   ELAPSED TIME :    2 HOURS 41 MINUTES  8.55 SECONDS
> >
> > No of Nodes used:6  No of cores used/node:8 total mpi processes:48
> >       CPU TIME :    4 HOURS 19 MINUTES 19.29 SECONDS
> >   ELAPSED TIME :    9 HOURS 15 MINUTES 46.39 SECONDS
> >
> > No of Nodes used:3  No of cores used/node:8 total mpi processes:24
> >       CPU TIME :    2 HOURS 41 MINUTES 27.98 SECONDS
> >   ELAPSED TIME :    4 HOURS 21 MINUTES  0.24 SECONDS
> >
> > But the same application performs well on another Linux cluster with
> > LAM-MPI-7.1.3
> >
> > No of Nodes used:6  No of cores used/node:4 total mpi processes:24
> > CPU TIME :    1hours:30min:37.25s
> > ELAPSED TIME  1hours:51min:10.00S
> >
> > No of Nodes used:12  No of cores used/node:4 total mpi processes:48
> > CPU TIME :    0hours:46min:13.98s
> > ELAPSED TIME  1hours:02min:26.11s
> >
> > No of Nodes used:6  No of cores used/node:8 total mpi processes:48
> > CPU TIME :     1hours:13min:09.17s
> > ELAPSED TIME  1hours:47min:14.04s
> >
> > So there is a huge difference between CPU TIME & ELAPSED TIME for Open
> > MPI jobs.
> >
> > Note: On the same cluster Open MPI gives better performance for
> > InfiniBand nodes.
> >
> > What could be the problem for Open MPI over Gigabit?
> > Do any flags need to be used?
> > Or is it not that good to use Open MPI on Gigabit?
> >
> > Thanks,
> > Sangamesh
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users

-- 

Mattijs Janssens

OpenCFD Ltd.
9 Albert Road,
Caversham,
Reading RG4 7AN.
Tel: +44 (0)118 9471030
Email: m.janss...@opencfd.co.uk
URL: http://www.OpenCFD.co.uk
