[OMPI users] Bad performance when scattering big size of data?

2010-10-04 Thread Storm Zhang
We have 64 compute nodes which are dual quad-core with hyperthreaded CPUs, so we have 1024 compute units shown in the ROCKS 5.3 system. I'm trying to scatter an array from the master node to the compute nodes, using C++ compiled with mpiCC and launched with mpirun. Here is my test: the array size is 18KB * Number of
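The pattern described here (the root scattering a fixed-size chunk to every rank) is what MPI_Scatter/MPI_Scatterv does. As a standalone sketch (no MPI calls, so it runs anywhere), this is the count/displacement bookkeeping an MPI_Scatterv would need when the array does not divide evenly across ranks; the names `Layout` and `split` are illustrative, not from the original post:

```cpp
#include <vector>

// Counts and displacements for an MPI_Scatterv-style split of `total`
// elements over `nranks` ranks: the first (total % nranks) ranks get
// one extra element, and displacements are running offsets.
struct Layout { std::vector<int> counts, displs; };

Layout split(int total, int nranks) {
    Layout l;
    l.counts.resize(nranks);
    l.displs.resize(nranks);
    int base = total / nranks, extra = total % nranks, off = 0;
    for (int r = 0; r < nranks; ++r) {
        l.counts[r] = base + (r < extra ? 1 : 0);
        l.displs[r] = off;
        off += l.counts[r];
    }
    return l;
}
```

With equal chunks, as in the post (every rank receives the same 18 KB), plain MPI_Scatter with a single sendcount suffices; Scatterv and this bookkeeping only matter when the division is uneven.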

Re: [OMPI users] Bad performance when scattering big size of data?

2010-10-04 Thread Storm Zhang
surprised that you see a performance hit when requesting > 512 compute units. We should really get input from a hyperthreading expert, preferably from Intel. > Doug Reeder > On Oct 4, 2010, at 9:53 AM, Storm Zhang wrote: > > We have 64 compute nodes which are dual quad-co

Re: [OMPI users] Bad performance when scattering big size of data?

2010-10-04 Thread Storm Zhang
requesting > 512 compute units. We should really get input from a hyperthreading expert, preferably from Intel. > Doug Reeder > On Oct 4, 2010, at 9:53 AM, Storm Zhang wrote: > > We have 64 compute nodes which are dual quad-core with hyperthreaded CPUs. > So we hav

Re: [OMPI users] Bad performance when scattering big size of data?

2010-10-04 Thread Storm Zhang
ude eth0 -np 600 -bind-to-core scatttest Thank you very much. Linbao On Mon, Oct 4, 2010 at 4:42 PM, Ralph Castain <r...@open-mpi.org> wrote: > > On Oct 4, 2010, at 1:48 PM, Storm Zhang wrote: > > Thanks a lot, Ralph. As I said, I also tried to use SGE (also showing 1024

Re: [OMPI users] Bad performance when scattering big size of data?

2010-10-05 Thread Storm Zhang
but could not find the bind-to-core info. I only see the bynode or byslot options. Are they the same as bind-to-core? My mpirun shows version 1.3.3 but ompi_info shows 1.4.2. Thanks a lot. Linbao On Mon, Oct 4, 2010 at 9:18 PM, Eugene Loh <eugene@oracle.com> wrote: > Storm Zhang wrote:

[OMPI users] Question about MPI_Barrier

2010-10-20 Thread Storm Zhang
Dear all, I got confused with my recent C++ MPI program's behavior. I have an MPI program in which I use clock() to measure the time spent between two MPI_Barrier calls, just like this: MPI::COMM_WORLD.Barrier(); if(rank == master) t1 = clock(); "code A"; MPI::COMM_WORLD.Barrier(); if(rank ==

Re: [OMPI users] Question about MPI_Barrier

2010-10-21 Thread Storm Zhang
should use MPI_Wtime(), not clock(). > regards > jody > > On Wed, Oct 20, 2010 at 11:51 PM, Storm Zhang <storm...@gmail.com> wrote: > > Dear all, > > I got confused with my recent C++ MPI program's behavior. I have an MPI > > program in which
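The reason behind jody's advice: clock() measures CPU time charged to the process, while MPI_Wtime() returns elapsed wall-clock time, and a rank blocked in MPI_Barrier accrues wall time but almost no CPU time. A minimal standalone illustration, using std::chrono::steady_clock as a stand-in for MPI_Wtime() and a sleep as a stand-in for time spent blocked (the function name is illustrative):

```cpp
#include <chrono>
#include <ctime>
#include <thread>
#include <utility>

// clock() advances only while this process burns CPU; wall-clock time
// (what MPI_Wtime() reports) advances regardless, e.g. while a rank is
// blocked in MPI_Barrier. A ~200 ms sleep stands in for being blocked.
std::pair<double, double> cpu_vs_wall_during_sleep() {
    std::clock_t c0 = std::clock();
    auto w0 = std::chrono::steady_clock::now();
    std::this_thread::sleep_for(std::chrono::milliseconds(200));
    double cpu  = double(std::clock() - c0) / CLOCKS_PER_SEC;
    double wall = std::chrono::duration<double>(
                      std::chrono::steady_clock::now() - w0).count();
    return {cpu, wall};
}
```

If "code A" contains blocking MPI calls, clock() under-reports in exactly this way, which is why MPI_Wtime() (or a C++11 wall clock) is the right tool for the measurement described in the question.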

Re: [OMPI users] Question about MPI_Barrier

2010-10-21 Thread Storm Zhang
*probably* has no effect on the time spent between t1 and t2. But extraneous effects might cause it to do so -- e.g., are you running in an oversubscribed scenario? And so on. > No. We have 1024 nodes available and I'm using 500. > > On Oct 21, 2010, at 9:24 AM, Storm Zhang wrote:

Re: [OMPI users] Question about MPI_Barrier

2010-10-21 Thread Storm Zhang
Hi Eugene, you said: "The bottom line here is that from a causal point of view it would seem that B should not impact the timings. Presumably, some other variable is actually responsible here." Could you explain the second sentence in more detail? Thanks a lot. Linbao On Thu, Oct 21,

Re: [OMPI users] Question about MPI_Barrier

2010-10-21 Thread Storm Zhang
I have seen cases where the presence or absence of code that isn't executed can influence timings (perhaps because code will come out of the instruction cache differently), but all that is speculation. It's all a guess that what you're really seeing isn't really MPI related at a