Re: [OMPI users] MPI_Bcast issue

2010-08-11 Thread Randolph Pullen
Interesting point. --- On Thu, 12/8/10, Ashley Pittman wrote: On 11 Aug 2010, at 05:10, Randolph

Re: [OMPI users] MPI_Bcast issue

2010-08-11 Thread Randolph Pullen
I (a single user) am running N separate MPI applications doing 1-to-N broadcasts over PVM; each MPI application is started on each machine simultaneously by PVM - the reasons are back in the post history. The problem is that they somehow collide - yes, I know this should not happen, the

Re: [OMPI users] Hyper-thread architecture effect on MPI jobs

2010-08-11 Thread Eugene Loh
The way MPI processes are being assigned to hardware threads is perhaps neither controlled nor optimal. On the HT nodes, two processes may end up sharing the same core, with poorer performance. Try submitting your job like this:
% cat myrankfile1
rank 0=os223 slot=0
rank 1=os221 slot=0
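A complete version of that rankfile for all four hosts, plus the matching launch command, might look like this (ranks 2 and 3, the host order, and the executable name are assumptions for illustration):

    % cat myrankfile1
    rank 0=os223 slot=0
    rank 1=os221 slot=0
    rank 2=os222 slot=0
    rank 3=os224 slot=0
    % mpirun -np 4 -rf myrankfile1 ./my_mpi_app

Pinning each rank to slot 0 of a different host keeps two ranks from landing on hyperthread siblings of the same core.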

Re: [OMPI users] Hyper-thread architecture effect on MPI jobs

2010-08-11 Thread Gus Correa
Hi Saygin,
You could:
1) turn off hyperthreading (in the BIOS), or
2) use the mpirun options (you didn't send your mpirun command) to distribute the processes across the nodes, cores, etc. "man mpirun" is a good resource; see the explanations of the -byslot, -bynode, and -loadbalance options, and the example below.
3) In
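For instance, a by-node launch spreads consecutive ranks round-robin across machines instead of packing them onto one node (the hostfile name and executable are placeholders):

    % mpirun -np 8 -bynode -hostfile myhosts ./my_mpi_app

With four quad-core hosts in the hostfile, this places two ranks per machine before any core has to be shared.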

Re: [OMPI users] Hyper-thread architecture effect on MPI jobs

2010-08-11 Thread pooja varshneya
Saygin, You can use the mpstat tool to see the load on each core at runtime. Do you know exactly which calls are taking longer? You can run just those two computations (one at a time) on a different machine and check whether the other machines show similar or lower computation times. -
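For example, with the sysstat package installed (the one-second sampling interval is illustrative):

    % mpstat -P ALL 1

This prints one line per logical CPU each second; two busy ranks sitting on logical CPUs that are hyperthread siblings of the same core (the core id and physical id fields in /proc/cpuinfo identify siblings) would point to the placement issue discussed above.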

[OMPI users] Hyper-thread architecture effect on MPI jobs

2010-08-11 Thread Saygin Arkan
Hello, I'm running MPI jobs on a non-homogeneous cluster. Four of my machines (os221, os222, os223, os224) have the following properties:
vendor_id : GenuineIntel
cpu family : 6
model : 23
model name : Intel(R) Core(TM)2 Quad CPU Q9300 @ 2.50GHz
stepping : 7
cache

Re: [OMPI users] MPI_Bcast issue

2010-08-11 Thread Ashley Pittman
On 11 Aug 2010, at 05:10, Randolph Pullen wrote:
> Sure, but broadcasts are faster - less reliable apparently, but much faster for large clusters.
Going off-topic here but I think it's worth saying: If you have a dataset that requires collective communication then use the function call
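For the N-to-N exchange discussed in this thread, the collective in question would be MPI_Allgather (or MPI_Alltoall for personalized exchanges). A minimal sketch of one call replacing N separate broadcasts; the buffer size and contents are illustrative:

    #include <mpi.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        const int count = 1024;  /* elements contributed by each rank */
        int *mine = malloc(count * sizeof(int));
        int *all  = malloc((size_t)count * size * sizeof(int));
        for (int i = 0; i < count; i++)
            mine[i] = rank;      /* stand-in for real data */

        /* Every rank receives every rank's buffer: the same end state as
           N broadcasts, but in one call the library is free to optimize. */
        MPI_Allgather(mine, count, MPI_INT,
                      all, count, MPI_INT, MPI_COMM_WORLD);

        free(mine);
        free(all);
        MPI_Finalize();
        return 0;
    }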

Re: [OMPI users] MPI_Bcast issue

2010-08-11 Thread Jeff Squyres
On Aug 11, 2010, at 12:10 AM, Randolph Pullen wrote:
> Sure, but broadcasts are faster - less reliable apparently, but much faster for large clusters.
Just to be totally clear: MPI_BCAST is defined to be "reliable", in the sense that it will complete or invoke an error (vs. unreliable data
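A small sketch of that "complete or invoke an error" contract, using the standard MPI_ERRORS_RETURN error handler so failures surface as return codes (this pattern is illustrative, not from the thread):

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        /* Return error codes instead of aborting the job. */
        MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);

        int value = 42;
        int err = MPI_Bcast(&value, 1, MPI_INT, 0, MPI_COMM_WORLD);
        if (err != MPI_SUCCESS) {
            char msg[MPI_MAX_ERROR_STRING];
            int len;
            MPI_Error_string(err, msg, &len);
            fprintf(stderr, "MPI_Bcast failed: %s\n", msg);
        }
        /* On success, every rank now holds the root's value. */

        MPI_Finalize();
        return 0;
    }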

Re: [OMPI users] MPI_Bcast issue

2010-08-11 Thread Jeff Squyres
On Aug 11, 2010, at 9:54 AM, Jeff Squyres wrote:
> (I'll say that OMPI's ALLGATHER algorithm is probably not well optimized for massive data transfers like you describe)
Wrong wrong wrong -- I should have checked the code before sending. I made the incorrect assumption that OMPI still only

Re: [OMPI users] MPI_Bcast issue

2010-08-11 Thread Jeff Squyres
On Aug 10, 2010, at 10:09 PM, Randolph Pullen wrote:
> Jeff thanks for the clarification,
> What I am trying to do is run N concurrent copies of a 1-to-N data movement program to effect an N-to-N solution. The actual mechanism I am using is to spawn N copies of mpirun from PVM across the

Re: [OMPI users] MPI_Bcast issue

2010-08-11 Thread Richard Treumann
Randolph, I am confused about using multiple, concurrent mpirun operations. If there are M uses of mpirun and each starts N tasks (carried out under PVM or any other way), I would expect you to have M completely independent MPI jobs with N tasks (processes) each. You could have some root in

Re: [OMPI users] MPI_Bcast issue

2010-08-11 Thread Randolph Pullen
Sure, but broadcasts are faster - less reliable apparently, but much faster for large clusters. Jeff says that all Open MPI collective calls are implemented with point-to-point, tree-style communications of log N transmissions, so I guess that alltoall would be N log N. --- On Wed, 11/8/10, Terry Frankcombe
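Spelled out, that back-of-the-envelope count (assuming a binary tree; not a statement about Open MPI's actual algorithm choice) is:

    \[
      T_{\mathrm{bcast}} = O(\log_2 N), \qquad
      T_{N\ \mathrm{bcasts}} = O(N \log_2 N)
    \]

For example, at N = 32 one broadcast takes about log2(32) = 5 steps, while 32 concurrent broadcasts generate on the order of 32 x 5 = 160 point-to-point transmissions.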

Re: [OMPI users] MPI_Bcast issue

2010-08-11 Thread Terry Frankcombe
On Tue, 2010-08-10 at 19:09 -0700, Randolph Pullen wrote:
> Jeff thanks for the clarification,
> What I am trying to do is run N concurrent copies of a 1-to-N data movement program to effect an N-to-N solution.
I'm no MPI guru, nor do I completely understand what you are doing, but isn't this