Re: [OMPI users] Proper way to throw an error to all nodes?

2008-08-18 Thread Robert Kubrick
A question related to an old thread: in the case of solution 2), how do you broadcast 'flags' to the slaves if they're processing asynchronous data? I understand MPI_Bcast is a collective operation requiring all processes in a communicator to call it before it completes. If the slaves are proces...
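
One common pattern for this (a minimal sketch, not necessarily the thread's "solution 2"): the master pushes the flag to each slave with a point-to-point send on a reserved tag, and each slave polls for it with MPI_Iprobe between units of asynchronous work, so no collective call is required. TAG_CTRL is an arbitrary tag chosen for this example:

    #include <mpi.h>

    #define TAG_CTRL 999   /* reserved control tag (arbitrary choice) */

    int main(int argc, char **argv)
    {
        int rank, size, flag = 0, stop = 0;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        if (rank == 0) {                /* master: signal every slave */
            int i, one = 1;
            for (i = 1; i < size; i++)
                MPI_Send(&one, 1, MPI_INT, i, TAG_CTRL, MPI_COMM_WORLD);
        } else {                        /* slave: poll between work items */
            while (!stop) {
                /* ... process one chunk of asynchronous data here ... */
                MPI_Iprobe(0, TAG_CTRL, MPI_COMM_WORLD, &flag,
                           MPI_STATUS_IGNORE);
                if (flag)
                    MPI_Recv(&stop, 1, MPI_INT, 0, TAG_CTRL,
                             MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            }
        }
        MPI_Finalize();
        return 0;
    }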

Re: [OMPI users] MPI_ERR_TRUNCATE with MPI_Recv without Infinipath

2008-08-18 Thread George Bosilca
Tom, this makes perfect sense. However, the fact that one of the network devices (BTL in Open MPI terms) is not available at runtime should not modify the behavior of the application. At least this is the theory :) Changing from named receives to unnamed ones definitely modifies the signat...
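
A hedged reproducer of the mismatch George describes (not Tom's actual code): a receive posted with wildcards can match a message larger than its buffer, at which point the library raises MPI_ERR_TRUNCATE. Whether a given receive matches the "wrong" message can depend on which BTLs are active and on timing:

    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank, big[100] = {0}, small[10];
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        if (rank == 0) {
            MPI_Send(big, 100, MPI_INT, 1, 42, MPI_COMM_WORLD);
        } else if (rank == 1) {
            /* A named receive for (source=0, tag=43) would never match
               the 100-int message; this wildcard receive does, and its
               10-int buffer is too small -> MPI_ERR_TRUNCATE. */
            MPI_Recv(small, 10, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        }
        MPI_Finalize();
        return 0;
    }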

Re: [OMPI users] MPI_ERR_TRUNCATE with MPI_Recv without Infinipath

2008-08-18 Thread Tom Riddle
Thanks George, I will update and try the latest repo. However, I'd like to describe our use case a bit more to see if there is something that may not be proper in our development approach. Forgive me if this is repetitious... We have configured and built Open MPI originally on a machine with Inf...
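
Tom's exact build line isn't quoted in this digest; a typical sequence for a build done on an InfiniPath (PSM) machine and later run on nodes without that hardware might look like the following (paths and the application name are placeholders):

    ./configure --prefix=$HOME/openmpi --with-psm
    make all install
    # forcing the TCP BTL on nodes that lack InfiniPath:
    mpirun --mca btl tcp,sm,self -np 4 ./app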

Re: [OMPI users] TCP Bandwidth

2008-08-18 Thread George Bosilca
Unfortunately, I can hardly imagine where the performance problems are coming from. Usually I get more than 97% of the raw TCP performance with Open MPI. There are two parameters that can slightly improve the behavior: btl_tcp_rdma_pipeline_send_length and btl_tcp_min_rdma_pipeline_size.
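
For anyone wanting to experiment, both parameters can be set on the mpirun command line; the byte values below are placeholders to sweep, not recommendations, and ./bandwidth_test stands in for whatever benchmark you use:

    mpirun --mca btl_tcp_rdma_pipeline_send_length 262144 \
           --mca btl_tcp_min_rdma_pipeline_size   131072 \
           -np 2 ./bandwidth_test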

Re: [OMPI users] Continuous poll/select using btl sm (svn 1.4a1r18899)

2008-08-18 Thread Mostyn Lewis
George, I'm glad you changed the scheduling and my program seems to work. Thank you. However, to stress it a bit more I changed #define NUM_ITERS 1000 to #define NUM_ITERS 10 and it glues up at around ~30k. Please try it and see. Regards, Mostyn Mostyn, There was a problem with the SM...
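
Mostyn's test program isn't attached to this digest; the shape of such a stress test is simply a tight ping-pong loop over the sm BTL, along these lines (the iteration count here is arbitrary):

    #include <mpi.h>

    #define NUM_ITERS 100000   /* arbitrary; the thread varies this value */

    int main(int argc, char **argv)
    {
        int rank, i, buf = 0;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        for (i = 0; i < NUM_ITERS; i++) {   /* ping-pong ranks 0 <-> 1 */
            if (rank == 0) {
                MPI_Send(&buf, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(&buf, 1, MPI_INT, 1, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
            } else if (rank == 1) {
                MPI_Recv(&buf, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                MPI_Send(&buf, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
            }
        }
        MPI_Finalize();
        return 0;
    }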

Re: [OMPI users] memory leak in alltoallw

2008-08-18 Thread Dave Grote
Great! Thanks for the fix.    Dave Tim Mattox wrote: The fix for this bug is in the 1.2 branch as of r19360, and will be in the upcoming 1.2.7 release. On Sun, Aug 17, 2008 at 6:10 PM, George Bosilca wrote: Dave, Thanks for your report. As you discovered, we had a memory leak i...

Re: [OMPI users] TCP Bandwidth

2008-08-18 Thread Steve Wise
Andy Georgi wrote: Steve Wise wrote: Are you using Chelsio's TOE drivers? Or just a driver from the distro? We use the Chelsio TOE drivers. Steve Wise wrote: Ok. Did you run their perftune.sh script? Yes, otherwise we wouldn't get the 1.15 GB/s at the TCP level. We had ~800 MB/s before p...

Re: [OMPI users] TCP Bandwidth

2008-08-18 Thread Steve Wise
Jon Mason wrote: On Mon, Aug 18, 2008 at 10:00:24AM +0200, Andy Georgi wrote: Steve Wise wrote: Are you using Chelsio's TOE drivers? Or just a driver from the distro? We use the Chelsio TOE drivers. Steve Wise wrote: Ok. Did you run their perftune.sh script? T...

Re: [OMPI users] TCP Bandwidth

2008-08-18 Thread Jon Mason
On Mon, Aug 18, 2008 at 10:00:24AM +0200, Andy Georgi wrote: > Steve Wise wrote: >> Are you using Chelsio's TOE drivers? Or just a driver from the distro? > > We use the Chelsio TOE drivers. > > > Steve Wise wrote: >> Ok. Did you run their perftune.sh script? That script should optimally tune yo...

Re: [OMPI users] memory leak in alltoallw

2008-08-18 Thread Tim Mattox
The fix for this bug is in the 1.2 branch as of r19360, and will be in the upcoming 1.2.7 release. On Sun, Aug 17, 2008 at 6:10 PM, George Bosilca wrote: > Dave, > > Thanks for your report. As you discovered, we had a memory leak in the > MPI_Alltoallw. A very small one, but it was there. Basicall...
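
The leaking call pattern isn't quoted in full here; a hypothetical reproducer is simply MPI_Alltoallw in a loop, watching resident memory grow on pre-fix builds (buffer sizes and iteration count are arbitrary):

    #include <mpi.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        int size, i, it;
        MPI_Init(&argc, &argv);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        int *counts = malloc(size * sizeof(int));
        int *displs = malloc(size * sizeof(int));
        MPI_Datatype *types = malloc(size * sizeof(MPI_Datatype));
        int *sbuf = calloc(size, sizeof(int));
        int *rbuf = calloc(size, sizeof(int));
        for (i = 0; i < size; i++) {
            counts[i] = 1;
            displs[i] = i * (int)sizeof(int);  /* byte displacements */
            types[i]  = MPI_INT;
        }
        for (it = 0; it < 10000; it++)   /* leak accumulated per call */
            MPI_Alltoallw(sbuf, counts, displs, types,
                          rbuf, counts, displs, types, MPI_COMM_WORLD);
        free(counts); free(displs); free(types); free(sbuf); free(rbuf);
        MPI_Finalize();
        return 0;
    }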

Re: [OMPI users] Q: OpenMPI's use of /tmp and hanging apps via FS problems?

2008-08-18 Thread Ralph Castain
Hi Brian, On Aug 16, 2008, at 1:40 PM, Brian Dobbins wrote: Hi guys, I was hoping someone here could shed some light on Open MPI's use of /tmp (or, I guess, TMPDIR) and save me from diving into the source.. ;) The background is that I'm trying to run some applications on a system whi...
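
For reference, two common ways to move Open MPI's session directory off a problematic filesystem (the paths below are placeholders): set TMPDIR in the environment, or set the MCA parameter controlling the session-directory base:

    export TMPDIR=/scratch/$USER/ompi-tmp
    mpirun -np 16 ./app
    # or equivalently:
    mpirun --mca orte_tmpdir_base /scratch/$USER/ompi-tmp -np 16 ./app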

Re: [OMPI users] problem with alltoall with ppn=8

2008-08-18 Thread Rolf Vandevaart
Ashley Pittman wrote: On Sat, 2008-08-16 at 08:03 -0400, Jeff Squyres wrote: - large all-to-all operations are very stressful on the network, even if you have very low latency / high bandwidth networking such as DDR IB - if you only have 1 IB HCA in a machine with 8 cores, the problem becom...
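
The stressful pattern under discussion is essentially a large MPI_Alltoall issued by 8 ranks per node through a single HCA; a minimal sketch follows (the per-peer message size is an arbitrary example):

    #include <mpi.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        int size;
        const int per_peer = 1 << 20;   /* 1 MiB per peer, illustrative */
        MPI_Init(&argc, &argv);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        char *sbuf = calloc((size_t)size, per_peer);
        char *rbuf = calloc((size_t)size, per_peer);
        MPI_Alltoall(sbuf, per_peer, MPI_CHAR,
                     rbuf, per_peer, MPI_CHAR, MPI_COMM_WORLD);
        free(sbuf); free(rbuf);
        MPI_Finalize();
        return 0;
    }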

Re: [OMPI users] TCP Bandwidth

2008-08-18 Thread Andy Georgi
Steve Wise wrote: Are you using Chelsio's TOE drivers? Or just a driver from the distro? We use the Chelsio TOE drivers. Steve Wise wrote: Ok. Did you run their perftune.sh script? Yes, otherwise we wouldn't get the 1.15 GB/s at the TCP level. We had ~800 MB/s before primarily because of t...
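
The contents of Chelsio's perftune.sh aren't reproduced in this thread; scripts of that kind typically enlarge kernel socket buffers along these lines (illustrative values only, not Chelsio's actual settings):

    sysctl -w net.core.rmem_max=16777216
    sysctl -w net.core.wmem_max=16777216
    sysctl -w net.ipv4.tcp_rmem="4096 87380 16777216"
    sysctl -w net.ipv4.tcp_wmem="4096 65536 16777216"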