Wow; I should point out an amazing coincidence here. Doug Eadline
used [almost] exactly the same analogy that I did (truck vs. F1) in a
column that was published today in Linux Magazine:
http://www.linux-mag.com/id/7534
I swear I didn't read his column before I posted my answer
Jonathan Dursi wrote:
Continuing the conversation with myself:
Sorry to interrupt... :^)
Okay, I managed to reproduce the hang. I'll try to look at this.
Google pointed me to Trac ticket #1944, which spoke of deadlocks in
looped collective operations; there is no collective operation
On Wednesday 23 September 2009, Rahul Nabar wrote:
> On Tue, Aug 18, 2009 at 5:28 PM, Gerry Creager wrote:
> > Most of that bandwidth is in marketing... Sorry, but it's not a high
> > performance switch.
>
> Well, how does one figure out what exactly is a "high
On Sep 23, 2009, at 10:15 AM, Dave Love wrote:
So, how does one go about selecting a good switch? "The most expensive
the better" is a somewhat unsatisfying option!
Also it's apparently not always right
+1 on Dave's and Joe's comments.
For example, not all of Cisco's switches are
Rahul Nabar writes:
> So, how does one go about selecting a good switch? "The most expensive
> the better" is a somewhat unsatisfying option!
Also it's apparently not always right, if I recall correctly, according
to the figures on MPI switch performance in the reports
How did you configure Open MPI? Is your application using SIGUSR1?
This error message indicates that Open MPI's daemons could not
communicate with the application processes. The daemons send SIGUSR1
to the process to initiate the handshake (you can change this signal
with -mca
Unfortunately I cannot provide a precise time frame for availability
at this point, but we are targeting the v1.5 release series. There is
a handful of core developers working on this issue at the moment.
Pieces of this work have already made it into the Open MPI
development trunk. If you
This is described in the C/R User's Guide attached to the webpage below:
https://svn.open-mpi.org/trac/ompi/wiki/ProcessFT_CR
Additionally this has been addressed on the users mailing list in the
past, so searching around will likely turn up some examples.
-- Josh
On Sep 18, 2009, at
Hi, Eugene:
If it continues to be a problem for people to reproduce this, I'll see
what can be done about having an account made here for someone to poke
around. Alternately, any suggestions for tests that I can do to help
diagnose/verify the problem, or figure out what's different about
Rahul Nabar wrote:
On Tue, Aug 18, 2009 at 5:28 PM, Gerry Creager wrote:
Most of that bandwidth is in marketing... Sorry, but it's not a high
performance switch.
Well, how does one figure out what exactly is a "high performance
switch"? I've found this an exceedingly
Jonathan Dursi wrote:
Continuing the conversation with myself:
Google pointed me to Trac ticket #1944, which spoke of deadlocks in
looped collective operations; there is no collective operation
anywhere in this sample code, but trying one of the suggested
workarounds/clues: that is,
On Tue, Aug 18, 2009 at 5:28 PM, Gerry Creager wrote:
> Most of that bandwidth is in marketing... Sorry, but it's not a high
> performance switch.
Well, how does one figure out what exactly is a "high performance
switch"? I've found this an exceedingly hard task. Like the
12 matches