Re: [OMPI users] GM + OpenMPI bug ...

2010-05-21 Thread Patrick Geoffray
Hi Jose, On 5/21/2010 6:54 AM, José Ignacio Aliaga Estellés wrote: We have used the lspci -vvxxx and we have obtained: bi00: 04:01.0 Ethernet controller: Intel Corporation 82544EI Gigabit Ethernet Controller (Copper) (rev 02) This is the output for the Intel GigE NIC, you should look at the

Re: [OMPI users] GM + OpenMPI bug ...

2010-05-20 Thread Patrick Geoffray
to reset each NIC in its PCI slot, or use a different slot if available. Hope it helps. Patrick -- Patrick Geoffray Myricom, Inc. http://www.myri.com

Re: [OMPI users] very bad parallel scaling of vasp using openmpi

2009-08-18 Thread Patrick Geoffray
Craig Plaisance wrote: So is this a problem with the physical switch (we need a better switch) or with the configuration of the switch (we need to configure the switch or configure the os to work with the switch)? You may want to check whether you are dropping packets somewhere. You can look at

Re: [OMPI users] How to make a job abort when one host dies?

2009-08-18 Thread Patrick Geoffray
to terminate the job when a send timeout has occurred. We will implement this mechanism and push it on the trunk shortly. Thanks Patrick -- Patrick Geoffray Myricom, Inc. http://www.myri.com

Re: [OMPI users] ga-4.1 on mx segmentation violation

2008-10-22 Thread Patrick Geoffray
SLIM H.A. wrote: I have built the release candidate for ga-4.1 with OpenMPI 1.2.3 and portland compilers 7.0.2 for Myrinet mx. Which version of ARMCI and MX? ARMCI configured for 3 cluster nodes. Network protocol is 'MPI-SPAWN'. 0:Segmentation Violation error, status=: 11 0:ARMCI DASSERT

Re: [OMPI users] using OpenMPI + SGE in a heterogeneous network

2008-06-06 Thread Patrick Geoffray
SLIM H.A. wrote: I would be grateful for any advice Just to check, you are not using the MTL for MX, right? Only the BTL interface allows choosing between several devices at run time. Patrick

Re: [OMPI users] tg3 module

2008-06-04 Thread Patrick Geoffray
Hi Leonardo, Leonardo Fialho wrote: NETDEV WATCHDOG: eth0: transmit timed out tg3: eth0: transmit timed out, resetting tg3: tg3_stop_block timed out, ofs=2c00 enable_bit=2 tg3: tg3_stop_block timed out, ofs=4800 enable_bit=2 tg3: eth0: Link is down. tg3: eth0: Link is up at 1000 Mbps, full

Re: [OMPI users] equivalent to mpichgm --gm-recv blocking?

2008-03-18 Thread Patrick Geoffray
Hi Greg, Siekas, Greg wrote: Is it possible to get the same blocking behavior with openmpi? I'm having a difficult time getting this to work properly. The application is spinning on sched_yield which takes up a cpu core. Per its design, OpenMPI cannot block. sched_yield is all it can do to
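To illustrate what "spinning on sched_yield" means in the message above, here is a minimal sketch in Python. It is not taken from Open MPI and does not reflect how its progress engine is written; poll_until and the arrived event are hypothetical names chosen for the example. The loop keeps checking for completion and only offers the CPU to other runnable threads between checks, so on an otherwise idle core it still shows up as fully busy.

```python
import os
import threading
import time

# Assumption: fall back to a zero-length sleep on platforms
# where os.sched_yield is not available.
_yield_cpu = getattr(os, "sched_yield", lambda: time.sleep(0))


def poll_until(is_done):
    """Spin until is_done() returns true, yielding the CPU between checks.

    Returns the number of polls that found the condition still false.
    """
    polls = 0
    while not is_done():
        _yield_cpu()  # gives up the time slice but never sleeps
        polls += 1
    return polls


# Stand-in for an incoming message: the event is set after 50 ms.
arrived = threading.Event()
threading.Timer(0.05, arrived.set).start()

polls = poll_until(arrived.is_set)
print("message arrived after", polls, "polls")
```

A blocking receive would instead call something like arrived.wait(), which puts the thread to sleep until it is woken and leaves the core free in the meantime.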

Re: [OMPI users] openmpi credits for eager messages

2008-02-04 Thread Patrick Geoffray
Brightwell, Ronald wrote: I'm looking at a network where the number of endpoints is large enough that everybody can't have a credit to start with, and the "offender" isn't any single process, but rather a combination of processes doing N-to-1 where N is sufficiently large. I can't just tell one

Re: [OMPI users] openmpi credits for eager messages

2008-02-04 Thread Patrick Geoffray
Ron, Brightwell, Ronald wrote: Not to muddy the point, but if there's enough ambiguity in the Standard for people to ignore the progress rule, then I think (hope) there's enough ambiguity for people to ignore the sender throttling issue too ;) I understand your position, and I used to agree

Re: [OMPI users] mixed myrinet/non-myrinet nodes

2008-01-15 Thread Patrick Geoffray
Hi Matt, M Jones wrote: I thought that we would be able to use a single open-mpi build to support both networks (and users would be able to request mx nodes if they need them using the batch queuing system, which they are already accustomed to). Am I missing something (or just doing I don't

Re: [OMPI users] How to set paffinity on a multi-cpu node?

2006-12-02 Thread Patrick Geoffray
flat access time to the South Bridge, but cache locality is still important so CPU affinity is always a good thing to do. Patrick -- Patrick Geoffray Myricom, Inc. http://www.myri.com

Re: [O-MPI users] LAM vs OPENMPI performance

2006-01-04 Thread Patrick Geoffray
of time when you don't have several processors. importantly, how do I reconfigure OPENMPI to match the LAM performance. Try disabling the shared memory device in OpenMPI. Unfortunately, I have no clue how to do it. Patrick -- Patrick Geoffray Myricom, Inc. http://www.myri.com

Re: [O-MPI users] [Beowulf] MPI ABI

2005-10-10 Thread Patrick Geoffray
several ABIs at runtime. Politically speaking, 1) will never happen. MorphMPI could do 2), but it's not a silver bullet. Patrick -- Patrick Geoffray Myricom, Inc. http://www.myri.com

[O-MPI users] Re: [Beowulf] Alternative to MPI ABI

2005-03-25 Thread Patrick Geoffray
Greg Lindahl wrote: On Fri, Mar 25, 2005 at 06:03:15PM -0500, Patrick Geoffray wrote: What Jeff thought is a nightmare, I believe, is to have to decide a common interface and then force the MPI implementations to adopt this interface internally instead of having them translating on the fly

[O-MPI users] Re: [Beowulf] Alternative to MPI ABI

2005-03-25 Thread Patrick Geoffray
tion layer, you translate pointers into integers by putting them in a table. You have as much work as your internals are far from the common interface and, hopefully, it will be a midpoint for everybody. Patrick -- Patrick Geoffray Myricom, Inc. http://www.myri.com
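The table-based translation mentioned in the message above can be sketched roughly as follows. This is a minimal illustration in Python, not code from any MPI implementation or from MorphMPI; HandleTable and its method names are hypothetical. Internal objects (the "pointers") are stored in a table, and only the integer index is handed out across the common interface.

```python
_FREE = object()  # sentinel marking an unused slot


class HandleTable:
    """Maps internal objects to small integer handles and back (hypothetical)."""

    def __init__(self):
        self._slots = []  # handle -> internal object, or _FREE
        self._free = []   # released handles available for reuse

    def register(self, obj):
        """Store obj and return the integer handle that now refers to it."""
        if self._free:
            handle = self._free.pop()
            self._slots[handle] = obj
        else:
            handle = len(self._slots)
            self._slots.append(obj)
        return handle

    def lookup(self, handle):
        """Translate an integer handle back to the internal object."""
        if not 0 <= handle < len(self._slots) or self._slots[handle] is _FREE:
            raise KeyError(f"invalid handle: {handle}")
        return self._slots[handle]

    def release(self, handle):
        """Invalidate a handle so its slot can be reused."""
        self.lookup(handle)  # raises KeyError if the handle is not live
        self._slots[handle] = _FREE
        self._free.append(handle)


table = HandleTable()
internal_comm = {"name": "internal communicator object"}
handle = table.register(internal_comm)
assert table.lookup(handle) is internal_comm
```

Every call through the common interface then pays one table lookup, which is the translation cost the message refers to.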