Re: [OMPI users] www.open-mpi.org certificate error?

2016-07-31 Thread Jeff Squyres (jsquyres)
t name (e.g. www.open-mpi.org) and can > > contain wildcards (e.g. *.open-mpi.org) > > so if the first condition is met, then you should be able to reuse the > > certificate that was previously used at UI. > > > > makes sense? > > > > Cheers, > >

Re: [OMPI users] Segmentation fault in OpenMPI 1.8.1

2014-08-15 Thread Jeff Squyres (jsquyres)
On Aug 15, 2014, at 5:39 PM, Maxime Boissonneault wrote: > Correct. > > Can it be because torque (pbs_mom) is not running on the head node and > mpiexec attempts to contact it ? Not for Open MPI's mpiexec, no. Open MPI's mpiexec (mpirun -- they're the

Re: [OMPI users] Segmentation fault in OpenMPI 1.8.1

2014-08-16 Thread Jeff Squyres (jsquyres)
014-08-15 17:50, Jeff Squyres (jsquyres) wrote: >> On Aug 15, 2014, at 5:39 PM, Maxime Boissonneault >> <maxime.boissonnea...@calculquebec.ca> wrote: >> >>> Correct. >>> >>> Can it be because torque (pbs_mom) is not running on the head no

Re: [OMPI users] Intermittent, somewhat architecture-dependent hang with Open MPI 1.8.1

2014-08-16 Thread Jeff Squyres (jsquyres)
Have you tried moving your shared memory backing file directory, like the warning message suggests? I haven't seen a shared memory file on a network share cause correctness issues before (just performance issues), but I could see how that could be in the realm of possibility... Also, are you

Re: [OMPI users] Does multiple Irecv mean concurrent receiving?

2014-08-23 Thread Jeff Squyres (jsquyres)
On Aug 20, 2014, at 3:37 AM, Zhang,Lei(Ecom) wrote: > I have a performance problem with receiving. In a single master thread, I > made several Irecv calls: > > Irecv(buf1, ..., tag, ANY_SOURCE, COMM_WORLD) > Irecv(buf2, ..., tag, ANY_SOURCE, COMM_WORLD) > ... >
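
For readers following the thread, here is a minimal C sketch (not from the original mail) of the pattern being asked about: several receives pre-posted from a single master thread with MPI_ANY_SOURCE, then completed together. The buffer count, message length, and tag are illustrative assumptions; whether the transfers actually overlap depends on the transport and on how often the library can make progress.

    #include <mpi.h>

    #define NBUF 4
    #define LEN  (1 << 20)          /* illustrative message size */

    int main(int argc, char **argv)
    {
        static char buf[NBUF][LEN];
        MPI_Request req[NBUF];
        int rank, size, nrecv, i, tag = 99;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        nrecv = (size - 1 < NBUF) ? size - 1 : NBUF;

        if (rank == 0) {
            /* Pre-post several receives from a single master thread. */
            for (i = 0; i < nrecv; i++)
                MPI_Irecv(buf[i], LEN, MPI_CHAR, MPI_ANY_SOURCE, tag,
                          MPI_COMM_WORLD, &req[i]);
            MPI_Waitall(nrecv, req, MPI_STATUSES_IGNORE);
        } else if (rank <= nrecv) {
            MPI_Send(buf[0], LEN, MPI_CHAR, 0, tag, MPI_COMM_WORLD);
        }

        MPI_Finalize();
        return 0;
    }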

Re: [OMPI users] openmpi-1.8.1 Unable to compile on CentOS6.5

2014-08-26 Thread Jeff Squyres (jsquyres)
Just to elaborate: as the error message implies, this error message was put there specifically to ensure that the Fortran compiler works before continuing any further. If the Fortran compiler is busted, configure exits with this help message. You can either fix your Fortran compiler, or use

Re: [OMPI users] Re: Re: Does multiple Irecv mean concurrent receiving?

2014-08-27 Thread Jeff Squyres (jsquyres)
On Aug 27, 2014, at 9:21 AM, Zhang,Lei(Ecom) wrote: > The problem is that I profiled the receiving node and found that its network > bandwidth is used only less than 50%. How did you profile that? > That's why I want to find ways to increase the receiving throughput. Any

Re: [OMPI users] Open MPI 1.6.5 or 1.8.1 Please respond to swa...@us.ibm.com

2014-09-02 Thread Jeff Squyres (jsquyres)
Please send the information listed here: http://www.open-mpi.org/community/help/ On Sep 2, 2014, at 2:10 PM, Swamy Kandadai wrote: > Hi: > While building OpenMPI (1.6.5 or 1.8.1) using openib on our power8 cluster > with Mellanox IB (FDR) I get the following error: >

Re: [OMPI users] How does binding option affect network traffic?

2014-09-02 Thread Jeff Squyres (jsquyres)
Ah, ok -- I think I missed this part of the thread: each of your individual MPI processes suck up huge gobs of memory. So just to be clear, in general: you don't intend to run more MPI processes than cores per server, *and* you intend to run fewer MPI processes per server than would consume

Re: [OMPI users] Issues with OpenMPI 1.8.2, GCC 4.9.1, and SLURM Interactive Jobs

2014-09-02 Thread Jeff Squyres (jsquyres)
job. > > So when Matt adds --debug-daemons, he then sees the error messages. When he > further adds the oob and plm verbosity, the true error is fully exposed. > > > On Sep 2, 2014, at 2:35 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> > wrote: > >> Matt

Re: [OMPI users] Issues with OpenMPI 1.8.2, GCC 4.9.1, and SLURM Interactive Jobs

2014-09-04 Thread Jeff Squyres (jsquyres)
On Sep 3, 2014, at 9:27 AM, Matt Thompson wrote: > Just saw this, sorry. Our srun is indeed a shell script. It seems to be a > wrapper around the regular srun that runs a --task-prolog. What it > does...that's beyond my ken, but I could ask. My guess is that it probably >

Re: [OMPI users] Issues with OpenMPI 1.8.2, GCC 4.9.1, and SLURM Interactive Jobs

2014-09-04 Thread Jeff Squyres (jsquyres)
how it is affecting Open MPI's argument passage. Matt On Thu, Sep 4, 2014 at 8:04 AM, Jeff Squyres (jsquyres) <jsquy...@cisco.com<mailto:jsquy...@cisco.com>> wrote: On Sep 3, 2014, at 9:27 AM, Matt Thompson <fort...@gmail.com<mailto:fort...@gmail.com>> wrote: &

Re: [OMPI users] Issues with OpenMPI 1.8.2, GCC 4.9.1, and SLURM Interactive Jobs

2014-09-04 Thread Jeff Squyres (jsquyres)
; Still begs the bigger question, though, as others have used script wrappers > before - and I'm not sure we (OMPI) want to be in the business of dictating > the scripting language they can use. :-) > > Jeff and I will argue that one out > > > On Sep 4, 2014, at 7

Re: [OMPI users] How does binding option affect network traffic?

2014-09-05 Thread Jeff Squyres (jsquyres)
do not know how much is too much. Ganglia reports Gigabit Ethernet usage, but we're primarily using IB. -Original Message- From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Jeff Squyres (jsquyres) Sent: Tuesday, September 02, 2014 5:41 PM To: Open MPI User's List Subject: Re:

Re: [OMPI users] [Error running] OpenMPI after the installation of Torque (PBS)

2014-09-10 Thread Jeff Squyres (jsquyres)
Can you send all the information here: http://www.open-mpi.org/community/help/ On Sep 10, 2014, at 5:43 AM, Red Red wrote: > This is the version: mpirun (Open MPI) 1.7.5a1r30774. > > Thank you for your interest. > > > 2014-09-10 10:41 GMT+01:00 Ralph Castain

Re: [OMPI users] Forcing OpenMPI to use Ethernet interconnect instead of InfiniBand

2014-09-10 Thread Jeff Squyres (jsquyres)
Are you inadvertently using the MXM MTL? That's an alternate Mellanox transport that may activate itself, even if you've disabled the openib BTL. Try this: mpirun --mca pml ob1 --mca btl ^openib ... This forces the use of the ob1 PML (which forces the use of the BTLs, not the MTLs), and

Re: [OMPI users] oepnmpi-1.8.2 cann't complete configure

2014-09-16 Thread Jeff Squyres (jsquyres)
On Sep 14, 2014, at 3:15 AM, Ahmed Salama wrote: > I compile openmpi-1.8.2 with following compile: > ./configure --enable-mpi-java --with-jdk-bindir=/usr/jdk6/bin > --with-jdk-headers=/usr/jdk6/include --prefix=/usr/openmpi8 > > but configuration not complete give

Re: [OMPI users] About debugging and asynchronous communication

2014-09-18 Thread Jeff Squyres (jsquyres)
On Sep 18, 2014, at 2:43 AM, XingFENG wrote: > a. How to get more information about errors? I got errors like below. This > says that program exited abnormally in function MPI_Test(). But is there a > way to know more about the error? > > *** An error occurred in
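
The reply is truncated above, so as a general note (not necessarily the advice given in that reply): one portable way to get more detail on a failing call is to switch the communicator's error handler to MPI_ERRORS_RETURN and decode the returned code with MPI_Error_class / MPI_Error_string. A minimal C sketch; the check() helper and the posted-then-cancelled receive are illustrative only.

    #include <mpi.h>
    #include <stdio.h>

    /* Print a human-readable explanation of a non-success return code. */
    static void check(int rc, const char *where)
    {
        if (rc != MPI_SUCCESS) {
            char msg[MPI_MAX_ERROR_STRING];
            int len, eclass;
            MPI_Error_class(rc, &eclass);
            MPI_Error_string(rc, msg, &len);
            fprintf(stderr, "%s failed: class %d: %s\n", where, eclass, msg);
            MPI_Abort(MPI_COMM_WORLD, rc);
        }
    }

    int main(int argc, char **argv)
    {
        MPI_Request req = MPI_REQUEST_NULL;
        int flag, buf = 0;

        MPI_Init(&argc, &argv);
        /* Return error codes instead of aborting inside the library. */
        MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);

        check(MPI_Irecv(&buf, 1, MPI_INT, MPI_ANY_SOURCE, 0,
                        MPI_COMM_WORLD, &req), "MPI_Irecv");
        check(MPI_Test(&req, &flag, MPI_STATUS_IGNORE), "MPI_Test");
        check(MPI_Cancel(&req), "MPI_Cancel");
        check(MPI_Wait(&req, MPI_STATUS_IGNORE), "MPI_Wait");

        MPI_Finalize();
        return 0;
    }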

Re: [OMPI users] How does binding option affect network traffic?

2014-09-18 Thread Jeff Squyres (jsquyres)
On Sep 5, 2014, at 11:49 PM, Ralph Castain wrote: > It would be about the worst thing you can do, to be honest. Reason is that > each socket is typically a separate NUMA region, and so the shared memory > system would be sub-optimized in that configuration. It would be much

Re: [OMPI users] Mpirun 1.5.4 problems when request > 28 slots (updated findings)

2014-09-18 Thread Jeff Squyres (jsquyres)
> >>>>> Here are the definitions of the two parallel environments tested (with >>>>> orte always failing when >>>>> more slots are requested than there are CPU cores on the first node >>>>> allocated to the job by >>>>>

Re: [OMPI users] OpenMPI 1.8.3 build without BTL

2014-09-30 Thread Jeff Squyres (jsquyres)
How can you run MPI jobs at all without any BTLs? That sounds weird -- this is not a case for which we designed the code base. All that being said, you're getting compile errors in the OMPI build because of two things: - you selected to build static - you didn't disable enough stuff

Re: [OMPI users] SENDRECV + MPI_TYPE_CREATE_STRUCT

2014-10-03 Thread Jeff Squyres (jsquyres)
On Oct 3, 2014, at 10:38 AM, Diego Avesani wrote: > Dear all, Dear Jeff, > when I use > use MPI, I get > > /tmp/ifortiW8IBH.i90: catastrophic error: **Internal compiler error: > segmentation violation signal raised** Please report this error along with > the

Re: [OMPI users] SENDRECV + MPI_TYPE_CREATE_STRUCT

2014-10-03 Thread Jeff Squyres (jsquyres)
On Oct 3, 2014, at 10:55 AM, Diego Avesani wrote: > Dear Jeff, > how can I do that? Er... can you be more specific? I mentioned several things in my email. If you're asking about how to re-install OMPI compiled with -r8, please first read Nick's email (essentially

Re: [OMPI users] SENDRECV + MPI_TYPE_CREATE_STRUCT

2014-10-03 Thread Jeff Squyres (jsquyres)
On Oct 3, 2014, at 3:50 PM, George Bosilca wrote: > 1. I’m not a Fortran expert but I think that real is not MPI_DOUBLE_PRECISION > but MPI_FLOAT. It's actually MPI_REAL. :-) (MPI_FLOAT is for the C "float" type) /me goes back in my Fortran hole... -- Jeff Squyres

Re: [OMPI users] Fortran wrapper libraries

2014-10-14 Thread Jeff Squyres (jsquyres)
On Oct 14, 2014, at 12:33 AM, Marc-Andre Hermanns wrote: >> No. Also note that in OMPI 1.7/1.8, we have renamed the Fortran >> wrapper to be mpifort -- mpif77 and mpif90 are sym links to mpifort >> provided simply for backwards compatibility. > > Thanks for the heads

Re: [OMPI users] static for tools dynamic for libs

2014-10-14 Thread Jeff Squyres (jsquyres)
This is not a common use case because the LD_LIBRARY_PATH requirements for the orted and friends are the same as for the MPI executables. So reducing the dependency requirements for the orted doesn't really buy you much. However, you could probably do this manually: - # Build static

Re: [OMPI users] knem in Open MPI 1.8.3

2014-10-16 Thread Jeff Squyres (jsquyres)
Gus -- Can you send the output of configure and your config.log? On Oct 16, 2014, at 4:24 PM, Gus Correa wrote: > On 10/16/2014 05:38 PM, Nathan Hjelm wrote: >> On Thu, Oct 16, 2014 at 05:27:54PM -0400, Gus Correa wrote: >>> Thank you, Aurelien! >>> >>> Aha, "vader

Re: [OMPI users] knem in Open MPI 1.8.3

2014-10-17 Thread Jeff Squyres (jsquyres)
On Oct 16, 2014, at 1:35 PM, Gus Correa wrote: > and on the MCA parameter file: > > btl_sm_use_knem = 1 I think the logic enforcing this MCA param got broken when we revamped the MCA param system. :-( > I am scratching my head to understand why a parameter with such

Re: [OMPI users] [FEniCS] Question about MPI barriers

2014-10-17 Thread Jeff Squyres (jsquyres)
Thanks; I filed https://github.com/open-mpi/ompi/issues/242. On Oct 17, 2014, at 5:59 AM, Jed Brown wrote: > Martin Sandve Alnæs writes: > >> Thanks, but ibarrier doesn't seem to be in the stable version of openmpi: >> http://www.open-mpi.org/doc/v1.8/

Re: [OMPI users] low CPU utilization with OpenMPI

2014-10-23 Thread Jeff Squyres (jsquyres)
If normal users can't write to /tmp (or if /tmp is an NFS-mounted filesystem), that's the underlying problem. @Vinson -- you should probably try to get that fixed. On Oct 23, 2014, at 10:35 AM, Joshua Ladd wrote: > It's not coming from OSHMEM but from the OPAL "shmem"

Re: [OMPI users] low CPU utilization with OpenMPI

2014-10-24 Thread Jeff Squyres (jsquyres)
nson Leung <lwhvinson1...@gmail.com> wrote: > Later I change another machine and set the TMPDIR to default /tmp, but the > problem (low CPU utilization under 20%) still occur :< > > Vincent > > On Thu, Oct 23, 2014 at 10:38 PM, Jeff Squyres (jsquyres) > <jsquy...@cisc

Re: [OMPI users] New ib locked pages behavior?

2014-10-24 Thread Jeff Squyres (jsquyres)
On Oct 22, 2014, at 3:37 AM, r...@q-leap.de wrote: > I've commented in detail on this (non-)issue on 2014-08-20: > > http://www.open-mpi.org/community/lists/users/2014/08/25090.php > > A change in the FAQ and a fix in the code would really be nice > at this stage. Thanks for the reminder; I've

Re: [OMPI users] Problem with Yosemite

2014-10-24 Thread Jeff Squyres (jsquyres)
Ralph -- Can you try a 1.8 nightly tarball build on Y? On Oct 24, 2014, at 12:32 PM, Ralph Castain wrote: > Could well be - I’m using the libtool from Apple > > Apple Inc. version cctools-855 > > Just verified that 1.8 is working fine as well. > Ralph > > >> On Oct

Re: [OMPI users] Bug in OpenMPI-1.8.3: storage limition in shared memory allocation (MPI_WIN_ALLOCATE_SHARED) in Ftn-code

2014-10-24 Thread Jeff Squyres (jsquyres)
Nathan tells me that this may well be related to a fix that was literally just pulled into the v1.8 branch today: https://github.com/open-mpi/ompi-release/pull/56 Would you mind testing any nightly tarball after tonight? (i.e., the v1.8 tarballs generated tonight will be the first ones to

Re: [OMPI users] OpenMPI 1.8.3 configure fails, Mac OS X 10.9.5, Intel Compilers

2014-10-28 Thread Jeff Squyres (jsquyres)
It sounds like your intel compiler installation is broken -- these types of "present but not compilable" kinds of errors usually indicate that the compiler itself has some kind of local conflict that is unrelated to Open MPI (that's why we put those tests in OMPI's configure -- so that we can

Re: [OMPI users] MPI_Init seems to hang, but works after a, minute or two

2014-10-28 Thread Jeff Squyres (jsquyres)
On Oct 27, 2014, at 1:25 PM, maxinator333 wrote: > Deactivating my WLAN did indeed do the trick! > It also seems to not work if a LAN cable is plugged in. No difference if I > am correctly connected (to the internet/gateway) or not (wrong IP, e.g. > static given IP

Re: [OMPI users] MPI_Init seems to hang, but works after a, minute or two

2014-10-28 Thread Jeff Squyres (jsquyres)
On Oct 28, 2014, at 9:02 AM, maxinator333 wrote: > It doesn't seem to work. (switching off wlan still works) > mpicc mpiinit.c -o mpiinit.exe; time mpirun --mca btl sm,self -n 2 > ./mpiinit.exe > > real0m43.733s > user0m0.888s > sys 0m0.824s Ah, this

Re: [OMPI users] Java FAQ Page out of date

2014-10-28 Thread Jeff Squyres (jsquyres)
Thanks Brock; I opened https://github.com/open-mpi/ompi/issues/254 to track the issue. On Oct 27, 2014, at 12:57 AM, Brock Palen wrote: > I think a lot of the information on this page: > > http://www.open-mpi.org/faq/?category=java > > Is out of date with the 1.8 release.

Re: [OMPI users] Allgather in OpenMPI 1.4.3

2014-10-29 Thread Jeff Squyres (jsquyres)
Can you at least upgrade to 1.4.5? That's the last release in the 1.4.x series. Note that you can always install Open MPI as a normal/non-root user (e.g., install it into your $HOME, or some such). On Oct 28, 2014, at 12:08 PM, Sebastian Rettenberger wrote: > Hi, > > I

[OMPI users] Fwd: [Open MPI Announce] Open MPI at SC14

2014-11-03 Thread Jeff Squyres (jsquyres)
Re-sending to the users list, just in case there's some people here on the users list who aren't on the announce list. Begin forwarded message: > From: "Jeff Squyres (jsquyres)" <jsquy...@cisco.com> > Subject: [Open MPI Announce] Open MPI at SC14 > Date: October 2

Re: [OMPI users] What could cause a segfault in OpenMPI?

2014-11-04 Thread Jeff Squyres (jsquyres)
Looks like it's failing in the openib BTL setup. Can you send the info listed here? http://www.open-mpi.org/community/help/ On Nov 4, 2014, at 1:10 PM, Saliya Ekanayake wrote: > Hi, > > I am using OpenMPI 1.8.1 in a Linux cluster that we recently setup. It builds >

Re: [OMPI users] mpirun error

2014-11-04 Thread Jeff Squyres (jsquyres)
On Nov 4, 2014, at 5:56 PM, jfsanchez wrote: > mpirun -np 4 test Try: mpirun -np 4 ./test To specifically get the "test" executable in your directory (vs. /bin/test, which OMPI may have found in your PATH). -- Jeff Squyres jsquy...@cisco.com For corporate

Re: [OMPI users] OPENMPI-1.8.3: missing fortran bindings for MPI_SIZEOF

2014-11-05 Thread Jeff Squyres (jsquyres)
Yes, this is a correct report. In short, the MPI_SIZEOF situation before the upcoming 1.8.4 was a bit of a mess; it actually triggered a bunch of discussion up in the MPI Forum Fortran working group (because the design of MPI_SIZEOF actually has some unintended consequences that came to light

Re: [OMPI users] Bug in OpenMPI-1.8.3: storage limition in shared memory allocation (MPI_WIN_ALLOCATE_SHARED) in Ftn-code

2014-11-05 Thread Jeff Squyres (jsquyres)
ou: >> In openmpi-dev-176-g9334abc.tar.gz the new-introduced bugfix concerning >> the shared memory allocation may be not yet correctly coded , >> or that version contains another new bug in sharedmemory allocation >> compared to the working(!) 1.8.3-release version. >

Re: [OMPI users] OPENMPI-1.8.3: missing fortran bindings for MPI_SIZEOF

2014-11-05 Thread Jeff Squyres (jsquyres)
On Nov 5, 2014, at 9:59 AM, wrote: > In my sharedmemtest.f90 coding just sent to you, > I have added a call of MPI_SIZEOF (at present it is deactivated, because of > the missing Ftn-binding in OPENMPI-1.8.3). FWIW, I attached one of the

Re: [OMPI users] OPENMPI-1.8.3: missing fortran bindings for MPI_SIZEOF

2014-11-05 Thread Jeff Squyres (jsquyres)
Meh. I forgot to attach the test. :-) Here it is. On Nov 5, 2014, at 10:46 AM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote: > On Nov 5, 2014, at 9:59 AM, <michael.rach...@dlr.de> <michael.rach...@dlr.de> > wrote: > >> In my sharedmemtest.f90 cod

Re: [OMPI users] OPENMPI-1.8.3: missing fortran bindings for MPI_SIZEOF

2014-11-05 Thread Jeff Squyres (jsquyres)
On Nov 5, 2014, at 12:23 PM, Dave Love wrote: > Is the issue documented publicly? I'm puzzled, because it certainly > works in a simple case: There were several commits; this was the first one:

Re: [OMPI users] OPENMPI-1.8.3: missing fortran bindings for MPI_SIZEOF

2014-11-06 Thread Jeff Squyres (jsquyres)
On Nov 6, 2014, at 5:37 AM, wrote: > a) When looking in your mpi_sizeof_mpifh.f90 test program I found a little > thing: You may (but need not) change the name of the integer variable size >to e.g. isize , because size is just an

Re: [OMPI users] OPENMPI-1.8.3: missing fortran bindings for MPI_SIZEOF

2014-11-06 Thread Jeff Squyres (jsquyres)
On Nov 6, 2014, at 8:55 AM, wrote: > I agree fully with omitting the explicit interfaces from mpif.h . It is an > important resort for legacy codes. > But, in the mpi and mpi_f08 module explicit interfaces are required for > all(!)

Re: [OMPI users] How OMPI picks ethernet interfaces

2014-11-08 Thread Jeff Squyres (jsquyres)
Ralph is right: OMPI aggressively uses all Ethernet interfaces by default. This short FAQ has links to 2 other FAQs that provide detailed information about reachability: http://www.open-mpi.org/faq/?category=tcp#tcp-multi-network The usNIC BTL uses UDP for its wire transport and actually

Re: [OMPI users] What could cause a segfault in OpenMPI?

2014-11-10 Thread Jeff Squyres (jsquyres)
chance to look at this. > > Thanks, > Saliya > > On Thu, Nov 6, 2014 at 9:19 AM, Saliya Ekanayake <esal...@gmail.com> wrote: > Hi Jeff, > > I've attached a tar file with information. > > Thank you, > Saliya > > On Tue, Nov 4, 2014 at 4:18 PM, Jeff Sq

Re: [OMPI users] How OMPI picks ethernet interfaces

2014-11-10 Thread Jeff Squyres (jsquyres)
e for my own curiosity. >>>> The reason I mention the Resource Manager we use, and that the hostnames >>>> given but PBS/Torque match the 1gig-e interfaces, i'm curious what path it >>>> would take to get to a peer node when the node list given all match t

Re: [OMPI users] MPI_Wtime not working with -mno-sse flag

2014-11-10 Thread Jeff Squyres (jsquyres)
On some platforms, the MPI_Wtime function essentially uses gettimeofday() under the covers. See this stackoverflow question about -mno-sse: http://stackoverflow.com/questions/3687845/error-with-mno-sse-flag-and-gettimeofday-in-c On Nov 10, 2014, at 8:35 AM, maxinator333

Re: [OMPI users] OPENMPI-1.8.3: missing fortran bindings for MPI_SIZEOF

2014-11-11 Thread Jeff Squyres (jsquyres)
On Nov 11, 2014, at 9:38 AM, Dave Love wrote: >> 1. All modern compilers have ignore-TKR syntax, > > Hang on! (An equivalent of) ignore_tkr only appeared in gfortran 4.9 > (the latest release) as far as I know. The system compiler of the bulk > of GNU/Linux HPC systems

Re: [OMPI users] OPENMPI-1.8.3: missing fortran bindings for MPI_SIZEOF

2014-11-11 Thread Jeff Squyres (jsquyres)
On Nov 11, 2014, at 9:43 AM, Dave Love wrote: > I haven't checked the source, but the commit message above says > > If the Fortran compiler supports both INTERFACE and ISO_FORTRAN_ENV, > then we'll build the MPI_SIZEOF interfaces. If not, we'll skip > MPI_SIZEOF in

Re: [OMPI users] mpirun fails across nodes

2014-11-12 Thread Jeff Squyres (jsquyres)
Do you have firewalling enabled on either server? See this FAQ item: http://www.open-mpi.org/faq/?category=running#diagnose-multi-host-problems On Nov 12, 2014, at 4:57 AM, Syed Ahsan Ali wrote: > Dear All > > I need your advice. While trying to run mpirun job

Re: [OMPI users] 1.8.4

2014-11-12 Thread Jeff Squyres (jsquyres)
We have 2 critical issues left that need fixing (a THREAD_MULTIPLE/locking issue and a shmem issue). There's active work progressing on both. I think we'd love to say it would be ready by SC, but I know that a lot of us -- myself included -- are fighting to meet our own SC deadlines. Ralph

Re: [OMPI users] 1.8.4

2014-11-12 Thread Jeff Squyres (jsquyres)
On Nov 12, 2014, at 9:53 AM, Ray Sheppard wrote: > Thanks, and sorry to blast my little note out to the list. I guess your mail > address is now aliased to the mailing list in my mail client. :-) No worries; I'm sure this is a question on other people's minds, too. -- Jeff

Re: [OMPI users] mmaped memory and openib btl.

2014-11-12 Thread Jeff Squyres (jsquyres)
FWIW, munmap is *supposed* to be intercepted. Can you confirm that when your application calls munmap, it doesn't make a call to libopen-pal.so? It should be calling this (1-line) function: - /* intercept munmap, as the user can give back memory that way as well. */ OPAL_DECLSPEC int
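
For context, here is a generic illustration of the interposition technique being discussed -- it is NOT Open MPI's actual source (Open MPI hooks munmap through its own memory manager component rather than dlsym), and the my_release_hook helper is hypothetical. It only shows how a library can learn that the application returned memory to the OS, e.g. to flush a registration cache. Link with -ldl on older glibc.

    #define _GNU_SOURCE
    #include <dlfcn.h>
    #include <sys/mman.h>
    #include <stddef.h>

    /* Hypothetical hook: invalidate any cached registrations in [addr, addr+len). */
    static void my_release_hook(void *addr, size_t len)
    {
        (void)addr; (void)len;
    }

    /* Interpose munmap: notify the library, then call the real munmap. */
    int munmap(void *addr, size_t len)
    {
        static int (*real_munmap)(void *, size_t) = NULL;

        if (real_munmap == NULL) {
            real_munmap = (int (*)(void *, size_t))dlsym(RTLD_NEXT, "munmap");
        }
        my_release_hook(addr, len);
        return real_munmap(addr, len);
    }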

Re: [OMPI users] How OMPI picks ethernet interfaces

2014-11-14 Thread Jeff Squyres (jsquyres)
I lurked on this thread for a while, but I have some thoughts on the many issues that were discussed on this thread (sorry, I'm still pretty under water trying to get ready for SC next week...). These points are in no particular order... 0. Two fundamental points have been missed in this

Re: [OMPI users] How OMPI picks ethernet interfaces

2014-11-14 Thread Jeff Squyres (jsquyres)
On Nov 14, 2014, at 10:52 AM, Reuti wrote: > I appreciate your replies and will read them thoroughly. I think it's best to > continue with the discussion after SC14. I don't want to put any burden on > anyone when time is tight. Cool; many thanks. This is

Re: [OMPI users] error building openmpi-dev-274-g2177f9e with Sun C 5.12

2014-11-14 Thread Jeff Squyres (jsquyres)
Todd K. just reported the same thing: https://github.com/open-mpi/ompi/issues/272 Siegmar: do you have a github ID? If so, we can effectively "CC" you on these kinds of tickets, like we used to do with Trac. On Nov 14, 2014, at 12:04 PM, Siegmar Gross

Re: [OMPI users] error building openmpi-dev-274-g2177f9e with gcc-4.9.2

2014-11-14 Thread Jeff Squyres (jsquyres)
Siegmar -- This issue should now be fixed, too. On Nov 14, 2014, at 12:04 PM, Siegmar Gross wrote: > Hi, > > today I tried to install openmpi-dev-274-g2177f9e on my machines > (Solaris 10 Sparc, Solaris 10 x86_64, and openSUSE Linux 12.1 > x86_64) with

[OMPI users] Open MPI SC'14 BOF slides

2014-11-20 Thread Jeff Squyres (jsquyres)
For those of you who weren't able to be at the SC'14 BOF yesterday -- and even for those of you who were there and wanted to be able to read the slides in a little more detail (and get the links from the slides) -- I have posted them here: http://www.open-mpi.org/papers/sc-2014/ Enjoy!

Re: [OMPI users] Open MPI SC'14 BOF slides: mpif.h --> module mpi

2014-11-21 Thread Jeff Squyres (jsquyres)
dule), you can (and should!!) employ > the implicit none statement inside the mpi-module itself: > > module mpi >implicit none >integer MPI_... > contains >... > end module mpi > > > Greetings > Michael Rachner > > > > -

Re: [OMPI users] Fwd: [EXTERNAL] Re: How to find MPI ranks located in remote nodes?

2014-11-27 Thread Jeff Squyres (jsquyres)
On Nov 26, 2014, at 2:08 PM, Nick Papior Andersen wrote: > Here is my commit-msg: > " > We can now split communicators based on hwloc full capabilities up to BOARD. > I.e.: > HWTHREAD,CORE,L1CACHE,L2CACHE,L3CACHE,SOCKET,NUMA,NODE,BOARD > where NODE is the same as SHARED. >
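
For readers new to this thread: the proposal above generalizes the standard MPI-3 MPI_Comm_split_type call, whose only predefined split type is MPI_COMM_TYPE_SHARED (one sub-communicator per shared-memory node -- the "NODE is the same as SHARED" case). A minimal C sketch of that baseline, with the key value 0 chosen arbitrarily:

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int world_rank, node_rank, node_size;
        MPI_Comm node_comm;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

        /* One sub-communicator per shared-memory node; the extension
           discussed here adds socket/NUMA/cache/etc. granularities. */
        MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
                            MPI_INFO_NULL, &node_comm);

        MPI_Comm_rank(node_comm, &node_rank);
        MPI_Comm_size(node_comm, &node_size);
        printf("world rank %d is local rank %d of %d on its node\n",
               world_rank, node_rank, node_size);

        MPI_Comm_free(&node_comm);
        MPI_Finalize();
        return 0;
    }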

Re: [OMPI users] Fwd: [EXTERNAL] Re: How to find MPI ranks located in remote nodes?

2014-11-27 Thread Jeff Squyres (jsquyres)
nd commit to make them OMPI specific. > > I will post forward my problems on the devel list. > > I will keep you posted. :) > > 2014-11-27 13:58 GMT+01:00 Jeff Squyres (jsquyres) <jsquy...@cisco.com>: > On Nov 26, 2014, at 2:08 PM, Nick Papior Andersen <nickpap...

Re: [OMPI users] Fwd: [EXTERNAL] Re: How to find MPI ranks located in remote nodes?

2014-12-01 Thread Jeff Squyres (jsquyres)
On Nov 28, 2014, at 11:58 AM, George Bosilca wrote: > The same functionality can be trivially achieved at the user level using > Adam's approach. If we provide a shortcut in Open MPI, we should emphasize > this is an MPI extension, and offer the opportunity to other MPI to

Re: [OMPI users] "default-only MCA variable"?

2014-12-01 Thread Jeff Squyres (jsquyres)
On Dec 1, 2014, at 12:47 PM, Ralph Castain wrote: > I’m not entirely familiar with the thinking behind it, but it appears that > some MCA params are intended solely for reporting purposes and are therefore > not really “settable”. The “have_knem_support” is one such example,

Re: [OMPI users] [EXTERNAL] Re: How to find MPI ranks located in remote nodes?

2014-12-02 Thread Jeff Squyres (jsquyres)
On Dec 2, 2014, at 1:10 AM, George Bosilca wrote: >> Are you referring to something Adam Moody proposed? Or some other Adam? > > He did more than proposing, he provided a link to the implementation in SCR. > So yes, I was indeed referring to Adam Moody. Ah -- you're

Re: [OMPI users] mmaped memory and openib btl.

2014-12-02 Thread Jeff Squyres (jsquyres)
ion win, rather than just DSO precedence ? >> >> Best regards, >> >> E. >> >> On Wed, Nov 12, 2014 at 7:51 PM, Emmanuel Thomé >> <emmanuel.th...@gmail.com> wrote: >>> yes I confirm. Thanks for saying that this is the supposed behaviour. >&g

Re: [OMPI users] Noob installation problem

2014-12-02 Thread Jeff Squyres (jsquyres)
This is new -- I haven't seen configure fail for libevent before. :-) Can you send the file opal/mca/event/libevent2021/libevent/config.log? (please compress) On Dec 2, 2014, at 2:20 PM, Wildes Andrew wrote: > Hi Timothy, > > Many thanks for your e-mail, and my

Re: [OMPI users] mmaped memory and openib btl.

2014-12-02 Thread Jeff Squyres (jsquyres)
On Dec 2, 2014, at 3:15 PM, Emmanuel Thomé wrote: > Thanks for pointing me to ummunotify, this sounds much more robust > than the fragile hook-based approach. I'll try this out. It is -- see: https://github.com/open-mpi/ompi/blob/master/README#L665-L682

Re: [OMPI users] Noob installation problem

2014-12-02 Thread Jeff Squyres (jsquyres)
Ah. The problem is that you're trying to build OMPI in a directory that contains a space: /Users/wildes/Desktop/untitled folder/openmpi-1.8.3 Make that "untitled-folder" or "untitled_folder" or even "foo" -- anything without a space. Then you should be good. Most Linux/POSIX-ish systems

Re: [OMPI users] problems with openmpi-dev-428-g983bd49

2014-12-04 Thread Jeff Squyres (jsquyres)
Hi Siegmar -- Sorry for the delay; your c_funloc issue is definitely on my queue. Just haven't had a chance to get to it yet. :-( On Dec 4, 2014, at 11:06 AM, Siegmar Gross wrote: > Hi, > > today I tried to install openmpi-dev-428-g983bd49 on my

Re: [OMPI users] problems with openmpi-dev-428-g983bd49

2014-12-04 Thread Jeff Squyres (jsquyres)
One thing I meant to ask: has this been happening for a long time? Or did you just start building the Fortran bindings? I ask because we haven't (intentionally) changed much in this particular area recently. On Dec 4, 2014, at 11:15 AM, Jeff Squyres (jsquyres) <jsquy...@cisco.com>

Re: [OMPI users] OpenMPI 1.8.4 and hwloc in Fedora 14 using a beta gcc 5.0 compiler.

2014-12-15 Thread Jeff Squyres (jsquyres)
FWIW, if it would be easier, we can just pull a new hwloc tarball -- that's how we've done it in the past (vs. trying to pull individual patches). It's also easier to pull a release tarball, because then we can say "hwloc vX.Y.Z is in OMPI vA.B.C", rather than have to try to examine/explain

Re: [OMPI users] OpenMPI 1.8.4 and hwloc in Fedora 14 using a beta gcc 5.0 compiler.

2014-12-15 Thread Jeff Squyres (jsquyres)
have been clearer - that was indeed what I was expecting to > see. I guess it begs the question - should we just update to something like > 1.9 so Brice doesn't have to worry about back porting future fixes this far > back? > > > > On Mon, Dec 15, 2014 at 7:22 AM,

Re: [OMPI users] disable library interposition?

2014-12-16 Thread Jeff Squyres (jsquyres)
You can disable it a few different ways. The easiest way is probably to set the "OMPI_MCA_memory_linux_disable" environment variable to "yes" before you launch mpirun. On Dec 16, 2014, at 5:15 AM, tom fogal wrote: > I somewhat arbitrarily came across this bug: > >

Re: [OMPI users] ERROR: C_FUNLOC function

2014-12-17 Thread Jeff Squyres (jsquyres)
Siegmar -- This fix was just pushed to the OMPI master. A new master tarball should be available shortly (probably within an hour or so -- look for a tarball dated Dec 17 at http://www.open-mpi.org/nightly/master/). I anticipate that this fix will also make it in for the v1.8.4 release (see

Re: [OMPI users] ERROR: C_FUNLOC function

2014-12-17 Thread Jeff Squyres (jsquyres)
Siegmar -- I filed https://github.com/open-mpi/ompi/issues/317 and https://github.com/open-mpi/ompi/issues/318. On Dec 17, 2014, at 3:33 PM, Siegmar Gross wrote: > Hi Jeff, > >> This fix was just pushed to the OMPI master. A new master tarball >>

Re: [OMPI users] [EXTERNAL] Re: How to find MPI ranks located in remote nodes?

2014-12-17 Thread Jeff Squyres (jsquyres)
Returning to a super-old thread that was never finished... On Dec 2, 2014, at 6:49 PM, George Bosilca wrote: > That's not enough. They will have to check for the right version of Open MPI > and then for the availability of the OMPI_ functions. That looks > as having the

Re: [OMPI users] Operators for MPI handles not correctly overloaded with Intel Fortran

2014-12-17 Thread Jeff Squyres (jsquyres)
Jorg -- I'm sorry for the giant delay in replying; the US holiday and the MPI Forum meeting last week made a disaster out of my already-out-of-control INBOX. :-( Hmm. This almost sounds like a bug in the intel compiler. Do you have the latest version of their compiler, perchance? On Dec

Re: [OMPI users] Deadlock in OpenMPI 1.8.3 and PETSc 3.4.5

2014-12-18 Thread Jeff Squyres (jsquyres)
On Dec 17, 2014, at 10:11 PM, Howard Pritchard wrote: > (hmm.. no MPI_Comm_attr_get man page, > needs to be fixed) FWIW: The function is actually MPI_COMM_GET_ATTR; its man page is MPI_Comm_get_attr.3. -- Jeff Squyres jsquy...@cisco.com For corporate legal information go
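
A small usage sketch of that function, querying the predefined MPI_TAG_UB attribute (the communicator and attribute chosen here are just for illustration); note that predefined attribute values come back as a pointer to an int:

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int *tag_ub, flag;

        MPI_Init(&argc, &argv);

        /* flag reports whether the attribute is set on this communicator. */
        MPI_Comm_get_attr(MPI_COMM_WORLD, MPI_TAG_UB, &tag_ub, &flag);
        if (flag) {
            printf("MPI_TAG_UB = %d\n", *tag_ub);
        }

        MPI_Finalize();
        return 0;
    }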

Re: [OMPI users] Operators for MPI handles not correctly overloaded with Intel Fortran

2014-12-18 Thread Jeff Squyres (jsquyres)
t I could not identify any differencs > that might be responsible for the problem observed with OS X. > > If you have any hint what to try or look for, please let me know. In the > meantime I am fine with the static libs. > > Regards, > Jorg > > > Am 18.12.20

Re: [OMPI users] [EXTERNAL] Re: How to find MPI ranks located in remote nodes?

2014-12-18 Thread Jeff Squyres (jsquyres)
On Dec 17, 2014, at 9:52 PM, George Bosilca wrote: >> I don't understand how MPIX_ is better. >> >> Given that there is *zero* commonality between any MPI extension implemented >> between MPI implementations, how exactly is having the same prefix any less >> confusing? >

Re: [OMPI users] [EXTERNAL] Re: How to find MPI ranks located in remote nodes?

2014-12-19 Thread Jeff Squyres (jsquyres)
On Dec 19, 2014, at 2:48 AM, George Bosilca wrote: > We made little progress over the last couple of [extremely long] emails and > the original topic diverged and got diluted. Lets hold on our discussion here > and let Nick, Keita and the others go ahead and complete their

Re: [OMPI users] Deadlock in OpenMPI 1.8.3 and PETSc 3.4.5

2014-12-19 Thread Jeff Squyres (jsquyres)
George: (I'm not a member of petsc-maint; I have no idea whether my mail will actually go through to that list) TL;DR: I do not think that George's change was correct. PETSc is relying on undefined behavior in the MPI standard and should probably update to use a different scheme. More

Re: [OMPI users] Deadlock in OpenMPI 1.8.3 and PETSc 3.4.5

2014-12-19 Thread Jeff Squyres (jsquyres)
On Dec 19, 2014, at 8:58 AM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote: > More specifically, George's change can lead to inconsistency/incorrectness in > the presence of multiple threads simultaneously executing attribute actions > on a single entity. Actually -- it

Re: [OMPI users] Deadlock in OpenMPI 1.8.3 and PETSc 3.4.5

2014-12-19 Thread Jeff Squyres (jsquyres)
On Dec 19, 2014, at 10:44 AM, George Bosilca wrote: > Regarding your second point, while I do tend to agree that such issue is > better addressed in the MPI Forum, the last attempt to fix this was certainly > not a resounding success. Yeah, fair enough -- but it wasn't a

Re: [OMPI users] Whether to use the IB BTL or not

2015-01-05 Thread Jeff Squyres (jsquyres)
In addition to what Howard said, there's actually two other metrics that are used, as well. Each BTL exports a priority and an exclusivity value. IIRC (it's been a while since I've looked at this code), the PML gathers up all BTL modules that claim that they can communicate between a pair of

Re: [OMPI users] Accessing Process Affinity within MPI Program

2015-01-06 Thread Jeff Squyres (jsquyres)
Sorry for the delay in answering this; this mail came after I disappeared for the US holidays. Yes -- through an Open MPI extension (you must configure Open MPI with --enable-mpi-ext=affinity or --enable-mpi-ext=all). See: http://www.open-mpi.org/doc/v1.8/man3/OMPI_Affinity_str.3.php

Re: [OMPI users] Accessing Process Affinity within MPI Program

2015-01-07 Thread Jeff Squyres (jsquyres)
an 6, 2015 at 4:37 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> > wrote: > Sorry for the delay in answering this; this mail came after I disappeared for > the US holidays. > > Yes -- through an Open MPI extension (you must configure Open MPI with > --enable-mpi-ext=

Re: [OMPI users] send and receive vectors + variable length

2015-01-08 Thread Jeff Squyres (jsquyres)
What do you need the barriers for? On Jan 8, 2015, at 1:44 PM, Diego Avesani wrote: > Dear all, > I found the error. > There is a Ndata2send(iCPU) instead of Ndata2recv(iCPU). > In the attachment there is the correct version of the program. > > Only one thing, could

Re: [OMPI users] send and receive vectors + variable length

2015-01-08 Thread Jeff Squyres (jsquyres)
Also, you are calling WAITALL on all your sends and then WAITALL on all your receives. This is also incorrect and may deadlock. WAITALL on *all* your pending requests (sends and receives -- put them all in a single array). Look at examples 3.8 and 3.9 in the MPI-3.0 document. On Jan 8,
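
A minimal C sketch of that recommendation -- put the send and receive requests in one array and issue a single MPI_Waitall. The function name, counts, peers, and buffers are placeholders for the application's own exchange pattern.

    #include <mpi.h>
    #include <stdlib.h>

    void exchange(double *sendbuf, int *sendcnt, int *sendto,   int nsend,
                  double *recvbuf, int *recvcnt, int *recvfrom, int nrecv,
                  MPI_Comm comm)
    {
        int i, n = 0;
        MPI_Request *req = malloc((nsend + nrecv) * sizeof(MPI_Request));
        double *s = sendbuf, *r = recvbuf;

        /* Post everything first... */
        for (i = 0; i < nrecv; i++) {
            MPI_Irecv(r, recvcnt[i], MPI_DOUBLE, recvfrom[i], 0, comm, &req[n++]);
            r += recvcnt[i];
        }
        for (i = 0; i < nsend; i++) {
            MPI_Isend(s, sendcnt[i], MPI_DOUBLE, sendto[i], 0, comm, &req[n++]);
            s += sendcnt[i];
        }

        /* ...then one wait over sends AND receives together. */
        MPI_Waitall(n, req, MPI_STATUSES_IGNORE);
        free(req);
    }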

Re: [OMPI users] send and receive vectors + variable length

2015-01-09 Thread Jeff Squyres (jsquyres)
communications. In this particular case I see no hard of waiting on the > requests in any random order as long as all of them are posted before the > first wait. > > George. > > > On Thu, Jan 8, 2015 at 5:24 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> >

Re: [OMPI users] send and receive vectors + variable length

2015-01-09 Thread Jeff Squyres (jsquyres)
On Jan 9, 2015, at 12:39 PM, George Bosilca wrote: > I totally agree with Dave here. Moreover, based on the logic exposed by Jeff, > there is no right solution because if one choose to first wait on the receive > requests this also leads to a deadlock as the send requests

Re: [OMPI users] send and receive vectors + variable length

2015-01-09 Thread Jeff Squyres (jsquyres)
On Jan 9, 2015, at 1:54 PM, Diego Avesani wrote: > What does it mean "YMMV"? http://netforbeginners.about.com/od/xyz/f/What-Is-YMMV.htm :-) -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to:

Re: [OMPI users] Problem with connecting to 3 or more nodes

2015-01-16 Thread Jeff Squyres (jsquyres)
It's because Open MPI uses a tree-based ssh startup pattern. (amusingly enough, I'm literally half way through writing up a blog entry about this exact same issue :-) ) That is, not only does Open MPI ssh from your mpirun-server to host1, Open MPI may also ssh from host1 to host2 (or host1 to

Re: [OMPI users] configuring a code with MPI/OPENMPI

2015-02-03 Thread Jeff Squyres (jsquyres)
Without knowing anything about the application that you are trying to build, it's really hard to say. You should probably be asking the support mailing lists for that specific application -- they would better be able to support you. This list is for Open MPI, which is likely one of the MPI
