Re: [OMPI users] This list is suspended while migrating

2016-07-27 Thread Jeff Squyres (jsquyres)
...and we're back. NOTE: the email address for this list has now changed! It is now @lists.open-mpi.org (it used to be @open-mpi.org). PLEASE UPDATE YOUR CONTACTS AND MAIL CLIENT FILTERS! > On Jul 27, 2016, at 12:01 PM, Jeff Squyres (jsquyres) > wrote: > > We are b

[OMPI users] Mailing list migration: status

2016-07-27 Thread Jeff Squyres (jsquyres)
We have transitioned the Open MPI mailing lists to our new best friends at the New Mexico Consortium (http://newmexicoconsortium.org/). Thank you, NMC! For at least a little while, you'll see newmexicoconsortium.org in the footers of our mailing list mails. Eventually, we hope to replace those

Re: [OMPI users] OPENSHMEM ERROR

2016-07-29 Thread Jeff Squyres (jsquyres)
What happens when you run the ring_c test program, do you get the same error as you do with hello_oshmem_c? Can you send all the information listed here: https://www.open-mpi.org/community/help/ > On Jul 29, 2016, at 6:15 AM, Debendra Das wrote: > > I have installed OpenMPI-2.0.0 in 2 s

Re: [OMPI users] OPENSHMEM ERROR

2016-07-29 Thread Jeff Squyres (jsquyres)
> On Jul 29, 2016, at 8:49 AM, Jeff Squyres (jsquyres) > wrote: > > What happens when you run the ring_c test program, do you get the same error > as you do with hello_oshmem_c? I'm guessing ring_c will work, but oshmem_hello will still segv; I was just able to reproduce

Re: [OMPI users] www.open-mpi.org certificate error?

2016-07-30 Thread Jeff Squyres (jsquyres)
Hmm. Sorry about this; we just moved the web site from Indiana University to Host Gator (per http://www.open-mpi.org/community/lists/devel/2016/06/19139.php). I thought I had disabled https for the web site last night when I did the move -- I'll have to check into this. For the meantime, plea

Re: [OMPI users] www.open-mpi.org certificate error?

2016-07-30 Thread Jeff Squyres (jsquyres)
:-\ > On Jul 30, 2016, at 12:27 PM, Bennet Fauber wrote: > > Thanks, Jeff, > > Just to note, though, many, many links in Google searches will have > the https address. > > -- bennet > > > On Sat, Jul 30, 2016 at 12:21 PM, Jeff Squyres (jsquyres) > wrot

Re: [OMPI users] www.open-mpi.org certificate error?

2016-07-30 Thread Jeff Squyres (jsquyres)
On Jul 30, 2016, at 12:39 PM, dpchoudh . wrote: > > Having said that, would it not be possible to redirect an https request to a > http request? I believe apache mod-rewrite can do it. Or does this > certificate check happen even before the rewrite? Yes, the certificate check happens before t

Re: [OMPI users] www.open-mpi.org certificate error?

2016-07-30 Thread Jeff Squyres (jsquyres)
st? I believe apache mod-rewrite can do it. Or does this > certificate check happen even before the rewrite? Regards Durga > > The woods are lovely, dark and deep; but I have promises to keep. And > kilometers to go before I sleep; and kilometers to go before I sleep. On Sat, > Ju

Re: [OMPI users] www.open-mpi.org certificate error?

2016-07-31 Thread Jeff Squyres (jsquyres)
e.g. *.open-mpi.org) > > so if the first condition is met, then you should be able to reuse the > > certificate that was previously used at IU. > > > > makes sense ? > > > > Cheers, > > > > Gilles > > > > On Sunday, July 31, 2016, Jeff Squ

Re: [OMPI users] Ability to overlap communication and computation on Infiniband

2016-08-01 Thread Jeff Squyres (jsquyres)
On Jul 8, 2016, at 4:26 PM, Audet, Martin wrote: > > Hi OMPI_Users and OMPI_Developers, Sorry for the delay in answering, Martin. > I would like someone to verify if my understanding is correct concerning Open > MPI ability to overlap communication and computations on Infiniband when > using

Re: [OMPI users] OPENSHMEM ERROR

2016-08-02 Thread Jeff Squyres (jsquyres)
Debendra -- Can you try the latest v2.0.1 nightly snapshot tarball and see if the problem is resolved for you? https://www.open-mpi.org/nightly/v2.x/ Thanks. > On Jul 29, 2016, at 12:58 PM, Jeff Squyres (jsquyres) > wrote: > >> On Jul 29, 2016, at 8:49 AM, Jeff Sq

Re: [OMPI users] testsome returns negative indices

2014-03-21 Thread Jeff Squyres (jsquyres)
2014-03-21 at 14:11 +, Jeff Squyres (jsquyres) wrote: >> Is that C or R code? > C. >> >> If it's R, I think the next step would be to check the R wrapper for >> MPI_Testsome and see what is actually being returned by OMPI in C before it >> gets converted t

Re: [OMPI users] Fwd: problem for multiple clusters using mpirun

2014-03-21 Thread Jeff Squyres (jsquyres)
~/Task4_mpi/scatterv$ mpiexec -n 2 -host wirth,karp ./a.out > > i receive Error > > [wirth][[59430,1],0][btl_tcp_endpoint.c:655:mca_btl_tcp_endpoint_complete_connect] > connect() to 10.231.2.231 failed: Connection refused (111) > > > NOTE: Karp and wirth are two machines on ss

Re: [OMPI users] testsome returns negative indices [diagnosis]

2014-03-21 Thread Jeff Squyres (jsquyres)
On Mar 21, 2014, at 4:13 PM, Ross Boylan wrote: > There was a problem in the R code that caused MPI_Request objects to be reused > before the original request completed. > Things are working much better now, though some bugs remain (not necessarily > related to MPI_Isend or Testsome). > > Just

Re: [OMPI users] Segmentation Fault

2014-03-21 Thread Jeff Squyres (jsquyres)
On Mar 21, 2014, at 3:26 AM, madhurima madhunapanthula wrote: > I am trying to link the jumpshot libraries with the graph500 (mpi_tuned_2d > sources). > After linking the libraries and executing mpirun with the > graph500_mpi_custome_n binaries I am getting the following segmentation fault. Are y

Re: [OMPI users] coll_ml_priority in openmpi-1.7.5

2014-03-21 Thread Jeff Squyres (jsquyres)
One of the authors of ML mentioned to me off-list that he has an idea what might have been causing the slowdown. They're actively working on tweaking and making things better. I told them to ping you -- the whole point is that ml is supposed to be *better* than our existing collectives, so if

Re: [OMPI users] Fwd: problem for multiple clusters using mpirun

2014-03-24 Thread Jeff Squyres (jsquyres)
t unable to figure it out. > > I need some more kind suggestions. > > regards. > > > On Fri, Mar 21, 2014 at 6:05 PM, Jeff Squyres (jsquyres) > wrote: > Do you have any firewalling enabled on these machines? If so, you'll want to > either disable it, or al

Re: [OMPI users] Fwd: problem for multiple clusters using mpirun

2014-03-24 Thread Jeff Squyres (jsquyres)
_are_fatal unknown handle > [karp:29513] 1 more process has sent help message help-mpi-runtime.txt / ompi > mpi abort:cannot guarantee all killed > > I tried every combination for btl_tcp_if_include or exclude... > > I cant figure out what is wrong. > I can easily talk wit

Re: [OMPI users] usNIC point-to-point messaging module

2014-03-24 Thread Jeff Squyres (jsquyres)
No, this is not a configure issue -- the usnic BTL uses the verbs API. The usnic BTL should be disqualifying itself at runtime, though, if you don't have usNIC devices. Are you running on Cisco UCS servers with Cisco VICs, perchance? If not, could you send the output of "mpirun --mca btl_base_v

Re: [OMPI users] Help building/installing a working Open MPI 1.7.4 on OS X 10.9.2 with Free PGI Fortran

2014-03-24 Thread Jeff Squyres (jsquyres)
On Mar 24, 2014, at 6:34 PM, Matt Thompson wrote: > Sorry for the late reply. The answer is: No, 1.14.1 has not fixed the problem > (and indeed, that's what my Mac is running): > > (28) $ make install |& tee makeinstall.log > Making install in src > ../config/install-sh -c -d '/Users/fortran/

Re: [OMPI users] problem for multiple clusters using mpirun

2014-03-25 Thread Jeff Squyres (jsquyres)
Hamid Saeed wrote: > > Hello Jeff, > > > > Thanks for your cooperation. > > > > --mca btl_tcp_if_include br0 > > > > worked out of the box. > > > > The problem was from the network administrator. The machines on the network > > side we

Re: [OMPI users] coll_ml_priority in openmpi-1.7.5

2014-03-25 Thread Jeff Squyres (jsquyres)
Yes, Nathan has a few coll ml fixes queued up for 1.8. On Mar 24, 2014, at 10:11 PM, tmish...@jcity.maeda.co.jp wrote: > > > I ran our application using the final version of openmpi-1.7.5 again > with coll_ml_priority = 90. > > Then, coll/ml was actually activated and I got these error message

Re: [OMPI users] OpenMPI-ROMIO-OrangeFS

2014-03-25 Thread Jeff Squyres (jsquyres)
Sorry -- we've been focusing on 1.7.5 and the impending 1.8 release; I probably won't be able to look at the v1.6 version in the next 2 weeks or so. On Mar 25, 2014, at 9:09 AM, Edgar Gabriel wrote: > yes, the patch has been submitted to the 1.6 branch for review, not sure > what the precise s

Re: [OMPI users] Help building/installing a working Open MPI 1.7.4 on OS X 10.9.2 with Free PGI Fortran

2014-03-25 Thread Jeff Squyres (jsquyres)
GS='-m64' FCFLAGS='-m64' FFLAGS='-m64' > --prefix=/Users/fortran/AutomakeBug/autobug14 |& tee configure.log > $ make V=1 install |& tee makeV1install.log > > So find attached the config.log, configure.log, and makeV1install.log which > shou

Re: [OMPI users] busy waiting and oversubscriptions

2014-03-26 Thread Jeff Squyres (jsquyres)
On Mar 26, 2014, at 1:31 AM, Andreas Schäfer wrote: >> Even when "idle", MPI processes use all the CPU. I thought I remember >> someone saying that they will be low priority, and so not pose much of >> an obstacle to other uses of the CPU. > > well, if they're blocking in an MPI call, then they

Re: [OMPI users] busy waiting and oversubscriptions

2014-03-26 Thread Jeff Squyres (jsquyres)
On Mar 26, 2014, at 6:45 AM, Andreas Schäfer wrote: >> 1. There is a fundamental difference between disabling >> hyperthreading in the BIOS at power-on time and simply running one >> MPI process per core. Disabling HT at power-on allocates more >> hardware resources to the remaining HT that is l

Re: [OMPI users] Mapping ranks to hosts (from MPI error messages)

2014-03-27 Thread Jeff Squyres (jsquyres)
On Mar 27, 2014, at 4:06 PM, "Sasso, John (GE Power & Water, Non-GE)" wrote: > Yes, I noticed that I could not find --display-map in any of the man pages. > Intentional? Oops; nope. I'll ask Ralph to add it... -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http:

Re: [OMPI users] Mapping ranks to hosts (from MPI error messages)

2014-03-28 Thread Jeff Squyres (jsquyres)
: users [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph Castain > Sent: Thursday, March 27, 2014 7:01 PM > To: Open MPI Users > Subject: Re: [OMPI users] Mapping ranks to hosts (from MPI error messages) > > Oooh...it's Jeff's fault! > > Fwiw you can get eve

Re: [OMPI users] OpenMPI-ROMIO-OrangeFS

2014-03-28 Thread Jeff Squyres (jsquyres)
On Mar 28, 2014, at 12:10 PM, Rob Latham wrote: > I also found a bad memcopy (i was taking the size of a pointer to a thing > instead of the size of the thing itself), but that only matters if ROMIO uses > extended generalized requests. I trust ticket #1159 is still accurate? > > https://svn.

Re: [OMPI users] problem for multiple clusters using mpirun

2014-04-07 Thread Jeff Squyres (jsquyres)
> mpiexec -n 2 --host karp,wirth --mca btl ^openib --mca btl_tcp_if_include br0 > --mca btl_tcp_port_min_v4 1 ./a.out > > Thanks again for the nice and effective suggestions. > > Regards. > > > > On Tue, Mar 25, 2014 at 1:27 PM, Jeff Squyres (jsquyres) > w

Re: [OMPI users] problem for multiple clusters using mpirun

2014-04-07 Thread Jeff Squyres (jsquyres)
a ticket on 1024 too. > for this purpose i wasn't able to communicate with other computers. > > > > > > On Mon, Apr 7, 2014 at 9:52 PM, Jeff Squyres (jsquyres) > wrote: > I was out on vacation / fully disconnected last week, and am just getting to > all t

Re: [OMPI users] Fortran MPI module and gfortran

2014-04-07 Thread Jeff Squyres (jsquyres)
On Mar 30, 2014, at 2:43 PM, W Spector wrote: > The mpi.mod file that is created from both the openmpi-1.7.4 and > openmpi-1.8rc1 tarballs does not seem to be generating interface blocks for > the Fortran API - whether the calls use choice buffers or not. Can you be a bit more specific -- are

Re: [OMPI users] openmpi query

2014-04-07 Thread Jeff Squyres (jsquyres)
Open MPI 1.4.3 is *ancient*. Please upgrade -- we just released Open MPI 1.8 last week. Also, please look at this FAQ entry -- it steps you through a lot of basic troubleshooting steps about getting basic MPI programs working. http://www.open-mpi.org/faq/?category=running#diagnose-multi-host

Re: [OMPI users] Problem building OpenMPI 1.8 on RHEL6

2014-04-07 Thread Jeff Squyres (jsquyres)
Per Dave's comment: note that running autogen.pl (or autogen.sh -- they're sym links to the same thing) is *only* necessary for SVN/hg/git checkouts of Open MPI. You should *not* run autogen.pl in an expanded Open MPI tarball unless you really know what you're doing (e.g., you made a change t

Re: [OMPI users] Problem building OpenMPI 1.8 on RHEL6

2014-04-07 Thread Jeff Squyres (jsquyres)
On Apr 7, 2014, at 6:47 PM, "Blosch, Edwin L" wrote: > Sorry for the confusion. I am not building OpenMPI from the SVN source. I > downloaded the 1.8 tarball and did configure, and that is what failed. I was > surprised that it didn't work on a vanilla Redhat Enterprise Linux 6, out of > th

Re: [OMPI users] openmpi query

2014-04-08 Thread Jeff Squyres (jsquyres)
You should ping the Rocks maintainers and ask them to upgrade. Open MPI 1.4.3 was released in September of 2010. On Apr 8, 2014, at 5:37 AM, Nisha Dhankher -M.Tech(CSE) wrote: > latest rocks 6.2 carry this version only > > > On Tue, Apr 8, 2014 at 3:49 AM, Jeff Squyr

Re: [OMPI users] Contributing Examples for Java Binding

2014-04-08 Thread Jeff Squyres (jsquyres)
If your examples are anything more than trivial code, we'll probably need a signed contribution agreement. This is a bit of a hassle, but it's an unfortunate necessity so that we can ensure that the Open MPI code base stays 100% open source / unencumbered by IP restrictions: http://www.ope

Re: [OMPI users] Simple Question regarding MPI Scatterv

2014-04-08 Thread Jeff Squyres (jsquyres)
In general, benchmarking is very hard. For example, you almost certainly want to do some "warmup" communications of the pattern that you're going to measure. This gets all communications set up, resources allocated, caches warmed up, etc. That is, there's generally some one-time setup that happ
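
The warmup advice in this reply can be sketched as a small timing harness. This is a minimal Python sketch under stated assumptions, not code from the thread: `benchmark`, `warmup`, and `iters` are hypothetical names, and the lambda in the demo is only a stand-in for whatever communication pattern (for example, an MPI_Scatterv call) would really be measured.

```python
import time
from typing import Callable, List


def benchmark(op: Callable[[], None], warmup: int = 5, iters: int = 20) -> List[float]:
    """Time `op`, discarding a few untimed warmup calls first (hypothetical helper)."""
    # Warmup phase: pay one-time costs (connection setup, allocation,
    # cache fills) before any measurement is taken.
    for _ in range(warmup):
        op()

    # Timed phase: record each iteration separately so outliers stay visible.
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        op()
        samples.append(time.perf_counter() - start)
    return samples


if __name__ == "__main__":
    # Stand-in workload; a real test would call the MPI operation here.
    payload = list(range(10_000))
    samples = benchmark(lambda: sum(payload))
    median = sorted(samples)[len(samples) // 2]
    print(f"min {min(samples):.6f}s  median {median:.6f}s")
```

Keeping the per-iteration samples, rather than one total, makes it easier to see whether the first timed calls are still slower than the rest, which would suggest the warmup count is too low.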

Re: [OMPI users] Contributing Examples for Java Binding

2014-04-09 Thread Jeff Squyres (jsquyres)
hank you, > Saliya > > > On Tue, Apr 8, 2014 at 9:06 AM, Jeff Squyres (jsquyres) > wrote: > If your examples are anything more than trivial code, we'll probably need a > signed contribution agreement. This is a bit of a hassle, but it's an > unfortunate necessi

Re: [OMPI users] OpenMPI 1.8.0 + PGI 13.6 = undeclared variable __LDBL_MANT_DIG__

2014-04-11 Thread Jeff Squyres (jsquyres)
On Apr 9, 2014, at 8:47 PM, Filippo Spiga wrote: > I haven't solved this yet but I managed to move the code to be compatible with > PGI 14.3. Open MPI 1.8 compiles perfectly with the latest PGI. > > In parallel I will push this issue to the PGI forum. FWIW: We've seen this kind of issue before

Re: [OMPI users] File locking in ADIO, OpenMPI 1.6.4

2014-04-11 Thread Jeff Squyres (jsquyres)
Sorry for the delay in replying. Can you try upgrading to Open MPI 1.8, which was released last week? We refreshed the version of ROMIO that is included in OMPI 1.8 vs. 1.6. On Apr 8, 2014, at 6:49 PM, Daniel Milroy wrote: > Hello, > > Recently a couple of our users have experienced diffic

Re: [OMPI users] mpirun runs in serial even I set np to several processors

2014-04-14 Thread Jeff Squyres (jsquyres)
This can also happen when you compile your application with one MPI implementation (e.g., Open MPI), but then mistakenly use the "mpirun" (or "mpiexec") from a different MPI implementation (e.g., MPICH). On Apr 14, 2014, at 2:32 PM, Djordje Romanic wrote: > I compiled it with: x86_64 Linux, g

Re: [OMPI users] mpirun runs in serial even I set np to several processors

2014-04-14 Thread Jeff Squyres (jsquyres)
If you didn't use Open MPI, then this is the wrong mailing list for you. :-) (this is the Open MPI users' support mailing list) On Apr 14, 2014, at 2:58 PM, Djordje Romanic wrote: > I didn't use OpenMPI. > > > On Mon, Apr 14, 2014 at 2:37 PM, Jeff Squyres (jsqu

Re: [OMPI users] openmpi-1.7.4/1.8 .0 problem with intel/mpi_sizeof

2014-04-14 Thread Jeff Squyres (jsquyres)
Yes, this is a bug. Doh! Looks like we fixed it for one case, but missed another case. :-( I've filed https://svn.open-mpi.org/trac/ompi/ticket/4519, and will fix this shortly. On Apr 14, 2014, at 4:11 AM, Luis Kornblueh wrote: > Dear all, > > the attached mympi_test.f90 does not com

Re: [OMPI users] mpirun runs in serial even I set np to several processors

2014-04-14 Thread Jeff Squyres (jsquyres)
picc > > I never built WRF here (but other people here use it). > Which input do you provide to the command that generates the configure > script that you sent before? > Maybe the full command line will shed some light on the problem. > > > I hope this helps, > Gus Co

Re: [OMPI users] Cygwin compilation problems for openmpi-1.8

2014-04-15 Thread Jeff Squyres (jsquyres)
On Apr 15, 2014, at 8:35 AM, Marco Atzeri wrote: > on 64bit 1.7.5, > as Symantec Endpoint protections, just decided > that a portion of 32bit MPI is a Trojan... It's the infamous MPI trojan. We take over your computer and use it to help cure cancer. :p -- Jeff Squyres jsquy...@cisco.com Fo

Re: [OMPI users] Where is the error? (MPI program in fortran)

2014-04-17 Thread Jeff Squyres (jsquyres)
Sounds like you're freeing memory that does not belong to you. Or you have some kind of memory corruption somehow. On Apr 17, 2014, at 2:01 PM, Oscar Mojica wrote: > Hello guys > > I used the command > > ulimit -s unlimited > > and got > > stack size (kbytes, -s) unlimited

Re: [OMPI users] Problem regarding the use of openib module and memory registration

2014-04-22 Thread Jeff Squyres (jsquyres)
See this FAQ entry: http://www.open-mpi.org/faq/?category=openfabrics#ib-low-reg-mem On Apr 22, 2014, at 2:38 PM, Amin Hassani wrote: > When I want to use OpenIB module of OpenMPI (through -mca btl > sm,self,openib), I keep getting the message that configuration only allows > registering

Re: [OMPI users] IMB Sendrecv hangs with OpenMPI 1.6.5 and XRC

2014-04-23 Thread Jeff Squyres (jsquyres)
A few suggestions: - Try using Open MPI 1.8.1. It's the newest release, and has many improvements since the 1.6.x series. - Try using "--mca btl openib,sm,self" (in both v1.6.x and v1.8.x). This allows Open MPI to use shared memory to communicate between processes on the same server, which c

Re: [OMPI users] trying to use personal copy of 1.7.4

2014-04-24 Thread Jeff Squyres (jsquyres)
On Mar 13, 2014, at 3:15 PM, Ross Boylan wrote: > The motivation was > http://www.stats.uwo.ca/faculty/yu/Rmpi/changelogs.htm notes > -- > 2007-10-24, version 0.5-5: > > dlopen has been used to load libmpi.so explicitly. This is mainly useful > for Rmpi under Open

Re: [OMPI users] mpi.isend still not working (was trying to use personal copy of 1.7.4--solved)

2014-04-24 Thread Jeff Squyres (jsquyres)
On Apr 23, 2014, at 4:45 PM, Ross Boylan wrote: >> is OK. So, if any nonblocking calls are used, one must use mpi.test or >> mpi.wait to check if they are complete before trying any blocking calls. That is also correct -- it's MPI semantics (communications initiated by MPI_Isend / MPI_Irecv mus

Re: [OMPI users] Connection timed out on TCP

2014-04-28 Thread Jeff Squyres (jsquyres)
In principle, there's nothing wrong with using ib0 interfaces for TCP MPI communication, but it does raise the question of why you're using TCP when you have InfiniBand available...? Aside from that, can you send all the info listed here: http://www.open-mpi.org/community/help/ On Apr 28,

Re: [OMPI users] OpenMPI 1.8 and PGI compilers

2014-04-28 Thread Jeff Squyres (jsquyres)
Brian: Can you report this bug to PGI and see what they say? On Apr 27, 2014, at 2:15 PM, "Hjelm, Nathan T" wrote: > I see nothing invalid about that line. It is setting a struct scif_portID > from another struct scif_portID which is allowed in C99. The error might be > misleading or a compil

Re: [OMPI users] Connection timed out on TCP

2014-04-29 Thread Jeff Squyres (jsquyres)
On Apr 29, 2014, at 4:28 PM, Vince Grimes wrote: > I realize it is no longer in the history of replies for this message, but the > reason I am trying to use tcp instead of Infiniband is because: > > We are using an in-house program called ScalIT that performs operations on > very large sparse

Re: [OMPI users] Error in openmpi-1.8.2a1r31556 with Sun C 5.12

2014-04-30 Thread Jeff Squyres (jsquyres)
Gah. I thought we had this Fortran stuff finally correct. :-\ Let me take this off-list and see if there's something still not quite right, or if the Sun fortran compiler isn't doing something right. On Apr 30, 2014, at 10:40 AM, Siegmar Gross wrote: > Hi, > > I tried to install openmpi-1

Re: [OMPI users] MPI File Open does not work

2014-05-06 Thread Jeff Squyres (jsquyres)
The thread support in the 1.6 series is not very good. You might try: - Upgrading to 1.6.5 - Or better yet, upgrading to 1.8.1 On May 6, 2014, at 7:24 AM, Imran Ali wrote: > I get the following error when I try to run the following python code > > import mpi4py.MPI as MPI > comm = MPI.COMM_

Re: [OMPI users] MPI File Open does not work

2014-05-06 Thread Jeff Squyres (jsquyres)
On May 6, 2014, at 9:32 AM, Imran Ali wrote: > I will attempt that then. I read at > > http://www.open-mpi.org/faq/?category=building#install-overwrite > > that I should completely uninstall my previous version. Yes, that is best. OR: you can install into a whole separate tree and ignore th

Re: [OMPI users] MPI File Open does not work

2014-05-06 Thread Jeff Squyres (jsquyres)
On May 6, 2014, at 9:40 AM, Imran Ali wrote: > My install was in my user directory (i.e $HOME). I managed to locate the > source directory and successfully run make uninstall. FWIW, I usually install Open MPI into its own subdir. E.g., $HOME/installs/openmpi-x.y.z. Then if I don't want that

Re: [OMPI users] users Digest, Vol 2879, Issue 1

2014-05-06 Thread Jeff Squyres (jsquyres)
Are you using TCP as the MPI transport? If so, another thing to try is to limit the IP interfaces that MPI uses for its traffic to see if there's some kind of problem with specific networks. For example: mpirun --mca btl_tcp_if_include eth0 ... If that works, then try adding in any/all othe

Re: [OMPI users] ROMIO bug reading darrays

2014-05-07 Thread Jeff Squyres (jsquyres)
On May 7, 2014, at 4:10 PM, Richard Shaw wrote: > Thanks Rob. I'll keep track of it over there. How often do updated versions > of ROMIO get pulled over from MPICH into OpenMPI? "Periodically". Hopefully, the fix will be small and we can just pull that one fix down to OMPI. > On a slightly re

Re: [OMPI users] Intercommunicators Collective Communciation

2014-05-09 Thread Jeff Squyres (jsquyres)
On May 9, 2014, at 7:56 PM, Spenser Gilliland wrote: > I'm having some trouble understanding Intercommunicators with > Collective Communication. Is there a collective routine to express a > transfer from all left processes to all right processes? or vice versa? The intercomm collectives are all b

Re: [OMPI users] Intercommunicators Collective Communciation

2014-05-10 Thread Jeff Squyres (jsquyres)
On May 9, 2014, at 8:34 PM, Spenser Gilliland wrote: > Thanks for the quick response. I'm having a lot of fun learning MPI and this > mailing list has been invaluable. > > So, if I do a scatter on an inter communicator will this use all left > processes to scatter on all right processes? Yes

Re: [OMPI users] Question about scheduler support

2014-05-14 Thread Jeff Squyres (jsquyres)
Here's a bit of our rationale, from the README file: Note that for many of Open MPI's --with-<foo> options, Open MPI will, by default, search for header files and/or libraries for <foo>. If the relevant files are found, Open MPI will build support for <foo>; if they are not found, Open MPI will s

Re: [OMPI users] Question about scheduler support

2014-05-14 Thread Jeff Squyres (jsquyres)
On May 14, 2014, at 6:09 PM, Ralph Castain wrote: > FWIW: I believe we no longer build the slurm support by default, though I'd > have to check to be sure. The intent is definitely not to do so. The srun-based support builds by default. I like it that way. :-) PMI-based support is a differen

Re: [OMPI users] Question about scheduler support

2014-05-15 Thread Jeff Squyres (jsquyres)
>>> use. So we wind up building a bunch of useless modules. >>>> >>>> >>>> On May 14, 2014, at 3:09 PM, Ralph Castain wrote: >>>> >>>>> FWIW: I believe we no longer build the slurm support by default, though >>>>> I'

Re: [OMPI users] Question about scheduler support

2014-05-15 Thread Jeff Squyres (jsquyres)
These are all good points -- thanks for the feedback. Just to be clear: my point about the menu system was to generate a file that could be used for subsequent installs, very specifically targeted at those who want/need scriptable installations. One possible scenario could be: you download OMPI

Re: [OMPI users] unknown interface on openmpi-1.8.2a1r31742

2014-05-15 Thread Jeff Squyres (jsquyres)
This bug should be fixed in tonight's tarball, BTW. On May 15, 2014, at 9:19 AM, Ralph Castain wrote: > It is an unrelated bug introduced by a different commit - causing mpirun to > segfault upon termination. The fact that you got the hostname to run > indicates that this original fix works,

Re: [OMPI users] Question about scheduler support

2014-05-15 Thread Jeff Squyres (jsquyres)
On May 15, 2014, at 6:14 PM, Fabricio Cannini wrote: > Alright, but now I'm curious as to why you decided against it. > Could you please elaborate on it a bit? OMPI has a long, deep history with the GNU Autotools. It's a very long, complicated story, but the high points are: 1. The GNU Autotools

Re: [OMPI users] Question about scheduler support

2014-05-16 Thread Jeff Squyres (jsquyres)
On May 15, 2014, at 8:00 PM, Fabricio Cannini wrote: >> Nobody is disagreeing that one could find a way to make CMake work - all we >> are saying is that (a) CMake has issues too, just like autotools, and (b) we >> have yet to see a compelling reason to undertake the transition...which >> woul

Re: [OMPI users] openmpi configuration error?

2014-05-17 Thread Jeff Squyres (jsquyres)
Ditto -- Lmod looks pretty cool. Thanks for the heads up. On May 16, 2014, at 6:23 PM, Douglas L Reeder wrote: > Maxime, > > I was unaware of Lmod. Thanks for bringing it to my attention. > > Doug > On May 16, 2014, at 4:07 PM, Maxime Boissonneault > wrote: > >> Instead of using the outda

Re: [OMPI users] How to run Open MPI over TCP (Ethernet)

2014-05-22 Thread Jeff Squyres (jsquyres)
Can you send the output of ifconfig on both compute-0-15.local and compute-0-16.local? On May 22, 2014, at 3:30 AM, Bibrak Qamar wrote: > Hi, > > I am facing a problem running Open MPI using TCP (on 1G Ethernet). In > practice the bandwidth must not exceed 1000 Mbps but for some data points

Re: [OMPI users] False positive from valgrind in sec_basic.c

2014-05-22 Thread Jeff Squyres (jsquyres)
Would a better solution be something like: char default_credential[8] = "12345"; char *bar = strdup(default_credential) ? On May 22, 2014, at 12:52 AM, George Bosilca wrote: > This is more subtle than described here. It's a vectorization problem > and frankly it should appear on all loop-base

Re: [OMPI users] MPI_Finalize() maintains load at 100%.

2014-05-24 Thread Jeff Squyres (jsquyres)
Sorry to jump in late on this thread, but here are my thoughts: 1. Your initial email said "threads", not "processes". I assume you actually meant "processes" (having multiple threads call MPI_FINALIZE is erroneous). 2. Periodically over the years, we have gotten the infrequent request to suppo

Re: [OMPI users] How to run Open MPI over TCP (Ethernet)

2014-05-24 Thread Jeff Squyres (jsquyres)
I am sorry for the delay in replying; this week got a bit crazy on me. I'm guessing that Open MPI is striping across both your eth0 and ib0 interfaces. You can limit which interfaces it uses with the btl_tcp_if_include MCA param. For example: # Just use eth0 mpirun --mca btl tcp,sm,sel

Re: [OMPI users] can't preload binary to remote machine

2014-05-24 Thread Jeff Squyres (jsquyres)
Are you able to upgrade to Open MPI 1.8.x, perchance? On May 20, 2014, at 9:28 AM, "Cordone, Guthrie" wrote: > Hello, > > I have two linux machines, each running Open MPI 1.6.5. I want to use the > preload binary command in an appfile to execute a binary from the host on > both the node and

Re: [OMPI users] configure openmpi 1.8.1 with intel compiler

2014-05-28 Thread Jeff Squyres (jsquyres)
Your configure statement looks fine (note that you don't need the F77=ifort token, but it's harmless -- the FC=ifort token is the important one). Can you send all the information listed here: http://www.open-mpi.org/community/help/ On May 28, 2014, at 2:15 AM, Lorenzo Donà wrote: > Pleas

Re: [OMPI users] MPI installation problem

2014-05-30 Thread Jeff Squyres (jsquyres)
These messages are normal for RPM. Keep in mind that you're installing the source RPM -- not a binary RPM. Most people use the source RPM in an rpmbuild command (to build a binary RPM for their environment), not installing directly. On May 30, 2014, at 7:43 AM, Fernando Cruz wrote: > Dear Al

Re: [OMPI users] ierr vs ierror in F90 mpi module

2014-06-03 Thread Jeff Squyres (jsquyres)
le is still very broken. And once again I am having to >> modify my local version. (+1 for open source!) Will it be fixed in v1.8.2? >> >> Configure is using the "use-mpi-tkr" version on my system. I can see that >> the "use-mpi-f08" version is much bette

Re: [OMPI users] ierr vs ierror in F90 mpi module

2014-06-03 Thread Jeff Squyres (jsquyres)
Ok. I think most were fixed after you reported them last year, but a few new MPI-3 functions were added after that, and they accidentally had "ierr" instead of "ierror". On Jun 3, 2014, at 11:47 AM, W Spector wrote: > Jeff Squyres wrote: > > Did you find any other places where we accidentall

Re: [OMPI users] OpenMPI Compilation Error

2014-06-05 Thread Jeff Squyres (jsquyres)
George and I are together at the MPI Forum this week -- we just looked at this in more detail; it looks like this is a more pervasive problem. Let us look at this a bit more... On Jun 5, 2014, at 10:37 AM, George Bosilca wrote: > Alan, > > I think we forgot to cleanup after a merge and as a

Re: [OMPI users] orted 1.6.4 and 1.8.1 segv with bonded Cisco P81E

2014-06-09 Thread Jeff Squyres (jsquyres)
On Jun 9, 2014, at 5:41 PM, Vineet Rawat wrote: > We've deployed OpenMPI on a small cluster but get a SEGV in orted. Debug > information is very limited as the cluster is at a remote customer site. They > have a network card with which I'm not familiar (Cisco Systems Inc VIC P81E > PCIe Ethern

Re: [OMPI users] orted 1.6.4 and 1.8.1 segv with bonded Cisco P81E

2014-06-09 Thread Jeff Squyres (jsquyres)
On Jun 9, 2014, at 6:36 PM, Vineet Rawat wrote: > No, we only included what seemed necessary (from ldd output and experience on > other clusters). The only things in my /lib/openmpi are > libompi_dbg_msgq*. Is that what you're referring to? In /lib for > 12.8.1 (ignoring the VampirTrace libs)

Re: [OMPI users] orted 1.6.4 and 1.8.1 segv with bonded Cisco P81E

2014-06-09 Thread Jeff Squyres (jsquyres)
On Jun 9, 2014, at 7:00 PM, Vineet Rawat wrote: > We actually do ship the /share and /etc directories. We set > OPAL_PREFIX to a sub-directory of our installation and make sure those things > are in our PATH/LD_LIBRARY_PATH. > > I can try adding the additional shared libs but it doesn't sound

Re: [OMPI users] openib segfaults with Torque

2014-06-09 Thread Jeff Squyres (jsquyres)
I seem to recall that you have an IB-based cluster, right? From a *very quick* glance at the code, it looks like this might be a simple incorrect-finalization issue. That is: - you run the job on a single server - openib disqualifies itself because you're running on a single server - openib t

Re: [OMPI users] intermittent segfaults with openib on ring_c.c

2014-06-09 Thread Jeff Squyres (jsquyres)
I'm digging out from mail backlog from being at the MPI Forum last week... Yes, from looking at the stack traces, it's segv'ing inside the memory allocator, which typically means some other memory error occurred before this. I.e., this particular segv is a symptom of the problem, not the actual

Re: [OMPI users] Bug in OpenMPI-1.8.1: missing routines mpi_win_allocate_shared, mpi_win_shared_query called from Ftn95-code

2014-06-09 Thread Jeff Squyres (jsquyres)
Oops. Looks like we missed these in the Fortran interfaces. I'll file a bug; we'll get this fixed in OMPI 1.8.2. Many thanks for reporting this. On Jun 5, 2014, at 5:41 AM, michael.rach...@dlr.de wrote: > Dear developers of OpenMPI, > > I found that when building an executable from a Fortr

Re: [OMPI users] openib segfaults with Torque

2014-06-10 Thread Jeff Squyres (jsquyres)
Greg: Can you run with "--mca btl_base_verbose 100" on your debug build so that we can get some additional output to see why UDCM is failing to set up properly? On Jun 10, 2014, at 10:25 AM, Nathan Hjelm wrote: > On Tue, Jun 10, 2014 at 12:10:28AM +, Jeff Squyres (jsquyres

Re: [OMPI users] openib segfaults with Torque

2014-06-11 Thread Jeff Squyres (jsquyres)
: >>> Jeff/Nathan, >>> >>> I ran the following with my debug build of OpenMPI 1.8.1 - after opening a >>> terminal on a compute node with "qsub -l nodes 2 -I": >>> >>> mpirun -mca btl openib,self -mca btl_base_verbose 100 -np 2 >>

Re: [OMPI users] openib segfaults with Torque

2014-06-11 Thread Jeff Squyres (jsquyres)
008-February/006916.html >> > >> >Greg, if the suggestion from the Torque users doesn't resolve your >> > issue ( >> >"...adding the following line 'ulimit -l unlimited' to pbs_mom and >> >restarting pbs_mom." ) doesn'

Re: [OMPI users] Question on licensing

2014-06-17 Thread Jeff Squyres (jsquyres)
Open MPI is distributed under the modified BSD license. Here’s a link to the v1.8 LICENSE file: https://svn.open-mpi.org/trac/ompi/browser/branches/v1.8/LICENSE As long as you abide by the terms of that license, you are fine. On Jun 17, 2014, at 4:41 AM, Victor Vysotskiy wrote: > Dear

Re: [OMPI users] Program abortion at a simple MPI_Get Programm

2014-06-17 Thread Jeff Squyres (jsquyres)
I'll let Nathan/others comment on the correctness of your program. What version of Open MPI are you using? Be sure to use the latest to get the most correct one-sided implementation. Also, as one of the prior LAM/MPI developers, I must plead with you to stop using LAM/MPI. We abandoned it man

Re: [OMPI users] affinity issues under cpuset torque 1.8.1

2014-06-24 Thread Jeff Squyres (jsquyres)
Brock -- Can you run with "ompi_info --all"? With "--param all all", ompi_info in v1.8.x is defaulting to only showing level 1 MCA params. It's showing you all possible components and variables, but only level 1. Or you could also use "--level 9" to show all 9 levels. Here's the relevant se

Re: [OMPI users] openmpi linking problem

2014-06-26 Thread Jeff Squyres (jsquyres)
This doesn't sound like a linking problem; this sounds like there's an error in your application that is causing it to abort before completing. On Jun 25, 2014, at 12:19 PM, Sergii Veremieiev wrote: > Dear Sir/Madam, > > I'm trying to run a parallel finite element analysis 64-bit code on my >

Re: [OMPI users] Problem mpi

2014-06-26 Thread Jeff Squyres (jsquyres)
Sounds like you have a problem with the physical layer of your InfiniBand. You should run layer 0 diagnostics and/or contact your IB vendor for assistance. On Jun 24, 2014, at 4:48 AM, Diego Saúl Carrió Carrió wrote: > Dear all, > > I have problems for a long time related with mpirun. When

Re: [OMPI users] poor performance using the openib btl

2014-06-26 Thread Jeff Squyres (jsquyres)
Just curious -- if you run standard ping-pong kinds of MPI benchmarks with the same kind of mpirun command line that you run your application, do you see the expected level of performance? (i.e., verification that you're using the low latency transport, etc.) On Jun 25, 2014, at 9:52 AM, Fisc

Re: [OMPI users] Problem moving from 1.4 to 1.6

2014-06-28 Thread Jeff Squyres (jsquyres)
You might well be able to: mpirun --mca btl ^openib,udapl ... Which excludes both openib and udapl (both of which used the same librdmacm). If this doesn't solve the problem, then please send the info Ralph asked for, and we'll dig deeper... On Jun 27, 2014, at 3:41 PM, Ralph Castain wrote
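
The caret syntax in the command above can be sketched with a small helper. This is only an illustration in Python; the function name `btl_exclude_args` is hypothetical and not part of Open MPI. It builds the `--mca btl ^a,b` arguments, where a single leading `^` excludes every component in the comma-separated list.

```python
def btl_exclude_args(components):
    """Return mpirun arguments that exclude the given BTL components.

    Hypothetical helper for illustration; only the "--mca btl ^a,b"
    syntax itself comes from the message above.
    """
    if not components:
        raise ValueError("need at least one component to exclude")
    # One leading caret negates the whole comma-separated list.
    return ["--mca", "btl", "^" + ",".join(components)]


if __name__ == "__main__":
    # Print the command line from the message, rebuilt from its parts.
    print(" ".join(["mpirun"] + btl_exclude_args(["openib", "udapl"]) + ["..."]))
```

Keeping the component names in a list makes it hard to accidentally write `^openib,^udapl`; the caret belongs only at the front of the whole list.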

Re: [OMPI users] Mpirun 1.5.4 problems when request > 28 slots

2014-07-22 Thread Jeff Squyres (jsquyres)
Can you try upgrading to OMPI 1.6.5? 1.6.5 has *many* bug fixes compared to 1.5.4. A little background... Open MPI is developed in terms of release version pairs: "1.odd" are feature releases. We add new (and remove old) features, etc. We do a lot of testing, but this is all done in lab/tes
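
The odd/even pairing described above can be sketched as a short check. The preview is cut off before it describes the even series, so this sketch assumes that the even minor numbers (such as 1.6) are the stable counterparts of the odd feature series; the function name `release_kind` is hypothetical.

```python
def release_kind(version):
    """Classify an Open MPI 1.x version string as 'feature' or 'stable'.

    Assumption: odd minor numbers are feature series (as stated in the
    message); even minor numbers are treated as the stable series.
    """
    major, minor = (int(part) for part in version.split(".")[:2])
    if major != 1:
        raise ValueError("the odd/even pairing is only described for 1.x")
    return "feature" if minor % 2 == 1 else "stable"


if __name__ == "__main__":
    for v in ("1.5.4", "1.6.5"):
        print(v, release_kind(v))
```

Under that assumption, 1.5.4 falls in a feature series and 1.6.5 in the matching stable series, which is consistent with the advice to upgrade.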

Re: [OMPI users] Mpirun 1.5.4 problems when request > 28 slots

2014-07-22 Thread Jeff Squyres (jsquyres)
Hyperthreading is pretty great for non-HPC applications, which is why Intel makes it. But hyperthreading *generally* does not help HPC application performance. You're basically halving several on-chip resources / queues / pipelines, and that can hurt for performance-hungry HPC applications. T

Re: [OMPI users] bus error with openmpi-1.8.2rc2 on Solaris 10 Sparc

2014-07-25 Thread Jeff Squyres (jsquyres)
Siegmar -- This looks like the typical type of alignment error that we used to see when testing regularly on SPARC. :-\ It looks like the error was happening in mca_db_hash.so. Could you get a stack trace / file+line number where it was failing in mca_db_hash? (i.e., the actual bad code wil

Re: [OMPI users] SIGSEGV for Java program in openmpi-1.8.2rc2 on Solaris 10

2014-07-25 Thread Jeff Squyres (jsquyres)
That's quite odd that it only happens for Java programs -- it should happen for *all* programs, based on the stack trace you've shown. Can you print the value of the lds struct where the error occurs? On Jul 25, 2014, at 2:29 AM, Siegmar Gross wrote: > Hi, > > I have installed openmpi-1.8.2
