Re: [OMPI devel] [OMPI bugs] [Open MPI] #3489: Move r27954 to v1.7 branch

2013-01-28 Thread Ralph Castain
On Jan 28, 2013, at 7:30 PM, George Bosilca wrote: > Ralph, > > What if I say it wasn't a "stale" option nobody cares about. You just > removed one of the critical pieces of the configury, completely > disabling the work of other people. Well, I would say that (a) the code that these options e

Re: [OMPI devel] [OMPI bugs] [Open MPI] #3489: Move r27954 to v1.7 branch

2013-01-28 Thread George Bosilca
Ralph, What if I say it wasn't a "stale" option nobody cares about. You just removed one of the critical pieces of the configury, completely disabling the work of other people. I am absolutely sorry that I didn't make it in the 27 minutes you generously provided for comments. Removing from the tr

Re: [OMPI devel] Open MPI on Cray XC30 - suspicous configury

2013-01-28 Thread Paul Hargrove
Ralph and Nathan, As I said, the results I see fail to match the actual ALPS header locations on both CLE4 and CLE5 systems at NERSC. However, the CLE4 system "just works" because the actual location (/usr/include) gets searched no matter what value configure picks for $orte_check_alps_dir. I sus

Re: [OMPI devel] Open MPI on Cray XC30 - suspicous configury

2013-01-28 Thread Ralph Castain
Like I said, I didn't write this code - all I can say for certain is that it gets the right answer on the LANL Crays. I'll talk to Nathan (the author) about it tomorrow. On Jan 28, 2013, at 6:23 PM, Paul Hargrove wrote: > Ralph writes > ?? It looks correct to me - if with_alps is "yes", then n

Re: [OMPI devel] Open MPI on Cray XC30 - suspicous configury

2013-01-28 Thread Ralph Castain
On Jan 28, 2013, at 6:23 PM, George Bosilca wrote: > What Paul is saying is that there is a path mismatch between the two > cases. Few lines above using_cle5_install is only set to yes if > /usr/lib/alps/libalps.a exist. Then in the snippet pasted in Paul's > email if using_cle5_install is yes t

Re: [OMPI devel] 1.7rc6 build failure: bogus errmgr code

2013-01-28 Thread Ralph Castain
LOL - yeah, I've heard that term :-) I removed the options. Thanks! On Jan 28, 2013, at 6:18 PM, Paul Hargrove wrote: > You might say that I like to "push all the buttons and see which ones go > boom". > See the commit message for r8099 (which I don't imagine Jeff or Brian ever > thought I'd

Re: [OMPI devel] Open MPI on Cray XC30 - suspicous configury

2013-01-28 Thread Paul Hargrove
Ralph writes > ?? It looks correct to me - if with_alps is "yes", then no path was given > and we have to look at a default location. If it isn't yes, then a path was > given and we use it. > Am I missing something? Maybe *I* am the one missing something, but the way I read it the following defa

Re: [OMPI devel] Open MPI on Cray XC30 - suspicous configury

2013-01-28 Thread George Bosilca
What Paul is saying is that there is a path mismatch between the two cases. A few lines above, using_cle5_install is only set to yes if /usr/lib/alps/libalps.a exists. Then, in the snippet pasted in Paul's email, if using_cle5_install is yes, you set orte_check_alps_libdir to something in /opt/cr
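The mismatch described in this thread (detection probes one location, the directories are then set from another) can be illustrated with a minimal sketch. This is not the code from config/orte_check_alps.m4. The helper name `pick_alps_prefix` and its `exists` parameter are hypothetical, and probing the prefix directory itself is an assumption made for illustration. Only the two default prefixes and the "yes means no path was given" rule come from the messages above.

```python
import os

# Default prefixes named in this thread. How the real m4 macro derives its
# include/lib subdirectories is NOT reproduced here.
CLE5_PREFIX = "/opt/cray/alps/default"
CLE4_PREFIX = "/usr"


def pick_alps_prefix(with_alps, exists=os.path.exists):
    """Hypothetical helper: choose one ALPS prefix for headers and libs."""
    if with_alps in (None, "", "no"):
        return None  # ALPS support was not requested
    if with_alps != "yes":
        return with_alps  # an explicit path was given, so use it as-is
    # No path given: probe the same location that will be returned, so the
    # detection step and the chosen directories cannot disagree.
    if exists(CLE5_PREFIX):
        return CLE5_PREFIX
    return CLE4_PREFIX


if __name__ == "__main__":
    # Prints whichever default applies on the machine running the sketch.
    print(pick_alps_prefix("yes"))
```

The point of the design is that a single probe decides both the header and library locations, which is the consistency property the two quoted fragments appear to lack.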

Re: [OMPI devel] 1.7rc6 build failure: bogus errmgr code

2013-01-28 Thread Paul Hargrove
You might say that I like to "push all the buttons and see which ones go boom". See the commit message for r8099 (which I don't imagine Jeff or Brian ever thought I'd read). -Paul On Mon, Jan 28, 2013 at 5:43 PM, Ralph Castain wrote: > Yes, we

Re: [OMPI devel] Open MPI on Cray XC30 - suspicous configury

2013-01-28 Thread Ralph Castain
On Jan 28, 2013, at 6:10 PM, Paul Hargrove wrote: > The following 2 fragment from config/orte_check_alps.m4 appear to be > contradictory. > By that I mean the first appears to mean that "--with-alps" with no argument > means /opt/cray/alps/default/... for CLE5 and /usr/... for CLE4, while the

[OMPI devel] Open MPI on Cray XC30 - suspicous configury

2013-01-28 Thread Paul Hargrove
The following two fragments from config/orte_check_alps.m4 appear to be contradictory. By that I mean the first appears to mean that "--with-alps" with no argument means /opt/cray/alps/default/... for CLE5 and /usr/... for CLE4, while the second fragment appears to be doing the opposite:

Re: [OMPI devel] openib unloaded before last mem dereg

2013-01-28 Thread Steve Wise
On 1/28/2013 7:32 PM, Ralph Castain wrote:
> Out of curiosity, could you tell us how you configured OMPI?
./configure --enable-debug --enable-mpirun-prefix-by-default --prefix=/usr/mpi/gcc/openmpi-1.6.4rc2-dbg
> On Jan 28, 2013, at 12:46 PM, Steve Wise wrote:
>> On 1/28/2013 2:04 PM, Ralph C

Re: [OMPI devel] 1.7rc6 build failure: bogus errmgr code

2013-01-28 Thread Ralph Castain
Yes, we need to make it absolutely clear that c/r is no longer supported - I'll remove that configure option. Thanks Ralph On Jan 28, 2013, at 5:38 PM, Paul Hargrove wrote: > When configured using --with-ft=cr on linux/x86 I see the following build > failure: > > Making all in mca/errmgr > m

[OMPI devel] 1.7rc6 build failure: bogus errmgr code

2013-01-28 Thread Paul Hargrove
When configured using --with-ft=cr on linux/x86 I see the following build failure:
Making all in mca/errmgr
make[2]: Entering directory `/home/pcp1/phargrov/OMPI/openmpi-1.7rc6-linux-x86-blcr/BLD/orte/mca/errmgr'
  CC base/errmgr_base_close.lo
  CC base/errmgr_base_select.lo
  CC

Re: [OMPI devel] openib unloaded before last mem dereg

2013-01-28 Thread Ralph Castain
Out of curiosity, could you tell us how you configured OMPI? On Jan 28, 2013, at 12:46 PM, Steve Wise wrote: > On 1/28/2013 2:04 PM, Ralph Castain wrote: >> On Jan 28, 2013, at 11:55 AM, Steve Wise wrote: >> >>> Do you know if the rdmacm CPC is really being used for your connection >>> setup

Re: [OMPI devel] Looking for a replacement call for repeated call to MPI_IPROBE

2013-01-28 Thread Jeff Squyres (jsquyres)
Is there a reason you're using buffered sends? They're generally pretty evil: http://blogs.cisco.com/performance/top-10-reasons-why-buffered-sends-are-evil/ FWIW, you can probably install Open MPI 1.6.3 yourself -- you can just install it under $HOME, or some other directory that is avail

Re: [OMPI devel] [EXTERNAL] Open MPI Configure Script

2013-01-28 Thread Jeff Squyres (jsquyres)
You're basically telling your build system to use a C++ compiler as the linker when creating libtorque. This probably does more-or-less what I suggested: rpath'ing in whatever dependencies you need such that when we link against libtorque, all of the (C++) dependencies that you need are automat

Re: [OMPI devel] [EXTERNAL] Open MPI Configure Script

2013-01-28 Thread David Beer
On Mon, Jan 28, 2013 at 12:14 PM, Barrett, Brian W wrote: > On 1/28/13 11:54 AM, "David Beer" wrote: > > checking for tm_init in -ltorque... no > configure: error: TM support requested but not found. Aborting > > Oddly enough, if you have already configured with an older version of > TORQUE, you

Re: [OMPI devel] openib unloaded before last mem dereg

2013-01-28 Thread Steve Wise
On 1/28/2013 2:04 PM, Ralph Castain wrote:
> On Jan 28, 2013, at 11:55 AM, Steve Wise wrote:
>> Do you know if the rdmacm CPC is really being used for your connection setup (vs other CPCs supported by IB)? Cuz iwarp only supports rdmacm. Maybe that's the difference?
> Dunno for certain, but I ex

Re: [OMPI devel] Open MPI (not quite) on Cray XC30

2013-01-28 Thread Paul Hargrove
I will be happy to retest on both the XC30 and XE6 at NERSC from a nightly tarball with the fixes. Please give me a heads up when that is available. -Paul On Mon, Jan 28, 2013 at 7:52 AM, Ralph Castain wrote: > The key was to --enable-static --disable-shared. That's the only way to > generate

Re: [OMPI devel] 1.6.4rc2 released

2013-01-28 Thread Paul Hargrove
I am pleased to say that 1.6.4rc2 builds and runs (single node, sm btl) on my BSD menagerie:
  freebsd6-amd64
  freebsd7-amd64
  freebsd8-amd64
  freebsd8-i386
  freebsd9-amd64
  freebsd9-i386
  netbsd6-amd64
  netbsd6-i386
  openbsd5-amd64
  openbsd5-i386
The {Free,Net,Open}BSD platform

Re: [OMPI devel] [EXTERNAL] Open MPI Configure Script

2013-01-28 Thread Jeff Squyres (jsquyres)
I'll +1 what Brian said: we *really* don't want to have to link Open MPI with a C++ compiler. Can't you rpath in whatever support libraries you need (e.g., the g++ libraries with the cxx_personality symbol), such that when we -ltorque, it just pulls in whatever other dependencies it needs? (I'

Re: [OMPI devel] openib unloaded before last mem dereg

2013-01-28 Thread Ralph Castain
On Jan 28, 2013, at 11:55 AM, Steve Wise wrote: > Do you know if the rdmacm CPC is really being used for your connection setup > (vs other CPCs supported by IB)? Cuz iwarp only supports rdmacm. Maybe > that's the difference? Dunno for certain, but I expect it is using the OOB cm since I did

Re: [OMPI devel] openib unloaded before last mem dereg

2013-01-28 Thread Steve Wise
Do you know if the rdmacm CPC is really being used for your connection setup (vs other CPCs supported by IB)? Cuz iwarp only supports rdmacm. Maybe that's the difference? Steve. On 1/28/2013 1:47 PM, Ralph Castain wrote: Nope - still works just fine. I didn't receive that warning at all, an

Re: [OMPI devel] openib unloaded before last mem dereg

2013-01-28 Thread Ralph Castain
Nope - still works just fine. I didn't receive that warning at all, and it ran to completion without problem. I suspect the problem is that the system I can use just isn't configured like yours, and so I can't trigger the problem. Afraid I can't be of help after all... :-( On Jan 28, 2013, at

Re: [OMPI devel] openib unloaded before last mem dereg

2013-01-28 Thread Steve Wise
On 1/28/2013 12:48 PM, Ralph Castain wrote: Hmmm...afraid I cannot replicate this using the current state of the 1.6 branch (which is the 1.6.4rcN) on the only IB-based cluster I can access. Can you try it with a 1.6.4 tarball and see if you still see the problem? Could be someone already fixe

Re: [OMPI devel] [EXTERNAL] Open MPI Configure Script

2013-01-28 Thread Barrett, Brian W
On 1/28/13 11:54 AM, "David Beer" wrote: > checking for tm_init in -ltorque... no > configure: error: TM support requested but not found. Aborting > > Oddly enough, if you have already configured with an older version of TORQUE, > you can build open-mpi with TORQUE 4.2 installed, so it can find

Re: [OMPI devel] [EXTERNAL] Open MPI Configure Script

2013-01-28 Thread Ralph Castain
I don't see anything in the config script that checks for gcc - you might take a look at it to check. It's in config/orte_check_tm.m4 on our developer's trunk On Jan 28, 2013, at 10:54 AM, David Beer wrote: > > On Mon, Jan 28, 2013 at 10:54 AM, Barrett, Brian W wrote: > > We assume that we c

Re: [OMPI devel] [EXTERNAL] Open MPI Configure Script

2013-01-28 Thread David Beer
On Mon, Jan 28, 2013 at 10:54 AM, Barrett, Brian W wrote: > > > We assume that we can link lib torque into a C application (if this is a > problem for you, it's a huge deal breaker for us, since OMPI is a C > library). What does config.log say when checking for tm_init? > > Brian > > Brian, libto

Re: [OMPI devel] openib unloaded before last mem dereg

2013-01-28 Thread Ralph Castain
Hmmm...afraid I cannot replicate this using the current state of the 1.6 branch (which is the 1.6.4rcN) on the only IB-based cluster I can access. Can you try it with a 1.6.4 tarball and see if you still see the problem? Could be someone already fixed it. On Jan 28, 2013, at 10:03 AM, Steve Wi

Re: [OMPI devel] openib unloaded before last mem dereg

2013-01-28 Thread Steve Wise
On 1/28/2013 11:48 AM, Ralph Castain wrote: On Jan 28, 2013, at 9:12 AM, Steve Wise wrote: On 1/25/2013 12:19 PM, Steve Wise wrote: Hello, I'm tracking an issue I see in openmpi-1.6.3. Running this command on my chelsio iwarp/rdma setup causes a seg fault every time: /usr/mpi/gcc/openmpi-

Re: [OMPI devel] [EXTERNAL] Open MPI Configure Script

2013-01-28 Thread Barrett, Brian W
On 1/28/13 10:50 AM, "David Beer" wrote: > By way of introduction, I'm a TORQUE developer and I probably should've joined > this list - even if only to keep myself informed - years ago. > > At any rate, we're in the process of changing TORQUE so that it compiles using > g++ instead of gcc. We're

[OMPI devel] Open MPI Configure Script

2013-01-28 Thread David Beer
All, By way of introduction, I'm a TORQUE developer and I probably should've joined this list - even if only to keep myself informed - years ago. At any rate, we're in the process of changing TORQUE so that it compiles using g++ instead of gcc. We're starting to use some C++ constructs to make ou

Re: [OMPI devel] openib unloaded before last mem dereg

2013-01-28 Thread Ralph Castain
On Jan 28, 2013, at 9:12 AM, Steve Wise wrote: > On 1/25/2013 12:19 PM, Steve Wise wrote: >> Hello, >> >> I'm tracking an issue I see in openmpi-1.6.3. Running this command on my >> chelsio iwarp/rdma setup causes a seg fault every time: >> >> /usr/mpi/gcc/openmpi-1.6.3-dbg/bin/mpirun --np 2

Re: [OMPI devel] Looking for a replacement call for repeated call to MPI_IPROBE

2013-01-28 Thread Jeremy McCaslin
Thank you for the feedback. I actually just changed the repeated probing for a message to a blocking MPI_RECV, as the processor waiting to receive does nothing but repeatedly probe until the message is there anyway. This also works, and it makes more sense to do it this way. However, this did no

Re: [OMPI devel] openib unloaded before last mem dereg

2013-01-28 Thread Steve Wise
On 1/25/2013 12:19 PM, Steve Wise wrote: Hello, I'm tracking an issue I see in openmpi-1.6.3. Running this command on my chelsio iwarp/rdma setup causes a seg fault every time: /usr/mpi/gcc/openmpi-1.6.3-dbg/bin/mpirun --np 2 --host hpc-hn1,hpc-cn2 --mca btl openib,sm,self --mca btl_openib

Re: [OMPI devel] Open MPI (not quite) on Cray XC30

2013-01-28 Thread Ralph Castain
The key was to --enable-static --disable-shared. That's the only way to generate the problem. Brian was already aware of it and fixed it this weekend. I tested the fix and it works fine. Waiting for Jeff to review it before committing to the trunk. On Jan 28, 2013, at 7:45 AM, Nathan Hjelm wr

Re: [OMPI devel] Open MPI (not quite) on Cray XC30

2013-01-28 Thread Nathan Hjelm
Try building static. Lots of errors due to missing libraries in libs_static. -Nathan On Fri, Jan 25, 2013 at 04:09:16PM -0800, Ralph Castain wrote: > FWIW: I can build it fine without setting any of the CC... flags on LANL's > Cray XE6, and mpicc worked just fine for me once built that way. > >

Re: [OMPI devel] New ARM patch

2013-01-28 Thread Leif Lindholm
On 26/01/13 00:05, Jeff Squyres (jsquyres) wrote:
> Here's what I have done:
> 1. Committed your patch to v1.6. George's patch was not committed to v1.6.
Many thanks.
> 2. I opened https://svn.open-mpi.org/trac/ompi/ticket/3481 to track your proposal of re-implementing/revamping the ARM ASM code.