[OMPI devel] confusing output when no c++ compiler

2015-02-02 Thread Paul Hargrove
The output below occurred testing Jeff's no-embedded-libltdl tarball, but I am assuming it is quite likely the same is true on the trunk. The "issue" is that I am told by configure that "C and C++ compilers are not link compatible". However, it appears I just don't have a C++ compiler at all!! I am

Re: [OMPI devel] RFC: Remove embedded libltdl

2015-01-30 Thread Paul Hargrove
> On Jan 30, 2015, at 5:14 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> > wrote: > > > > Shame on me for not running "make check". > > > > Fixing... > > > > > >> On Jan 30, 2015, at 4:58 PM, Paul Hargrove <phhargr.

Re: [OMPI devel] RFC: Remove embedded libltdl

2015-01-30 Thread Paul Hargrove
n). Now featuring 100% fewer "make check" > failures. > > http://www.open-mpi.org/~jsquyres/unofficial/ > > > > On Jan 30, 2015, at 5:14 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> > wrote: > > > > Shame on me for not running "

Re: [OMPI devel] RFC: Remove embedded libltdl

2015-01-30 Thread Paul Hargrove
2015 at 1:29 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com > wrote: > On Jan 30, 2015, at 2:46 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: > > > > If I had new enough autotools to autogen on this old system then I > wouldn't have asked about libltdl from libtool-1.4. S

Re: [OMPI devel] RFC: Remove embedded libltdl

2015-01-29 Thread Paul Hargrove
Jeff, If I understand one is (or will be soon) expected to have libtool-dev(el) installed on the build system, even if one is not an OMPI developer. How does this plan to cease embedding libltdl align with the fact that autogen.pl currently applies patches to the parts of the generated configure

Re: [OMPI devel] [Open MPI Announce] Open MPI 1.8.4 released

2014-12-20 Thread Paul Hargrove
Sorry to rain on the parade, but SGI UV is still broken by default. I reported this as present in 1.8.4rc5 and Nathan had claimed to be working on it. A reminder that all it takes is a 1-line change in ompi/mca/btl/vader/configure.m4 to not search for sn/xpmem.h -Paul On Fri, Dec 19, 2014 at

Re: [OMPI devel] [1.8.4rc5] preliminary results

2014-12-19 Thread Paul Hargrove
gets my "thumbs up" with respect to "Fortran Sadness". -Paul On Fri, Dec 19, 2014 at 12:51 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: > Jeff, > > Less typing to launch 50+ testers than pick out just those two. > Starting them now... > > -Paul >

Re: [OMPI devel] [1.8.4rc5] preliminary results

2014-12-19 Thread Paul Hargrove
> tarball for you. > > http://www.open-mpi.org/nightly/v1.8/ > > Could you test it in the 2 cases where you had fortran failures? > > > > On Dec 18, 2014, at 8:50 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: > > > Update: > > > > I now have 59

Re: [OMPI devel] [1.8.4rc5] preliminary results

2014-12-19 Thread Paul Hargrove
On Thu, Dec 18, 2014 at 5:50 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: > > Unless something turns up on the MIPS systems my "smoke test" of rc5 is > complete. In case anybody was holding their breath: The MIPS testers completed just fine. -Pa

Re: [OMPI devel] [1.8.4rc5] preliminary results

2014-12-19 Thread Paul Hargrove
rors in module > "MPI_F08_SIZEOF". No module information file will be created for this > module. > > if (present(ierror)) ierror = 0 > ^ > "../../../../../../src/openmpi-1.8.4rc5/ompi/mpi/fortran/mpif-h/sizeof-mpif08-pre-1.8.4_f.F90", > Line = 45, Colum

Re: [OMPI devel] [1.8.4rc5] preliminary results

2014-12-18 Thread Paul Hargrove
"deficient" fortran support. If there is a desire/need to follow up on this, let me know. However, all those "deficient" fortran compilers have been reported by me on this list at least once in testing prior releases (just never in one place). -Paul On Thu, Dec 18, 2014 at 8:

Re: [OMPI devel] [1.8.4rc5] preliminary results

2014-12-18 Thread Paul Hargrove
On Thu, Dec 18, 2014 at 8:55 AM, Paul Hargrove <phhargr...@lbl.gov> wrote: > > I also have unexplained errors on my Solaris-10/SPARC system. > It looks like there may have been a loss of network connectivity during > the tests. > I need to check these deeper, but I expect th

Re: [OMPI devel] [1.8.4rc5] preliminary results

2014-12-18 Thread Paul Hargrove
-atomics/openmpi-1.8.4rc5/ompi/mpi/fortran/mpif-h/sizeof-mpif08-pre-1.8.4_f.F90:104 [...about 180 more lines of similar output...] On Thu, Dec 18, 2014 at 9:30 AM, Jeff Squyres (jsquyres) <jsquy...@cisco.com > wrote: > > On Dec 18, 2014, at 11:55 AM, Paul Hargrove <phhargr...@lbl.gov>

[OMPI devel] [1.8.4rc5] preliminary results

2014-12-18 Thread Paul Hargrove
With results from about 50 out of 61 platforms: + KNOWN: SGI UV is still "broken-by-default" (fails compiling vader unless configured with --without-xpmem) + NEW: I see Fortran bindings failing to compile w/ gfortran + NEW: I see Fortran bindings fail to link with Open64 I also have unexplained

Re: [OMPI devel] 1.8.4rc Status

2014-12-18 Thread Paul Hargrove
On Wed, Dec 17, 2014 at 7:17 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: > > I am going to run the nightly on other configs on both my > Solaris-11/x86-64 and Solaris-10/SPARC systems. > I just want to be sure some other compile/abi/arch combination didn't get > broken by ac

Re: [OMPI devel] 1.8.4rc5 out

2014-12-18 Thread Paul Hargrove
Tests queued on 61 distinct configurations... will share results when I've got them. -Paul On Wed, Dec 17, 2014 at 9:15 PM, Ralph Castain wrote: > > Hi folks > > Trying to bring this to closure, so hopefully this is the last one. Please > give it a smoke test: > >

Re: [OMPI devel] Solaris/x86-64 SEGV with 1.8-latest

2014-12-17 Thread Paul Hargrove
> *Sent:* Wednesday, December 17, 2014 3:53 PM > *To:* de...@open-mpi.org > *Subject:* Re: [OMPI devel] Solaris/x86-64 SEGV with 1.8-latest > > > > Le 17/12/2014 21:43, Paul Hargrove a écrit : > > > > Dbx gives me > > t@1 (l@1) terminated

[OMPI devel] Solaris/x86-64 SEGV with 1.8-latest

2014-12-17 Thread Paul Hargrove
I tried last night's v1.8 tarball (openmpi-v1.8.3-272-g4e4f997.tar.bz2) with the Studio Compilers (v12.3) on a Solaris/x86-64 system. Configure args (other than prefix) were: --enable-debug --with-verbs \ CC=cc CXX=CC FC=f90 \ CFLAGS=-m64 --with-wrapper-cflags=-m64 \ FCFLAGS=-m64

Re: [OMPI devel] OMPI devel] 1.8.4rc Status

2014-12-17 Thread Paul Hargrove
Results of tests described below: 1) SEGV in hwloc - will report later 2) PASS 3) PASS So, either -D_REENTRANT or -mt is working for me IF added to both the CFLAGS and wrapper-cflags. -Paul On Tue, Dec 16, 2014 at 10:56 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: > > I've queued 3

Re: [OMPI devel] OMPI devel] 1.8.4rc Status

2014-12-17 Thread Paul Hargrove
commit that > sets the -D_REENTRANT CFLAGS on solaris/solarisstudio > > https://github.com/open-mpi/ompi-release/commit/ac8b84ce674b958dbf8c9481b300beeef0548b83 > > Cheers, > > Gilles > > > On 2014/12/17 15:56, Paul Hargrove wrote: > > I've queued 3 tests: > > 1)

Re: [OMPI devel] OMPI devel] 1.8.4rc Status

2014-12-17 Thread Paul Hargrove
> it is worth giving it a try (to be 100.0% sure ...) > > can you please do that tomorrow ? > > in the meantime, if we (well Ralph indeed) want to release 1.8.4, then > simply restore > the two config files i mentioned. > > Cheers, > > Gilles > > > On 2014/12/

Re: [OMPI devel] OMPI devel] 1.8.4rc Status

2014-12-17 Thread Paul Hargrove
produce the problem on solaris 11 with sunstudio > 12.4 even if i do not use -D_REENTRANT *nor* -mt (!) > > Cheers, > > Gilles > > > On 2014/12/17 15:01, Ralph Castain wrote: > > Hi Paul > > Can you try the attached patch? It would require running autogen, I fear. > Other

Re: [OMPI devel] OMPI devel] 1.8.4rc Status

2014-12-17 Thread Paul Hargrove
fear. > Otherwise, I can add it to the tarball. > > Ralph > > > On Tue, Dec 16, 2014 at 9:59 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: > >> Gilles, >> >> The 1.8.3 test works where the 1.8.4rc4 one fails with identical >> configure arguments.

Re: [OMPI devel] OMPI devel] 1.8.4rc Status

2014-12-17 Thread Paul Hargrove
The resulting run worked! So, I very strongly suspect that the problem will be resolved if one restores the configure logic that my previous email shows has vanished (since that would restore "-mt" to CFLAGS and wrapper cflags). -Paul On Tue, Dec 16, 2014 at 8:10 PM, Paul Hargrove

Re: [OMPI devel] OMPI devel] 1.8.4rc Status

2014-12-16 Thread Paul Hargrove
My 1.8.3 build has not completed. HOWEVER, I can already see a key difference in the configure step. In 1.8.3 "-mt" was added AUTOMATICALLY to CFLAGS by configure: checking if C compiler and POSIX threads work as is... no - Solaris, not checked checking if C++ compiler and POSIX threads work as

Re: [OMPI devel] OMPI devel] 1.8.4rc Status

2014-12-16 Thread Paul Hargrove
? (or is -D_REENTRANT enough ?) > LDFLAGS ? (that might be solaris and/or solarisstudio (12.4) specific and > i simply ignore it) > > Bottom line, i do invite you to test 1.8.4rc4 again and with > CFLAGS="-mt" > or > CFLAGS="-mt -m64" > if you previ

Re: [OMPI devel] OMPI devel] 1.8.4rc Status

2014-12-16 Thread Paul Hargrove
s pcp-j-20 > says ? > > BTW, did you try without -m64 ? > > Does the following work > ping/ssh 172.18.0.120 > > Honestly, this output makes very little sense to me, so i am asking way > too much info hoping i can reproduce this issue or get a hint on what can > possibly go

Re: [OMPI devel] 1.8.4rc Status

2014-12-16 Thread Paul Hargrove
sabled and try again. Use of "-mca oob_tcp_if_include bge0" to use a single interface did not fix this. -Paul On Mon, Dec 15, 2014 at 7:18 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: > > Gilles, > > I am NOT seeing the problem with gcc.

Re: [OMPI devel] 1.8.4rc Status

2014-12-15 Thread Paul Hargrove
(at least with gcc compilers) do not > need any flags > (except the -D_REENTRANT that is added automatically) > > Cheers, > > Gilles > > > On 2014/12/16 12:10, Paul Hargrove wrote: > > Gilles, > > I will try the patch when I can. > However, our netwo

Re: [OMPI devel] 1.8.4rc Status

2014-12-15 Thread Paul Hargrove
the CLOSE_THE_SOCKET macro resets errno, and hence the confusing > error message > e.g. failed: Error 0 (0) > > FWIW, master is also affected. > > Cheers, > > Gilles > > > On 2014/12/16 10:47, Paul Hargrove wrote: > > I have tried with an oob_tcp_if_include setting so
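
The errno problem described in the snippet above (a cleanup step overwriting the error code before it is reported, giving "Error 0 (0)") can be sketched roughly as follows. This is only an illustration under stated assumptions, not the Open MPI code: `close_and_keep_errno` is a hypothetical helper name, and plain close() stands in for whatever CLOSE_THE_SOCKET actually does.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper (not an Open MPI function): close a descriptor
 * while preserving the errno value left by an earlier failed call, so
 * that a later error message still reports the original failure. */
static int close_and_keep_errno(int fd)
{
    int saved = errno;   /* the error we still intend to report */
    int rc = close(fd);  /* may overwrite errno, e.g. with EBADF */

    errno = saved;       /* restore the original error code */
    return rc;
}
```

A caller would invoke the helper between the failing call and the point where strerror(errno) is printed; the alternative design is to copy errno into a local variable immediately after the failing call and report from that copy.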

Re: [OMPI devel] 1.8.4rc Status

2014-12-15 Thread Paul Hargrove
nd. -Paul On Mon, Dec 15, 2014 at 12:52 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: > > > On Mon, Dec 15, 2014 at 5:35 AM, Ralph Castain <r...@open-mpi.org> wrote: >> >> 7. Linkage issue on Solaris-11 reported by Paul Hargrove. Missing the >> multi-

Re: [OMPI devel] 1.8.4rc Status

2014-12-15 Thread Paul Hargrove
On Mon, Dec 15, 2014 at 5:35 AM, Ralph Castain <r...@open-mpi.org> wrote: > > 7. Linkage issue on Solaris-11 reported by Paul Hargrove. Missing the > multi-threaded C libraries, apparently need "-mt=yes" in both compile and > link. Need someone to investigate. The la

Re: [OMPI devel] 1.8.4rc4 now out for testing

2014-12-15 Thread Paul Hargrove
On Sun, Dec 14, 2014 at 10:52 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: > > Solaris-10/SPARC and "--enable-static --disable-shared" appears broken for > C++ apps (but OK for C). > I will report in more detail when I have more information. > Firs

Re: [OMPI devel] 1.8.4rc4 now out for testing

2014-12-15 Thread Paul Hargrove
My testing on 1.8.4rc4 is not quite done, but is getting close. With two exceptions, so far all looks good to me on almost 60 different platforms. I've retested on my Solaris systems and saw none of the issues I had with rc3. The x86-64/Linux system with mtl:psm is no longer giving a SEGV at

[OMPI devel] [1.8.4rc3+patches] Solaris status summary

2014-12-12 Thread Paul Hargrove
It appears that with Ralph's oob_tcp patches (paul.diff) everything is now OK on Solaris-11/x86-64. On Solaris-10/SPARC I needed to fix guess_strlen() (or change "%u" to "%d" to avoid the issue) or else I didn't get very far at all (SEGV in orterun). However, with that issue resolved things are

Re: [OMPI devel] [1.8.4rc3] REGRESSION: connection problem on (multi-homed) Solaris host

2014-12-12 Thread Paul Hargrove
n Fri, Dec 12, 2014 at 5:17 PM, Ralph Castain <r...@open-mpi.org> wrote: > No need for autogen - simple change to a couple of files > > > > On Dec 12, 2014, at 4:38 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: > > Ralph, > > Patches to *code* are fine, but I am n

Re: [OMPI devel] OMPI devel] [1.8.4rc2] orterun SEGVs on Solaris-10/SPARC

2014-12-12 Thread Paul Hargrove
l solve the problem. We can then address > the broader question (e.g., do we even need this stuff any more at all?) in > a more leisurely way. > > > On Dec 12, 2014, at 5:42 PM, Larry Baker <ba...@usgs.gov> wrote: > > On 12 Dec 2014, at 5:22 PM, Paul Hargrove wrote:

Re: [OMPI devel] [1.8.4rc3] REGRESSION: connection problem on (multi-homed) Solaris host

2014-12-12 Thread Paul Hargrove
, 2014, at 4:34 PM, Ralph Castain <r...@open-mpi.org> wrote: > > I'm hoping it will fix it. The timeout code was the only change from 1.8.3 > besides the loopback warning, so it should restore the prior behavior. > > > On Dec 12, 2014, at 4:32 PM, Paul Hargrove <phhargr...@lb

Re: [OMPI devel] [1.8.4rc3] REGRESSION: connection problem on (multi-homed) Solaris host

2014-12-12 Thread Paul Hargrove
On Fri, Dec 12, 2014 at 4:29 PM, Ralph Castain wrote: > All right - I'll surrender and remove the timeout. Will release rc4 later > tonight. > > Sorry for putting you thru this Paul - for some reason, these problems > aren't showing up elsewhere. > Even at a 300s timeout I

Re: [OMPI devel] [1.8.4rc3] REGRESSION: connection problem on (multi-homed) Solaris host

2014-12-12 Thread Paul Hargrove
"-mca oob_tcp_connect_timeout 300", and have also attached the resulting stderr. No joy for either timeout value. -Paul > > On Dec 12, 2014, at 8:53 AM, Paul Hargrove <phhargr...@lbl.gov> wrote: > > > > First, I want to ask what became of the issue discussed in this

Re: [OMPI devel] OMPI devel] [1.8.4rc2] orterun SEGVs on Solaris-10/SPARC

2014-12-12 Thread Paul Hargrove
NOTE: The existing code for "%l." in guess_strlen() is garbage. The va_arg() macro calls all have "int" for the type!! I am *only* testing a fix for the missing "%u" at the moment. -Paul On Fri, Dec 12, 2014 at 3:14 PM, Paul Hargrove <phhargr...@lbl.gov
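
The point in the note above is that the type passed to va_arg() has to match the (promoted) type of the actual argument: reading a long or an unsigned value as "int" is undefined behavior and can crash on some ABIs. A minimal sketch of a length estimator that keeps the types straight is shown here. It is NOT Open MPI's guess_strlen(); `estimate_len` is a hypothetical name, and it handles only %d, %u, %ld, %lu and %% for illustration.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified estimator of the formatted length.
 * Each conversion pulls its argument with the matching va_arg type. */
static size_t estimate_len(const char *fmt, ...)
{
    va_list ap;
    size_t len = 0;

    assert(fmt != NULL);
    va_start(ap, fmt);
    for (const char *p = fmt; *p != '\0'; ++p) {
        if (*p != '%') {
            ++len;                      /* ordinary character */
            continue;
        }
        ++p;                            /* skip the '%' */
        int is_long = (*p == 'l');
        if (is_long) {
            ++p;                        /* skip the 'l' length modifier */
        }
        if (*p == '\0') {
            break;                      /* format ended mid-conversion */
        }
        switch (*p) {
        case 'd':
            len += is_long
                ? (size_t)snprintf(NULL, 0, "%ld", va_arg(ap, long))
                : (size_t)snprintf(NULL, 0, "%d", va_arg(ap, int));
            break;
        case 'u':
            len += is_long
                ? (size_t)snprintf(NULL, 0, "%lu", va_arg(ap, unsigned long))
                : (size_t)snprintf(NULL, 0, "%u", va_arg(ap, unsigned int));
            break;
        case '%':
            ++len;                      /* literal percent sign */
            break;
        default:
            break;                      /* other conversions: not covered here */
        }
    }
    va_end(ap);
    return len;
}
```

Calling snprintf() with a NULL buffer and size 0 is a standard C99 way to obtain the length a conversion would produce, which avoids hand-counting digits.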

Re: [OMPI devel] OMPI devel] [1.8.4rc2] orterun SEGVs on Solaris-10/SPARC

2014-12-12 Thread Paul Hargrove
: > Afraid I'm drawing a blank, Paul - I can't see how we got to a bad address > down there. This is at the beginning of orte_init, so there are no threads > running nor has anything much happened. > > Do you have any suggestions? > > > On Dec 12, 2014, at 9:02 AM, Paul Harg

Re: [OMPI devel] [1.8.4rc2] orterun SEGVs on Solaris-10/SPARC

2014-12-12 Thread Paul Hargrove
h Castain <r...@open-mpi.org> wrote: > > Thanks Paul - I will post a fix for this tomorrow. Looks like Sparc isn't > returning an architecture type for some reason, and I didn't protect > against it. > > > On Dec 11, 2014, at 7:39 PM, Paul Hargrove <phhargr...

[OMPI devel] [1.8.4rc3] dangling symlinks

2014-12-12 Thread Paul Hargrove
On a Linux system configured without java support I see the following two dangling symlinks installed in ${prefix}/bin: lrwxrwxrwx 1 phhargrove phhargrove 8 Dec 11 23:52 oshjavac -> mpijavac lrwxrwxrwx 1 phhargrove phhargrove 8 Dec 11 23:52 shmemjavac -> mpijavac It seems there is some logic

Re: [OMPI devel] [1.8.4rc3] false report of no loopback interface + segv at exit

2014-12-12 Thread Paul Hargrove
is test there is a loopback interface, period. > the current code (my bad for not having reviewed in a timely manner) seems to > check > there is a *selected* loopback interface. > > Cheers, > > Gilles > > On 2014/12/12 13:15, Paul Hargrove wrote: > > Ralph,

Re: [OMPI devel] [1.8.4rc3] false report of no loopback interface + segv at exit

2014-12-12 Thread Paul Hargrove
not having reviewed in a timely manner) seems > to check > there is a *selected* loopback interface. > > Cheers, > > Gilles > > > On 2014/12/12 13:15, Paul Hargrove wrote: > > Ralph, > > Sorry to be the bearer of more bad news. > The "good"

[OMPI devel] [1.8.4rc3] false report of no loopback interface + segv at exit

2014-12-11 Thread Paul Hargrove
Ralph, Sorry to be the bearer of more bad news. The "good" news is I've seen the new warning regarding the lack of a loopback interface. The BAD news is that it is occurring on a Linux cluster that I've verified DOES have 'lo' configured on the front-end and compute nodes (UP and RUNNING

[OMPI devel] [1.8.4rc2] orterun SEGVs on Solaris-10/SPARC

2014-12-11 Thread Paul Hargrove
> No, that looks different - it's failing in mpirun itself. Can you get a > line number on it? > > Sorry for delay - I'm generating rc3 now > > > On Dec 11, 2014, at 6:59 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: > > Don't see an rc3 yet. > > My Solaris-10/SPARC

Re: [OMPI devel] [1.8.4rc2] orted SEGVs on Solaris-11/x86-64

2014-12-11 Thread Paul Hargrove
> On Dec 11, 2014, at 3:08 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: > > Testing the 1.8.4rc2 tarball on my x86-64 Solaris-11 systems I am getting > the following crash for both "-m32" and "-m64" builds: > > $ mpirun -mca btl sm,self,openib -np 2 -host

[OMPI devel] [1.8.4rc2] build broken by default on SGI UV

2014-12-11 Thread Paul Hargrove
I think I've reported this earlier in the 1.8 series. If I compile on an SGI UV (e.g. blacklight at PSC) configure picks up the presence of xpmem headers and enables the vader BTL. However, the port of vader to SGI's "flavor" of xpmem is incomplete and the following build failure results:

[OMPI devel] [1.8.4rc2] orted SEGVs on Solaris-11/x86-64

2014-12-11 Thread Paul Hargrove
Testing the 1.8.4rc2 tarball on my x86-64 Solaris-11 systems I am getting the following crash for both "-m32" and "-m64" builds: $ mpirun -mca btl sm,self,openib -np 2 -host pcp-j-19,pcp-j-20 examples/ring_c' [pcp-j-19:18762] *** Process received signal *** [pcp-j-19:18762] Signal: Segmentation

Re: [OMPI devel] still supporting pgi?

2014-12-11 Thread Paul Hargrove
Howard, I regularly test release candidates against the PGI installations on NERSC's systems (and sometimes elsewhere). In fact, I have a test of 1.8.4rc2 against pgi-14.4 "in the pipe" right now. I believe Larry Baker of USGS is also a PGI user (in production, rather than just testing as I do).

Re: [OMPI devel] OpenMPI v1.8 and v1.8.3 mpirun hangs at execution on an embedded ARM Linux kernel version 3.15.0

2014-12-02 Thread Paul Hargrove
ot doing so - mostly because it > generally isn't necessary on a cluster. > > > is a backport (since this is already available in the trunk/master) simply > out of the question ? > > > It would be against our normal procedures, but I can raise it at next > wee

Re: [OMPI devel] OpenMPI v1.8 and v1.8.3 mpirun hangs at execution on an embedded ARM Linux kernel version 3.15.0

2014-11-25 Thread Paul Hargrove
On Tue, Nov 25, 2014 at 5:37 PM, Ralph Castain wrote: > So it looks like the issue isn't so much with our code as it is with the > OS stack, yes? We aren't requiring that the loopback be "up", but the stack > is in order to establish the connection, even when we are trying a

Re: [OMPI devel] OpenMPI v1.8 and v1.8.3 mpirun hangs at execution on an embedded ARM Linux kernel version 3.15.0

2014-11-25 Thread Paul Hargrove
.@open-mpi.org> wrote: > I'll have to look - there isn't supposed to be such a requirement, and I > certainly haven't seen it before. > > > On Nov 25, 2014, at 3:26 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: > > Allan, > > I am glad things are working for you now.

Re: [OMPI devel] OpenMPI v1.8 and v1.8.3 mpirun hangs at execution on an embedded ARM Linux kernel version 3.15.0

2014-11-25 Thread Paul Hargrove
nable-debug? >>> >>> If you can do that, or already have, then let's add the following >>> to >>> the mpirun cmd line: >>> >>> -mca state_base_verbose 10 -mca odls_base_verbose 10 -mca >>> oob_base_verbose 10 >>>

Re: [OMPI devel] OpenMPI v1.8 and v1.8.3 mpirun hangs at execution on an embedded ARM Linux kernel version 3.15.0

2014-11-25 Thread Paul Hargrove
rnel, and UNIX-domain sockets and Sys V > IPC are both enabled in the build. Are there any other possibilities I can > check? > > Thanks, > Di > > -- > Di Wu (Allan) > PhD student, VAST Laboratory <http://vast.cs.ucla.edu/>, > Department of Computer Science, UC Los Ange

Re: [OMPI devel] OpenMPI v1.8 and v1.8.3 mpirun hangs at execution on an embedded ARM Linux kernel version 3.15.0

2014-11-25 Thread Paul Hargrove
Allan, A likely possibility is that some important kernel feature (that Open MPI assumes is present) is missing. That includes not only "kernel modules" as you mention, but also features configured in (or out of) the base kernel. For instance, some embedded kernels omit UNIX-domain sockets and

Re: [OMPI devel] OpenMPI v1.8 and v1.8.3 mpirun hangs at execution on an embedded ARM Linux kernel version 3.15.0

2014-11-25 Thread Paul Hargrove
Ralph, I downloaded the attachment and found it to be a gzipped tar file containing a single text file "log". I have attached the bzipped (not tarred) log file. -Paul On Tue, Nov 25, 2014 at 7:29 AM, Ralph Castain wrote: > I don't know what you put in that log file, but it

Re: [OMPI devel] RFC: revamp btl rdma interface

2014-11-05 Thread Paul Hargrove
All atomics must be done through not just "the same btl" but the same btl MODULE, since atomics from two IB HCAs, for instance, are not necessarily coherent. So, how is the "best" one to be selected? -Paul [Sent from my phone] On Nov 5, 2014 7:15 AM, "Nathan Hjelm" wrote: > >

Re: [OMPI devel] thread-tests hang

2014-11-05 Thread Paul Hargrove
Jeff wrote: MPI_THREAD_MULTIPLE support barely works in v1.8. Why have it on by default, especially when there's a performance penalty? I think the "barely works" state of threading support is a stronger argument for return to the 1.6.x behavior than PSM performance. Who knows what subtle bugs

Re: [OMPI devel] [1.8.4rc1] REGRESSION on Solaris-11/x86 with two subnets

2014-11-03 Thread Paul Hargrove
would you be able to try this with the latest trunk tarball? > This looks familiar to me, and I wonder if we are just missing a changeset > from the trunk that fixed the handshake issues we had with failing over > from one transport to another. > > Ralph > > On Nov 3, 2014, at 7

Re: [OMPI devel] [1.8.4rc1] REGRESSION on Solaris-11/x86 with two subnets

2014-11-03 Thread Paul Hargrove
think of Gilles's recent issues w/ errno on Solaris unless _REENTRANT was defined. So, I tried building again after configuring with CFLAGS=-D_REENTRANT AND THAT DID THE TRICK. -Paul On Mon, Nov 3, 2014 at 7:23 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: > Ralph, > > Requested out

Re: [OMPI devel] [1.8.4rc1] REGRESSION on Solaris-11/x86 with two subnets

2014-11-03 Thread Paul Hargrove
set -mca oob_base_verbose 20? I'm not sure why the > connection is failing. > > Thanks > Ralph > > On Nov 3, 2014, at 5:56 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: > > Not clear if the following failure is Solaris-specific, but it *IS* a > regression relative to 1

[OMPI devel] [1.8.4rc1] REGRESSION on Solaris-11/x86 with two subnets

2014-11-03 Thread Paul Hargrove
Not clear if the following failure is Solaris-specific, but it *IS* a regression relative to 1.8.3. The system has 2 IPV4 interfaces: Ethernet on 172.16.0.119/16 IPoIB on 172.18.0.119/16 $ ifconfig bge0 bge0: flags=1004843 mtu 1500 index 2

Re: [OMPI devel] OMPI devel] [OMPI commits] Git: open-mpi/ompi branch master updated. dev-198-g68bec0a

2014-11-03 Thread Paul Hargrove
IIRC it was not possible to merge with a dirty tree with git 1.7. So, Dave, you may have been bitten in those dark days. -Paul On Mon, Nov 3, 2014 at 8:49 AM, Dave Goodell (dgoodell) wrote: > On Nov 3, 2014, at 10:41 AM, Jed Brown wrote: > > > "Dave

Re: [OMPI devel] OMPI devel] [OMPI commits] Git: open-mpi/ompi branch master updated. dev-198-g68bec0a

2014-11-03 Thread Paul Hargrove
On Mon, Nov 3, 2014 at 8:29 AM, Dave Goodell (dgoodell) wrote: > > btw, is there a push option to abort if that would make github history > non linear ? > > No, not really. There are some options to "pull" to prevent you from > creating a merge commit, but the fix when you

Re: [OMPI devel] Error: undefined reference `__builtin_va_gparg1'

2014-10-29 Thread Paul Hargrove
Amit, You appear to be mixing PGI and GNU compilers, as shown by the "g++" in the final portion of your output. You must configure Open MPI with all compilers (C, C++ and Fortran) from the same "family". -Paul On Wed, Oct 29, 2014 at 1:11 PM, Kumar, Amit wrote: > Dear

Re: [OMPI devel] configure.m4 for pmix/s1 and pmix/s2 question

2014-10-28 Thread Paul Hargrove
are supposed to include > all dependencies on headers files, libs, etc. from other Cray packages. > > Howard > > > > > 2014-10-28 13:20 GMT-06:00 Ralph Castain <r...@open-mpi.org>: > >> >> On Oct 28, 2014, at 12:17 PM, Paul Hargrove <phhargr...@lbl.gov

Re: [OMPI devel] configure.m4 for pmix/s1 and pmix/s2 question

2014-10-28 Thread Paul Hargrove
On Tue, Oct 28, 2014 at 12:20 PM, Ralph Castain <r...@open-mpi.org> wrote: > On Oct 28, 2014, at 12:17 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: > > Ralph, > > The Crays at NERSC have *both* pmi_cray.h and pmi.h (and pmi2.h as well). > > > I

Re: [OMPI devel] configure.m4 for pmix/s1 and pmix/s2 question

2014-10-28 Thread Paul Hargrove
/opt/cray/pmi/default/include/pmi_cray.h cray-libpmi-devel-5.0.5-1..10300.134.8.ari -Paul On Tue, Oct 28, 2014 at 12:02 PM, Ralph Castain <r...@open-mpi.org> wrote: > > On Oct 28, 2014, at 11:59 AM, Paul Hargrove <phhargr...@lbl.gov> wrote: > > > On Tue, Oct 2

Re: [OMPI devel] configure.m4 for pmix/s1 and pmix/s2 question

2014-10-28 Thread Paul Hargrove
On Tue, Oct 28, 2014 at 11:53 AM, Howard Pritchard wrote: > >> We may no longer require those as you have separated the Cray check out, >> but the original problem is that we would pickup the Slurm components on >> the Cray because we would find pmi.h >> >> Oh, I forgot

Re: [OMPI devel] errno and reentrance

2014-10-27 Thread Paul Hargrove
27, 2014 at 2:48 AM, Gilles Gouaillardet < gilles.gouaillar...@iferc.org> wrote: > Thanks Paul ! > > Gilles > > On 2014/10/27 18:47, Paul Hargrove wrote: > > On Mon, Oct 27, 2014 at 2:42 AM, Gilles Gouaillardet > <gilles.gouaillar...@iferc.org> wrote

Re: [OMPI devel] errno and reentrance

2014-10-27 Thread Paul Hargrove
On Mon, Oct 27, 2014 at 2:42 AM, Gilles Gouaillardet < gilles.gouaillar...@iferc.org> wrote: [...] > Paul, since you have access to many platforms, could you please run this > test with and without -D_REENTRANT / -D_THREAD_SAFE > and tell me where the program produces incorrect behaviour (output

Re: [OMPI devel] Deprecated call in sharedfp framework

2014-10-24 Thread Paul Hargrove
I can shed some light on these warnings. sem_init() and sem_destroy() are POSIX-defined interfaces for UNNAMED semaphores. There are also POSIX interfaces, sem_{open,close,unlink}(), that operate on NAMED semaphores. See for more info:

Re: [OMPI devel] Open MPI 1.8: link problem when Fortran+C+Platform LSF

2014-10-20 Thread Paul Hargrove
> Schinkelstrasse 2, Room 222a > D-52062 Aachen > > Phone +49 (0)241 80 99932 > fri...@cats.rwth-aachen.de > http://www.cats.rwth-aachen.de > > On 18.10.2014, at 02:24, Paul Hargrove <phhargr...@lbl.gov> wrote: > > I know of two possibilities: > >

Re: [OMPI devel] Fwd: Open MPI 1.8: link problem when Fortran+C+Platform LSF

2014-10-17 Thread Paul Hargrove
I know of two possibilities: 1) I cannot be certain but since the message concerns a PC-relative addressing mode, it is possible that something needs to be compiled with -fPIC to fix the issue. See if adding that option to any of the mpicc commands helps. 2) Try adding ONE of "-ll", "-lfl" or

Re: [OMPI devel] RFC: calloc instead of malloc in opal_obj_new()

2014-10-03 Thread Paul Hargrove
ng calloc() when --with-valgrind > is specified on the command line? > > I.e., don't tie it to debug builds, but to valgrind-enabled builds? > > > On Oct 3, 2014, at 6:11 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: > > > I agree with George that zeroing memory

Re: [OMPI devel] RFC: calloc instead of malloc in opal_obj_new()

2014-10-03 Thread Paul Hargrove
I agree with George that zeroing memory only in the debug builds could hide bugs, and thus would want to see the debug and non-debug builds have the same behavior (both malloc or both calloc). So, I also agree this looks initially like a hard choice. What about using malloc() in non-debug builds

Re: [OMPI devel] Bitbucket vs. GitHub (was: Conversion to GitHub: POSTPONED)

2014-09-25 Thread Paul Hargrove
) are at Github. > > 2. I just sent a mail to Github support asking them if they plan to > support per-branch push ACLs. I don't know if they'll be able to give a > direct answer, but it's worth asking. > > > > It would be a little weird to span Github and Bitbucket, but th

Re: [OMPI devel] CONVERSION TO GITHUB

2014-09-17 Thread Paul Hargrove
On Wed, Sep 17, 2014 at 8:06 AM, Jeff Squyres (jsquyres) wrote: > I actually have the mapping already. The *only* ID that is preserved > between the two will be who the ticket is assigned to. You sent out email asking for SVN -> github ID mapping, but did NOT ask about

Re: [OMPI devel] CONVERSION TO GITHUB

2014-09-16 Thread Paul Hargrove
> Not really. > > One minor point: you'll need a Github account to file Github issues (i.e., > what's replacing Trac tickets) and/or use the code commenting tools. > > > > On Sep 16, 2014, at 2:33 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: > > > Jeff, >

Re: [OMPI devel] 1.8.3rc1 - start your engines

2014-09-16 Thread Paul Hargrove
on Linux. -Paul On Sun, Sep 14, 2014 at 8:55 PM, Ralph Castain <r...@open-mpi.org> wrote: > Your contributions are always appreciated, Paul - thanks! > > On Sep 13, 2014, at 7:51 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: > > Ralph, > > I am not sure if I wi

Re: [OMPI devel] CONVERSION TO GITHUB

2014-09-16 Thread Paul Hargrove
Jeff, Any instructions for those who have never had Subversion accounts, but do have Trac accounts? You know... the people like me who primarily just make work for others :-) -Paul On Tue, Sep 16, 2014 at 10:34 AM, Jeff Squyres (jsquyres) < jsquy...@cisco.com> wrote: > Short version >

Re: [OMPI devel] 1.8.3rc1 - start your engines

2014-09-13 Thread Paul Hargrove
Ralph, I am not sure if I will have time to run my full suite of configurations, including all the PGI, Sun, Intel and IBM compilers on Linux. However, the following non-(Linux/x86-64) platforms have passed: + Linux/{PPC32,PPC64,IA64} + Solaris-10/{SPARC-v8+,SPARC-v9} (Oracle and GNU compilers)

Re: [OMPI devel] Envelope of HINDEXED_BLOCK

2014-08-26 Thread Paul Hargrove
> > libtoolize: putting libltdl files in LT_CONFIG_LTDL_DIR, `opal/libltdl'. > libtoolize: `COPYING.LIB' not found in `/usr/share/libtool/libltdl' > autoreconf: libtoolize failed with exit status: 1 > > The error message is from libtoolize about a file missing from the libtool installation

Re: [OMPI devel] [OMPI svn] svn:open-mpi r32555 - trunk/opal/mca/btl/scif

2014-08-20 Thread Paul Hargrove
Can somebody confirm that configure is adding "-c9x" or "-c99" to CFLAGS with this compiler? If not then r32555 could possibly be reverted in favor of adding the proper compiler flag. Also, I am suspicious of this failure because even without a language-level option pgcc 12.9 and 13.4 compile the

Re: [OMPI devel] 1.8.4rc4 is out

2014-08-15 Thread Paul Hargrove
My testing has additionally passed on IA64 ARM - v5 and v7 MIPS - "32", "n32" and "64" ABIs -Paul On Wed, Aug 13, 2014 at 9:18 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: > I have completed testing the majority of the platforms I have access t

Re: [OMPI devel] 1.8.4rc4 is out

2014-08-14 Thread Paul Hargrove
rify the configure --help message about when OpenSHMEM is > enabled/disabled by default. Thanks to Paul Hargrove for the > suggestion. > - Align pages properly where relevant. Thanks to Paul Hargrove for > identifying the issue. > - Various compiler warning and minor fixes

Re: [OMPI devel] [1.8.2rc4] build failure with --enable-osx-builtin-atomics

2014-08-14 Thread Paul Hargrove
Fix confirmed using the nightly tarball (v1.8rc5r32531). -Paul On Wed, Aug 13, 2014 at 6:16 PM, Ralph Castain <r...@open-mpi.org> wrote: > Thanks Paul - fixed in r32530 > > > > On Wed, Aug 13, 2014 at 2:42 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:

[OMPI devel] [1.8.2rc4] OSHMEM fortran bindings with bad compilers

2014-08-13 Thread Paul Hargrove
The following is NOT a bug report. This is just an observation that may deserve some text in the README. I've reported issues in the past with some Fortran compilers (mostly older XLC and PGI) which either cannot build the "use mpi_f08" module, or cannot correctly link to it (and sometimes this

[OMPI devel] [1.8.2rc4] build failure with --enable-osx-builtin-atomics

2014-08-13 Thread Paul Hargrove
When configured with --enable-osx-builtin-atomics Making all in asm CC asm.lo In file included from /Users/Paul/OMPI/openmpi-1.8.2rc4-macos10.8-x86-clang-atomics/openmpi-1.8.2rc4/opal/asm/asm.c:21:

Re: [OMPI devel] trunk hang when nodes have similar but private network

2014-08-13 Thread Paul Hargrove
I think that in this case one *could* add logic that would disqualify the subnet because every compute node in the job has the SAME address. In fact, any subnet on which two or more compute nodes have the same address must be suspect. If this logic were introduced, the 127.0.0.1 loopback address
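The check described in this message can be sketched roughly as follows. This is only an illustration of the idea (flag any subnet on which two or more nodes report the same address), not Open MPI's actual code: the function name `suspect_subnets`, the `node_addrs` mapping, and the fixed /24 prefix length are all assumptions made for the example.

```python
from collections import defaultdict
import ipaddress


def suspect_subnets(node_addrs, prefix_len=24):
    """Return the subnets on which two or more nodes share an address.

    node_addrs is a hypothetical mapping: node name -> list of IPv4
    address strings. prefix_len is an assumed, uniform netmask length.
    """
    # subnet -> address -> set of nodes reporting that address
    seen = defaultdict(lambda: defaultdict(set))
    for node, addrs in node_addrs.items():
        for addr in addrs:
            # strict=False lets a host address stand in for its network
            subnet = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
            seen[subnet][addr].add(node)

    # A subnet is suspect if any single address is claimed by >= 2 nodes
    return {
        str(subnet)
        for subnet, by_addr in seen.items()
        if any(len(nodes) >= 2 for nodes in by_addr.values())
    }


if __name__ == "__main__":
    nodes = {
        "n0": ["127.0.0.1", "10.0.0.1", "192.168.1.5"],
        "n1": ["127.0.0.1", "10.0.0.2", "192.168.1.5"],
    }
    print(sorted(suspect_subnets(nodes)))
```

In this sketch the loopback subnet is flagged by the same rule as the private network where both nodes report 192.168.1.5, so no address needs special-casing; the 10.0.0.x network is left alone because each node has a distinct address on it.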

Re: [OMPI devel] RFC: add atomic compare-and-swap that returns old value

2014-08-11 Thread Paul Hargrove
atch > I > >> submitted and that Paul validated. > >> What Nathan said he might take a look at is a different method for > >> generating assembly code, one that only supports ARMv7 and later. > >> George. > >> > >> On Mon, Aug 11, 20

Re: [OMPI devel] RFC: add atomic compare-and-swap that returns old value

2014-08-11 Thread Paul Hargrove
I am on the same page with George here - if it's on the list then support it until it's been removed. I happen to have systems to test, I believe, every supported atomics implementation except for DEC Alpha, and so I did test them all. AFAIK ARMv5 is even outdated as a smartphone platform.

Re: [OMPI devel] [v1.8] build failure with xlc-11.1

2014-08-09 Thread Paul Hargrove
configure/build could produce the reported failure, my test did NOT represent a valid set of conditions. -Paul On Sat, Aug 9, 2014 at 1:29 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: > One note regarding my report below: > > I have noticed that autoconf has chosen to use "$src

Re: [OMPI devel] [v1.8] build failure with xlc-11.1

2014-08-09 Thread Paul Hargrove
does seem to be some flaw in how the atomics are getting built on this configuration. -Paul On Sat, Aug 9, 2014 at 1:22 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: > Building v1.8 nightly tarball with xlc-11.1 on a ppc64/linux platform: > > Making all in asm > make[2]: Ent

[OMPI devel] cosmetic configure nit

2014-08-09 Thread Paul Hargrove
One too many 's' characters in the following: checking for asssembly architecture... -Paul -- Paul H. Hargrove phhargr...@lbl.gov Future Technologies Group Computer and Data Sciences Department Tel: +1-510-495-2352 Lawrence Berkeley National Laboratory Fax:

[OMPI devel] [v1.8] build failure with xlc-11.1

2014-08-09 Thread Paul Hargrove
Building v1.8 nightly tarball with xlc-11.1 on a ppc64/linux platform: Making all in asm make[2]: Entering directory `/home/hargrov1/OMPI/openmpi-1.8-latest-bluedrop-64-xlc/BLD/opal/asm' CC asm.lo rm -f atomic-asm.S ln -s "../../opal/asm/generated/atomic-local.s" atomic-asm.S CPPAS

Re: [OMPI devel] [v1.8] 32-bit c++ build failure with Sun compilers

2014-08-09 Thread Paul Hargrove
- thanks! > > On Aug 9, 2014, at 12:24 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: > > Ralph, > > Yes, that did the trick. > The attached patch applied cleanly to last night's v1.8 tarball > (1.8.2rc4r32480) and I was able to build the C++ bindings on this platform. >
