Re: [OMPI devel] Conversion to GitHub: POSTPONED

2014-09-23 Thread Paul Hargrove
, 2014, at 7:52 PM, Jed Brown wrote: > > > >> I don't have experience with GerritHub, but Bitbucket supports this > >> feature (permissions on branch names/globs) and we use it in PETSc. > > Thanks for the info. Paul Hargrove said pretty much the same thing to

Re: [OMPI devel] Bitbucket vs. GitHub (was: Conversion to GitHub: POSTPONED)

2014-09-25 Thread Paul Hargrove
Bitbucket today. It may be a workable > model that the main OMPI repo (and wiki and tickets) is at Bitbucket, and > most other repos (and wikis and tickets) are at Github. > > 2. I just sent a mail to Github support asking them if they plan to > support per-branch push ACLs. I don'

Re: [OMPI devel] Bitbucket vs. GitHub (was: Conversion to GitHub: POSTPONED)

2014-09-25 Thread Paul Hargrove
On Thu, Sep 25, 2014 at 2:28 PM, Jed Brown wrote: > Paul Hargrove writes: > > The GUIs for things like browsing commits, viewing diffs, etc are pretty > > similar in capability and each is sufficiently intuitive (after a brief > > learning curve) that I don't find I ne

Re: [OMPI devel] RFC: calloc instead of malloc in opal_obj_new()

2014-10-03 Thread Paul Hargrove
I agree with George that zeroing memory only in the debug builds could hide bugs, and thus would want to see the debug and non-debug builds have the same behavior (both malloc or both calloc). So, I also agree this looks initially like a hard choice. What about using malloc() in non-debug builds

Re: [OMPI devel] RFC: calloc instead of malloc in opal_obj_new()

2014-10-03 Thread Paul Hargrove
) when --with-valgrind > is specified on the command line? > > I.e., don't tie it to debug builds, but to valgrind-enabled builds? > > > On Oct 3, 2014, at 6:11 PM, Paul Hargrove wrote: > > > I agree with George that zeroing memory only in the debug builds could > hid
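A minimal sketch of the trade-off discussed in this thread: a single compile-time switch chooses between malloc() and calloc(), so debug and optimized builds never differ in whether memory is zeroed. The macro SKETCH_ZERO_ALLOC and the function sketch_obj_alloc() are hypothetical names used for illustration only; they are not Open MPI symbols, and how a configure option would define the macro is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical switch: build with -DSKETCH_ZERO_ALLOC to get zero-filled
 * objects. It is deliberately independent of any debug/non-debug setting,
 * so both kinds of build behave the same way. */
static void *sketch_obj_alloc(size_t size)
{
    assert(size > 0);
#ifdef SKETCH_ZERO_ALLOC
    return calloc(1, size);   /* zero-filled memory */
#else
    return malloc(size);      /* contents are indeterminate */
#endif
}
```

Tying the switch to a memory-checker build rather than to the debug setting keeps uninitialized reads visible in every ordinary build.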

Re: [OMPI devel] Fwd: Open MPI 1.8: link problem when Fortran+C+Platform LSF

2014-10-17 Thread Paul Hargrove
I know of two possibilities: 1) I cannot be certain but since the message concerns a PC-relative addressing mode, it is possible that something needs to be compiled with -fPIC to fix the issue. See if adding that option to any of the mpicc commands helps. 2) Try adding ONE of "-ll", "-lfl" or "-

Re: [OMPI devel] Open MPI 1.8: link problem when Fortran+C+Platform LSF

2014-10-20 Thread Paul Hargrove
> D-52062 Aachen > > Phone +49 (0)241 80 99932 > fri...@cats.rwth-aachen.de > http://www.cats.rwth-aachen.de > > On 18.10.2014, at 02:24, Paul Hargrove wrote: > > I know of two possibilities: > > 1) I cannot be certain but since the message concerns a PC-re

Re: [OMPI devel] Deprecated call in sharedfp framework

2014-10-24 Thread Paul Hargrove
I can shed some light on these warnings. sem_init() and sem_destroy() are POSIX-defined interfaces for UNNAMED semaphores. There are also POSIX interfaces, sem_{open,close,unlink}(), that operate on NAMED semaphores. See for more info: http://pubs.opengroup.org/onlinepubs/009695399/basedefs/sema

Re: [OMPI devel] errno and reentrance

2014-10-27 Thread Paul Hargrove
On Mon, Oct 27, 2014 at 2:42 AM, Gilles Gouaillardet < gilles.gouaillar...@iferc.org> wrote: [...] > Paul, since you have access to many platforms, could you please run this > test with and without -D_REENTRANT / -D_THREAD_SAFE > and tell me where the program produces incorrect behaviour (output i

Re: [OMPI devel] errno and reentrance

2014-10-27 Thread Paul Hargrove
l On Mon, Oct 27, 2014 at 2:48 AM, Gilles Gouaillardet < gilles.gouaillar...@iferc.org> wrote: > Thanks Paul ! > > Gilles > > On 2014/10/27 18:47, Paul Hargrove wrote: > > On Mon, Oct 27, 2014 at 2:42 AM, Gilles Gouaillardet > wrote: > [...] > >

Re: [OMPI devel] configure.m4 for pmix/s1 and pmix/s2 question

2014-10-28 Thread Paul Hargrove
On Tue, Oct 28, 2014 at 11:53 AM, Howard Pritchard wrote: > >> We may no longer require those as you have separated the Cray check out, >> but the original problem is that we would pickup the Slurm components on >> the Cray because we would find pmi.h >> >> Oh, I forgot about that . > In GASNet

Re: [OMPI devel] configure.m4 for pmix/s1 and pmix/s2 question

2014-10-28 Thread Paul Hargrove
edison08 ~}$ rpm -qf /opt/cray/pmi/default/include/pmi_cray.h cray-libpmi-devel-5.0.5-1..10300.134.8.ari -Paul On Tue, Oct 28, 2014 at 12:02 PM, Ralph Castain wrote: > > On Oct 28, 2014, at 11:59 AM, Paul Hargrove wrote: > > > On Tue, Oct 28, 2014 at 11:53 AM, Howard Pritchard

Re: [OMPI devel] configure.m4 for pmix/s1 and pmix/s2 question

2014-10-28 Thread Paul Hargrove
On Tue, Oct 28, 2014 at 12:20 PM, Ralph Castain wrote: > On Oct 28, 2014, at 12:17 PM, Paul Hargrove wrote: > > Ralph, > > The Cray's at NERSC have *both* pmi_cray.h and pmi.h (and pmi2.h as well). > > > I understand that - I was questioning if that is univers

Re: [OMPI devel] configure.m4 for pmix/s1 and pmix/s2 question

2014-10-28 Thread Paul Hargrove
kages are supposed to include > all dependencies on header files, libs, etc. from other Cray packages. > > Howard > > > > > 2014-10-28 13:20 GMT-06:00 Ralph Castain : > >> >> On Oct 28, 2014, at 12:17 PM, Paul Hargrove wrote: >> >> Ralph, >>

Re: [OMPI devel] Error: undefined reference `__builtin_va_gparg1'

2014-10-29 Thread Paul Hargrove
Amit, You appear to be mixing PGI and GNU compilers, as shown by the "g++" in the final portion of your output. You must configure Open MPI with all compilers (C, C++ and Fortran) from the same "family". -Paul On Wed, Oct 29, 2014 at 1:11 PM, Kumar, Amit wrote: > Dear Developers, > > I have r

Re: [OMPI devel] OMPI devel] [OMPI commits] Git: open-mpi/ompi branch master updated. dev-198-g68bec0a

2014-11-03 Thread Paul Hargrove
On Mon, Nov 3, 2014 at 8:29 AM, Dave Goodell (dgoodell) wrote: > > btw, is there a push option to abort if that would make github history > non linear ? > > No, not really. There are some options to "pull" to prevent you from > creating a merge commit, but the fix when you encounter that situati

Re: [OMPI devel] OMPI devel] [OMPI commits] Git: open-mpi/ompi branch master updated. dev-198-g68bec0a

2014-11-03 Thread Paul Hargrove
IIRC it was not possible to merge with a dirty tree with git 1.7. So, Dave, you may have been bitten in those dark days. -Paul On Mon, Nov 3, 2014 at 8:49 AM, Dave Goodell (dgoodell) wrote: > On Nov 3, 2014, at 10:41 AM, Jed Brown wrote: > > > "Dave Goodell (dgoodell)" writes: > >> Most of the

[OMPI devel] [1.8.4rc1] REGRESSION on Solaris-11/x86 with two subnets

2014-11-03 Thread Paul Hargrove
Not clear if the following failure is Solaris-specific, but it *IS* a regression relative to 1.8.3. The system has 2 IPV4 interfaces: Ethernet on 172.16.0.119/16 IPoIB on 172.18.0.119/16 $ ifconfig bge0 bge0: flags=1004843 mtu 1500 index 2 inet 172.16.0.119 netmask broadcas
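For readers following the two-subnet setup, here is a small sketch of how one can check whether two IPv4 addresses share a subnet for a given prefix length. The function name sketch_same_subnet() is hypothetical, and this is not the logic Open MPI uses to pair interfaces; it only illustrates why the two /16 networks listed above are distinct.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <sys/socket.h>

/* Returns 1 if both dotted-quad addresses fall in the same subnet for the
 * given prefix length, 0 if they do not, and -1 if either fails to parse. */
static int sketch_same_subnet(const char *a, const char *b, unsigned prefix)
{
    struct in_addr ia, ib;
    uint32_t mask;

    assert(prefix <= 32);
    if (inet_pton(AF_INET, a, &ia) != 1 || inet_pton(AF_INET, b, &ib) != 1) {
        return -1;
    }
    /* Avoid shifting a 32-bit value by 32, which is undefined in C. */
    mask = (prefix == 0) ? 0u : (UINT32_C(0xFFFFFFFF) << (32 - prefix));
    /* Compare in host byte order so the mask applies to the leading bits. */
    return ((ntohl(ia.s_addr) ^ ntohl(ib.s_addr)) & mask) == 0;
}
```

With a prefix of 16, 172.16.0.119 and 172.18.0.119 differ in the second octet, so the check reports different subnets.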

Re: [OMPI devel] [1.8.4rc1] REGRESSION on Solaris-11/x86 with two subnets

2014-11-03 Thread Paul Hargrove
'm not sure why the > connection is failing. > > Thanks > Ralph > > On Nov 3, 2014, at 5:56 PM, Paul Hargrove wrote: > > Not clear if the following failure is Solaris-specific, but it *IS* a > regression relative to 1.8.3. > > The system has 2 IPV4 interfaces: >

Re: [OMPI devel] [1.8.4rc1] REGRESSION on Solaris-11/x86 with two subnets

2014-11-03 Thread Paul Hargrove
think of Gilles's recent issues w/ errno on Solaris unless _REENTRANT was defined. So, I tried building again after configuring with CFLAGS=-D_REENTRANT AND THAT DID THE TRICK. -Paul On Mon, Nov 3, 2014 at 7:23 PM, Paul Hargrove wrote: > Ralph, > > Requested output is attached. >

Re: [OMPI devel] [1.8.4rc1] REGRESSION on Solaris-11/x86 with two subnets

2014-11-03 Thread Paul Hargrove
the latest trunk tarball? > This looks familiar to me, and I wonder if we are just missing a changeset > from the trunk that fixed the handshake issues we had with failing over > from one transport to another. > > Ralph > > On Nov 3, 2014, at 7:23 PM, Paul Hargrove wrote:

Re: [OMPI devel] thread-tests hang

2014-11-05 Thread Paul Hargrove
Jeff wrote: MPI_THREAD_MULTIPLE support barely works in v1.8. Why have it on by default, especially when there's a performance penalty? I think the "barely works" state of threading support is a stronger argument for return to the 1.6.x behavior than PSM performance. Who knows what subtle bugs h

Re: [OMPI devel] RFC: revamp btl rdma interface

2014-11-05 Thread Paul Hargrove
All atomics must be done through not just "the same btl" but the same btl MODULE, since atomics from two IB HCAs, for instance, are not necessarily coherent. So, how is the "best" one to be selected? -Paul [Sent from my phone] On Nov 5, 2014 7:15 AM, "Nathan Hjelm" wrote: > > In the new osc com

Re: [OMPI devel] OpenMPI v1.8 and v1.8.3 mpirun hangs at execution on an embedded ARM Linux kernel version 3.15.0

2014-11-25 Thread Paul Hargrove
Ralph, I downloaded the attachment and found it to be a gzipped tar file containing a single text file "log". I have attached the bzipped (not tarred) log file. -Paul On Tue, Nov 25, 2014 at 7:29 AM, Ralph Castain wrote: > I don't know what you put in that log file, but it was an executable an

Re: [OMPI devel] OpenMPI v1.8 and v1.8.3 mpirun hangs at execution on an embedded ARM Linux kernel version 3.15.0

2014-11-25 Thread Paul Hargrove
Allan, A likely possibility is that some important kernel feature (that Open MPI assumes is present) is missing. That includes not only "kernel modules" as you mention, but also features configured in (or out of) the base kernel. For instance, some embedded kernels omit UNIX-domain sockets and SysV

Re: [OMPI devel] OpenMPI v1.8 and v1.8.3 mpirun hangs at execution on an embedded ARM Linux kernel version 3.15.0

2014-11-25 Thread Paul Hargrove
IX-domain sockets and Sys V > IPC are both enabled in the build. Are there any other possibilities I can > check? > > Thanks, > Di > > -- > Di Wu (Allan) > PhD student, VAST Laboratory <http://vast.cs.ucla.edu/>, > Department of Computer Science, UC Los Ang

Re: [OMPI devel] OpenMPI v1.8 and v1.8.3 mpirun hangs at execution on an embedded ARM Linux kernel version 3.15.0

2014-11-25 Thread Paul Hargrove
> Regards, >> Di >> >> On Tue, Nov 25, 2014 at 2:25 PM, Ralph Castain wrote: >> >>> >>> This is all running on a single node, correct? If so, did you configure >>> OMPI with --enable-debug? >>> >>> If you can do that, or

Re: [OMPI devel] OpenMPI v1.8 and v1.8.3 mpirun hangs at execution on an embedded ARM Linux kernel version 3.15.0

2014-11-25 Thread Paul Hargrove
> I'll have to look - there isn't supposed to be such a requirement, and I > certainly haven't seen it before. > > > On Nov 25, 2014, at 3:26 PM, Paul Hargrove wrote: > > Allan, > > I am glad things are working for you now. > I can confirm (on a QE

Re: [OMPI devel] OpenMPI v1.8 and v1.8.3 mpirun hangs at execution on an embedded ARM Linux kernel version 3.15.0

2014-11-25 Thread Paul Hargrove
On Tue, Nov 25, 2014 at 5:37 PM, Ralph Castain wrote: > So it looks like the issue isn't so much with our code as it is with the > OS stack, yes? We aren't requiring that the loopback be "up", but the stack > is in order to establish the connection, even when we are trying a non-lo > interface.

Re: [OMPI devel] OpenMPI v1.8 and v1.8.3 mpirun hangs at execution on an embedded ARM Linux kernel version 3.15.0

2014-12-02 Thread Paul Hargrove
ause it > generally isn't necessary on a cluster. > > > is a backport (since this is already available in the trunk/master) simply > out of the question ? > > > It would be against our normal procedures, but I can raise it at next > week's meeting. > > >

Re: [OMPI devel] still supporting pgi?

2014-12-11 Thread Paul Hargrove
Howard, I regularly test release candidates against the PGI installations on NERSC's systems (and sometimes elsewhere). In fact, have a test of 1.8.4rc2 against pgi-14.4 "in the pipe" right now. I believe Larry Baker of USGS is also a PGI user (in production, rather than just testing as I do).

[OMPI devel] [1.8.4rc2] orted SEGVs on Solaris-11/x86-64

2014-12-11 Thread Paul Hargrove
Testing the 1.8.4rc2 tarball on my x86-64 Solaris-11 systems I am getting the following crash for both "-m32" and "-m64" builds: $ mpirun -mca btl sm,self,openib -np 2 -host pcp-j-19,pcp-j-20 examples/ring_c' [pcp-j-19:18762] *** Process received signal *** [pcp-j-19:18762] Signal: Segmentation Fa

[OMPI devel] [1.8.4rc2] build broken by default on SGI UV

2014-12-11 Thread Paul Hargrove
I think I've reported this earlier in the 1.8 series. If I compile on an SGI UV (e.g. blacklight at PSC) configure picks up the presence of xpmem headers and enables the vader BTL. However, the port of vader to SGI's "flavor" of xpmem is incomplete and the following build failure results: make[2]:

Re: [OMPI devel] [1.8.4rc2] orted SEGVs on Solaris-11/x86-64

2014-12-11 Thread Paul Hargrove
+0x5c [niagara1:29881] *** End of error message *** Segmentation Fault - core dumped On Thu, Dec 11, 2014 at 3:29 PM, Ralph Castain wrote: > Ah crud - incomplete commit means we didn't send the topo string. Will > roll rc3 in a few minutes. > > Thanks, Paul > Ralph > > On

[OMPI devel] [1.8.4rc2] orterun SEGVs on Solaris-10/SPARC

2014-12-11 Thread Paul Hargrove
ifferent - it's failing in mpirun itself. Can you get a > line number on it? > > Sorry for delay - I'm generating rc3 now > > > On Dec 11, 2014, at 6:59 PM, Paul Hargrove wrote: > > Don't see an rc3 yet. > > My Solaris-10/SPARC runs fail slightly differently (

[OMPI devel] [1.8.4rc3] false report of no loopback interface + segv at exit

2014-12-11 Thread Paul Hargrove
Ralph, Sorry to be the bearer of more bad news. The "good" news is I've seen the new warning regarding the lack of a loopback interface. The BAD news is that it is occurring on a Linux cluster that I've verified DOES have 'lo' configured on the front-end and compute nodes (UP and RUNNING accordin

Re: [OMPI devel] [1.8.4rc3] false report of no loopback interface + segv at exit

2014-12-12 Thread Paul Hargrove
for not having reviewed in a timely manner) seems > to check > there is a *selected* loopback interface. > > Cheers, > > Gilles > > > On 2014/12/12 13:15, Paul Hargrove wrote: > > Ralph, > > Sorry to be the bearer of more bad news. > The "good" n

Re: [OMPI devel] [1.8.4rc3] false report of no loopback interface + segv at exit

2014-12-12 Thread Paul Hargrove
ngs : mpirun + 2 orted > + 2 mpi tasks > > do you have any oob_tcp_if_include or oob_tcp_if_exclude settings in your > openmpi-mca-params.conf ? > > here is attached a patch to fix this issue. > what we really want is test there is a loopback interface, period. > the current c

[OMPI devel] [1.8.4rc3] dangling symlinks

2014-12-12 Thread Paul Hargrove
On a Linux system configured without java support I see the following two dangling symlinks installed in ${prefix}/bin: lrwxrwxrwx 1 phhargrove phhargrove 8 Dec 11 23:52 oshjavac -> mpijavac lrwxrwxrwx 1 phhargrove phhargrove 8 Dec 11 23:52 shmemjavac -> mpijavac It seems there is some logic mi

[OMPI devel] [1.8.4rc3] REGRESSION: connection problem on (multi-homed) Solaris host

2014-12-12 Thread Paul Hargrove
First, I want to ask what became of the issue discussed in this thread? http://www.open-mpi.org/community/lists/devel/2014/11/16160.php I thought we had concluded that one just needed -D_REENTRANT. I mention that only for completeness, because I think my current problem is different. The followi

Re: [OMPI devel] [1.8.4rc2] orterun SEGVs on Solaris-10/SPARC

2014-12-12 Thread Paul Hargrove
stain wrote: > > Thanks Paul - I will post a fix for this tomorrow. Looks like Sparc isn't > returning an architecture type for some reason, and I didn't protect > against it. > > > On Dec 11, 2014, at 7:39 PM, Paul Hargrove wrote: > > Backtrace for the Solaris-10/SPARC S

Re: [OMPI devel] OMPI devel] [1.8.4rc2] orterun SEGVs on Solaris-10/SPARC

2014-12-12 Thread Paul Hargrove
> Afraid I'm drawing a blank, Paul - I can't see how we got to a bad address > down there. This is at the beginning of orte_init, so there are no threads > running nor has anything much happened. > > Do you have any suggestions? > > > On Dec 12, 2014,

Re: [OMPI devel] OMPI devel] [1.8.4rc2] orterun SEGVs on Solaris-10/SPARC

2014-12-12 Thread Paul Hargrove
NOTE: The existing code for "%l." in guess_strlen() is garbage. The va_arg() macro calls all have "int" for the type!! I am *only* testing a fix for the missing "%u" at the moment. -Paul On Fri, Dec 12, 2014 at 3:14 PM, Paul Hargrove wrote: > Thanks, Gille

Re: [OMPI devel] [1.8.4rc3] REGRESSION: connection problem on (multi-homed) Solaris host

2014-12-12 Thread Paul Hargrove
imeout 300", and have also attached the resulting stderr. No joy for either timeout value. -Paul > > On Dec 12, 2014, at 8:53 AM, Paul Hargrove wrote: > > > > First, I want to ask what became of the issue discussed in this thread? >http://www.open-mpi.org/community

Re: [OMPI devel] [1.8.4rc3] REGRESSION: connection problem on (multi-homed) Solaris host

2014-12-12 Thread Paul Hargrove
On Fri, Dec 12, 2014 at 4:29 PM, Ralph Castain wrote: > All right - I'll surrender and remove the timeout. Will release rc4 later > tonight. > > Sorry for putting you thru this Paul - for some reason, these problems > aren't showing up elsewhere. > Even at a 300s timeout I don't get a connection

Re: [OMPI devel] [1.8.4rc3] REGRESSION: connection problem on (multi-homed) Solaris host

2014-12-12 Thread Paul Hargrove
ph Castain wrote: > > I'm hoping it will fix it. The timeout code was the only change from 1.8.3 > besides the loopback warning, so it should restore the prior behavior. > > > On Dec 12, 2014, at 4:32 PM, Paul Hargrove wrote: > > > On Fri, Dec 12, 2014 at 4:2

Re: [OMPI devel] OMPI devel] [1.8.4rc2] orterun SEGVs on Solaris-10/SPARC

2014-12-12 Thread Paul Hargrove
have been written to the * final string if enough space had been available. */ static int guess_strlen(const char *fmt, va_list ap) { char dummy[1]; return 1 + vsnprintf(dummy, 1, fmt, ap); } BTW: I do see some messages like "select: Interrupted system call" which I assume are
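As a hedged illustration of the vsnprintf()-based approach quoted above, the sketch below measures the required length first and then formats into a buffer of exactly that size. It assumes a C99-conforming vsnprintf() that returns the length the full output would have had; older implementations may return -1 instead. The names sketch_measure() and sketch_format_alloc() are hypothetical. Unlike the quoted snippet, the sketch uses va_copy(), because the value of a va_list is indeterminate after it has been passed to vsnprintf().

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Returns the buffer size needed (including the terminating NUL),
 * or -1 if vsnprintf() reports an error. Leaves ap untouched. */
static int sketch_measure(const char *fmt, va_list ap)
{
    char dummy[1];
    va_list copy;
    int n;

    va_copy(copy, ap);
    n = vsnprintf(dummy, sizeof dummy, fmt, copy);
    va_end(copy);
    return (n < 0) ? -1 : n + 1;
}

/* Formats into freshly allocated memory; the caller frees the result.
 * Returns NULL on a formatting or allocation failure. */
static char *sketch_format_alloc(const char *fmt, ...)
{
    va_list ap;
    int need;
    char *buf = NULL;

    assert(fmt != NULL);
    va_start(ap, fmt);
    need = sketch_measure(fmt, ap);
    if (need > 0) {
        buf = malloc((size_t) need);
        if (buf != NULL) {
            vsnprintf(buf, (size_t) need, fmt, ap);
        }
    }
    va_end(ap);
    return buf;
}
```

Letting the C library parse the format string avoids maintaining a hand-written list of conversion specifiers, which is where a missing "%u" case can creep in.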

Re: [OMPI devel] OMPI devel] [1.8.4rc2] orterun SEGVs on Solaris-10/SPARC

2014-12-12 Thread Paul Hargrove
ve the problem. We can then address > the broader question (e.g., do we even need this stuff any more at all?) in > a more leisurely way. > > > On Dec 12, 2014, at 5:42 PM, Larry Baker wrote: > > On 12 Dec 2014, at 5:22 PM, Paul Hargrove wrote: > > HOWEVER, while the patch

Re: [OMPI devel] [1.8.4rc3] REGRESSION: connection problem on (multi-homed) Solaris host

2014-12-12 Thread Paul Hargrove
n Fri, Dec 12, 2014 at 5:17 PM, Ralph Castain wrote: > No need for autogen - simple change to a couple of files > > > > On Dec 12, 2014, at 4:38 PM, Paul Hargrove wrote: > > Ralph, > > Patches to *code* are fine, but I am not equipped to autogen. > > -Paul > > On

[OMPI devel] [1.8.4rc3+patches] Solaris status summary

2014-12-12 Thread Paul Hargrove
It appears that with Ralph's oob_tcp patches (paul.diff) everything is now OK on Solaris-11/x86-64. On Solaris-10/SPARC I needed to fix guess_strlen() (or change "%u" to "%d" to avoid the issue) or else I didn't get very far at all (SEGV in orterun). However, with that issue resolved things are st

Re: [OMPI devel] 1.8.4rc4 now out for testing

2014-12-15 Thread Paul Hargrove
My testing on 1.8.4rc4 is not quite done, but is getting close. With two exceptions, so far all looks good to me on almost 60 different platforms. I've retested on my Solaris systems and saw none of the issues I had with rc3. The x86-64/Linux system with mtl:psm is no longer giving a SEGV at exit.

Re: [OMPI devel] 1.8.4rc4 now out for testing

2014-12-15 Thread Paul Hargrove
On Sun, Dec 14, 2014 at 10:52 PM, Paul Hargrove wrote: > > Solaris-10/SPARC and "--enable-static --disable-shared" appears broken for > C++ apps (but OK for C). > I will report in more details when I have more information. > First the good news: The problem I was exper

Re: [OMPI devel] 1.8.4rc Status

2014-12-15 Thread Paul Hargrove
On Mon, Dec 15, 2014 at 5:35 AM, Ralph Castain wrote: > > 7. Linkage issue on Solaris-11 reported by Paul Hargrove. Missing the > multi-threaded C libraries, apparently need "-mt=yes" in both compile and > link. Need someone to investigate. The lack of multi-thread libra

Re: [OMPI devel] 1.8.4rc Status

2014-12-15 Thread Paul Hargrove
nd. -Paul On Mon, Dec 15, 2014 at 12:52 PM, Paul Hargrove wrote: > > > On Mon, Dec 15, 2014 at 5:35 AM, Ralph Castain wrote: >> >> 7. Linkage issue on Solaris-11 reported by Paul Hargrove. Missing the >> multi-threaded C libraries, apparently need "-mt=yes" i

Re: [OMPI devel] 1.8.4rc Status

2014-12-15 Thread Paul Hargrove
. I am getting less certain that my speculation about thread-safe libs is correct. -Paul On Mon, Dec 15, 2014 at 1:24 PM, Paul Hargrove wrote: > > A little more reading finds that... > > Docs say that one needs "-mt"

Re: [OMPI devel] 1.8.4rc Status

2014-12-15 Thread Paul Hargrove
the CLOSE_THE_SOCKET macro resets errno, and hence the confusing > error message > e.g. failed: Error 0 (0) > > FWIW, master is also affected. > > Cheers, > > Gilles > > > On 2014/12/16 10:47, Paul Hargrove wrote: > > I have tried with a oob_tcp_if_include setting so
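The "Error 0 (0)" symptom described here is the usual result of reading errno after a cleanup step has already overwritten it. A minimal sketch of the standard remedy, saving errno before cleanup, follows. sketch_cleanup() is a hypothetical stand-in that simply clears errno; it is not the real CLOSE_THE_SOCKET macro, whose body is not shown in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical stand-in for a cleanup step that clobbers errno. */
static void sketch_cleanup(void)
{
    errno = 0;
}

/* Reads from an invalid descriptor so the failure is deterministic, then
 * returns the error code that was captured before cleanup ran. */
static int sketch_failed_read_errno(void)
{
    char byte;
    int saved = 0;

    if (read(-1, &byte, 1) < 0) {
        saved = errno;        /* capture the cause first */
        assert(saved != 0);   /* a failed read() sets errno */
        sketch_cleanup();     /* errno may be 0 from here on */
    }
    return saved;             /* report this, not errno */
}
```

Reporting the saved value keeps the original cause in the message even when the cleanup path makes further system calls.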

Re: [OMPI devel] 1.8.4rc Status

2014-12-15 Thread Paul Hargrove
is 11 (at least with gcc compilers) do not > need any flags > (except the -D_REENTRANT that is added automatically) > > Cheers, > > Gilles > > > On 2014/12/16 12:10, Paul Hargrove wrote: > > Gilles, > > I will try the patch when I can. > However, our network is u

Re: [OMPI devel] 1.8.4rc Status

2014-12-16 Thread Paul Hargrove
sabled and try again. Use of "-mca oob_tcp_if_include bge0" to use a single interface did not fix this. -Paul On Mon, Dec 15, 2014 at 7:18 PM, Paul Hargrove wrote: > > Gilles, > > I am NOT seeing the problem with gcc. > It is only occurring with the S

Re: [OMPI devel] 1.8.4rc Status

2014-12-16 Thread Paul Hargrove
des) > > Cheers, > > Gilles > > > On 2014/12/16 16:00, Paul Hargrove wrote: > > Gilles, > > I looked again carefully and I am *NOT* finding -D_REENTRANT passed to most > compilations. > It appears to be used for building libevent and vt, but nothing else.

Re: [OMPI devel] OMPI devel] 1.8.4rc Status

2014-12-16 Thread Paul Hargrove
ts pcp-j-20 > says ? > > BTW, did you try without -m64 ? > > Does the following work > ping/ssh 172.18.0.120 > > Honestly, this output makes very little sense to me, so i am asking way > too much info hoping i can reproduce this issue or get a hint on what can > possibly

Re: [OMPI devel] OMPI devel] 1.8.4rc Status

2014-12-16 Thread Paul Hargrove
S ? (or is -D_REENTRANT enough ?) > LDFLAGS ? (that might be solaris and/or solarisstudio (12.4) specific and > i simply ignore it) > > Bottom line, i do invite you to test 1.8.4rc4 again and with > CFLAGS="-mt" > or > CFLAGS="-mt -m64" > if you previ

Re: [OMPI devel] OMPI devel] 1.8.4rc Status

2014-12-16 Thread Paul Hargrove
My 1.8.3 build has not completed. HOWEVER, I can already see a key difference in the configure step. In 1.8.3 "-mt" was added AUTOMATICALLY to CFLAGS by configure: checking if C compiler and POSIX threads work as is... no - Solaris, not checked checking if C++ compiler and POSIX threads work as i

Re: [OMPI devel] OMPI devel] 1.8.4rc Status

2014-12-17 Thread Paul Hargrove
The resulting run worked! So, I very strongly suspect that the problem will be resolved if one restores the configure logic that my previous email shows has vanished (since that would restore "-mt" to CFLAGS and wrapper cflags). -Paul On Tue, Dec 16, 2014 at 8:10 PM, Paul Hargrove wrote:

Re: [OMPI devel] OMPI devel] 1.8.4rc Status

2014-12-17 Thread Paul Hargrove
to the tarball. > > Ralph > > > On Tue, Dec 16, 2014 at 9:59 PM, Paul Hargrove wrote: > >> Gilles, >> >> The 1.8.3 test works where the 1.8.4rc4 one fails with identical >> configure arguments. >> >> While it may be overkill, I conf

Re: [OMPI devel] OMPI devel] 1.8.4rc Status

2014-12-17 Thread Paul Hargrove
W. i was unable to reproduce the problem on solaris 11 with sunstudio > 12.4 even if i do not use -D_REENTRANT *nor* -mt (!) > > Cheers, > > Gilles > > > On 2014/12/17 15:01, Ralph Castain wrote: > > Hi Paul > > Can you try the attached patch? It would require run

Re: [OMPI devel] OMPI devel] 1.8.4rc Status

2014-12-17 Thread Paul Hargrove
> it is worth giving it a try (to be 100.0% sure ...) > > can you please do that tomorrow ? > > in the meantime, if we (well Ralph indeed) want to release 1.8.4, then > simply restore > the two config files i mentioned. > > Cheers, > > Gilles > > > On 201

Re: [OMPI devel] OMPI devel] 1.8.4rc Status

2014-12-17 Thread Paul Hargrove
it that > sets the -D_REENTRANT CFLAGS on solaris/solarisstudio > > https://github.com/open-mpi/ompi-release/commit/ac8b84ce674b958dbf8c9481b300beeef0548b83 > > Cheers, > > Gilles > > > On 2014/12/17 15:56, Paul Hargrove wrote: > > I've queued 3 tests: > >

Re: [OMPI devel] OMPI devel] 1.8.4rc Status

2014-12-17 Thread Paul Hargrove
Results of tests described below: 1) SEGV in hwloc - will report later 2) PASS 3) PASS So, either -D_REENTRANT or -mt works for me IF added to both the CFLAGS and wrapper-cflags. -Paul On Tue, Dec 16, 2014 at 10:56 PM, Paul Hargrove wrote: > > I've queued 3 tests: > > 1) o

[OMPI devel] Solaris/x86-64 SEGV with 1.8-latest

2014-12-17 Thread Paul Hargrove
I tried last night's v1.8 tarball (openmpi-v1.8.3-272-g4e4f997.tar.bz2) with the Studio Compilers (v12.3) on a Solaris/x86-64 system. Configure args (other than prefix) were: --enable-debug --with-verbs \ CC=cc CXX=CC FC=f90 \ CFLAGS=-m64 --with-wrapper-cflags=-m64 \ FCFLAGS=-m64 --with-wrapper-fcf

Re: [OMPI devel] Solaris/x86-64 SEGV with 1.8-latest

2014-12-17 Thread Paul Hargrove
Wednesday, December 17, 2014 3:53 PM > *To:* de...@open-mpi.org > *Subject:* Re: [OMPI devel] Solaris/x86-64 SEGV with 1.8-latest > > > > Le 17/12/2014 21:43, Paul Hargrove a écrit : > > > > Dbx gives me > > t@1 (l@1) terminated by signal SEGV (no mapping

Re: [OMPI devel] 1.8.4rc Status

2014-12-17 Thread Paul Hargrove
.8 tree, and is in the latest > nightly tarball. > > If I'm following this thread right -- and I might not be! -- I think > Gilles is saying that now that the __sun check is in, it should fix this > -mt/-D_REENTRANT/whatever problem. > > Can you confirm? > > > On Dec 16,

Re: [OMPI devel] 1.8.4rc5 out

2014-12-18 Thread Paul Hargrove
Tests queued on 61 distinct configurations... will share results when I've got them. -Paul On Wed, Dec 17, 2014 at 9:15 PM, Ralph Castain wrote: > > Hi folks > > Trying to bring this to closure, so hopefully this is the last one. Please > give it a smoke test: > > http://www.open-mpi.org/softwar

Re: [OMPI devel] 1.8.4rc Status

2014-12-18 Thread Paul Hargrove
On Wed, Dec 17, 2014 at 7:17 PM, Paul Hargrove wrote: > > I am going to run the nightly on other configs on both my > Solaris-11/x86-64 and Solaris-10/SPARC systems. > I just want to be sure some other compile/abi/arch combination didn't get > broken by accident. > I will

[OMPI devel] [1.8.4rc5] preliminary results

2014-12-18 Thread Paul Hargrove
With results from about 50 out of 61 platforms: + KNOWN: SGI UV is still "broken-by-default" (fails compiling vader unless configured with --without-xpmem) + NEW: I see Fortran bindings failing to compile w/ gfortran + NEW: I see Fortran bindings fail to link with Open64 I also have unexplained e

Re: [OMPI devel] [1.8.4rc5] preliminary results

2014-12-18 Thread Paul Hargrove
argrove/GSCRATCH/OMPI/openmpi-1.8.4rc5-linux-x86_64-gcc-atomics/openmpi-1.8.4rc5/ompi/mpi/fortran/mpif-h/sizeof-mpif08-pre-1.8.4_f.F90:104 [...about 180 more lines of similar output...] On Thu, Dec 18, 2014 at 9:30 AM, Jeff Squyres (jsquyres) wrote: > > On Dec 18, 2014, at 11:55 AM, Paul Ha

Re: [OMPI devel] [1.8.4rc5] preliminary results

2014-12-18 Thread Paul Hargrove
On Thu, Dec 18, 2014 at 8:55 AM, Paul Hargrove wrote: > > I also have unexplained errors on my Solaris-10/SPARC system. > It looks like there may have been a loss of network connectivity during > the tests. > I need to check these deeper, but I expect them to pass when I get a >

Re: [OMPI devel] [1.8.4rc5] preliminary results

2014-12-18 Thread Paul Hargrove
"deficient" fortran support. If there is a desire/need to follow up on this, let me know. However, all those "deficient" fortan compilers have been reported by me on this list at least once in testing prior releases (just never in one place). -Paul On Thu, Dec 18, 2014 at

Re: [OMPI devel] [1.8.4rc5] preliminary results

2014-12-19 Thread Paul Hargrove
iler has detected errors in module > "MPI_F08_SIZEOF". No module information file will be created for this > module. > > if (present(ierror)) ierror = 0 > ^ > "../../../../../../src/openmpi-1.8.4rc5/ompi/mpi/fortran/mpif-h/sizeof-mpif08-pre-1.8.4_f.F90", &

Re: [OMPI devel] [1.8.4rc5] preliminary results

2014-12-19 Thread Paul Hargrove
On Thu, Dec 18, 2014 at 5:50 PM, Paul Hargrove wrote: > > Unless something turns up on the MIPS systems my "smoke test" of rc5 is > complete. In case anybody was holding their breath: The MIPS testers completed just fine. -Paul -- Paul H. Hargrove

Re: [OMPI devel] [1.8.4rc5] preliminary results

2014-12-19 Thread Paul Hargrove
ghtly > tarball for you. > > http://www.open-mpi.org/nightly/v1.8/ > > Could you test it in the 2 cases where you had fortran failures? > > > > On Dec 18, 2014, at 8:50 PM, Paul Hargrove wrote: > > > Update: > > > > I now have 59 of 61 results, with

Re: [OMPI devel] [1.8.4rc5] preliminary results

2014-12-19 Thread Paul Hargrove
my "thumbs up" with respect to "Fortran Sadness". -Paul On Fri, Dec 19, 2014 at 12:51 PM, Paul Hargrove wrote: > Jeff, > > Less typing to launch 50+ testers than pick out just those two. > Starting them now... > > -Paul > > On Fri, Dec 19, 2014 at

Re: [OMPI devel] [Open MPI Announce] Open MPI 1.8.4 released

2014-12-20 Thread Paul Hargrove
Sorry to rain on the parade, but SGI UV is still broken by default. I reported this as present in 1.8.4rc5 and Nathan had claimed to be working on it. A reminder that all it takes is a 1-line change in ompi/mca/btl/vader/configure.m4 to not search for sn/xpmem.h -Paul On Fri, Dec 19, 2014 at 7:2

Re: [OMPI devel] RFC: Remove embedded libltdl

2015-01-29 Thread Paul Hargrove
Jeff, If I understand correctly, one is (or will be soon) expected to have libtool-dev(el) installed on the build system, even if one is not an OMPI developer. How does this plan to cease embedding libltdl align with the fact that autogen.pl currently applies patches to the parts of the generated configure f

Re: [OMPI devel] RFC: Remove embedded libltdl

2015-01-30 Thread Paul Hargrove
enough autotools to autogen on this old system then I wouldn't have asked about libltdl from libtool-1.4. So, please *do* generate a tarball and I will test (on *all* of my systems). -Paul On Fri, Jan 30, 2015 at 3:49 AM, Jeff Squyres (jsquyres) wrote: > On Jan 29, 2015, at 9:11 PM, Paul H

Re: [OMPI devel] RFC: Remove embedded libltdl

2015-01-30 Thread Paul Hargrove
i, Jan 30, 2015 at 1:29 PM, Jeff Squyres (jsquyres) wrote: > On Jan 30, 2015, at 2:46 PM, Paul Hargrove wrote: > > > > If I had new enough autotools to autogen on this old system then I > wouldn't have asked about libltdl from libtool-1.4. So, please *do* > generate a tarball

Re: [OMPI devel] RFC: Remove embedded libltdl

2015-01-30 Thread Paul Hargrove
featuring 100% fewer "make check" > failures. > > http://www.open-mpi.org/~jsquyres/unofficial/ > > > > On Jan 30, 2015, at 5:14 PM, Jeff Squyres (jsquyres) > wrote: > > > > Shame on me for not running "make check". > > > > Fixing... >

Re: [OMPI devel] RFC: Remove embedded libltdl

2015-01-30 Thread Paul Hargrove
at 5:14 PM, Jeff Squyres (jsquyres) > wrote: > > > > Shame on me for not running "make check". > > > > Fixing... > > > > > >> On Jan 30, 2015, at 4:58 PM, Paul Hargrove wrote: > >> > >> Jeff, > >> > >> I ra

[OMPI devel] confusing output when no c++ compiler

2015-02-02 Thread Paul Hargrove
The output below occurred while testing Jeff's no-embedded-libltdl tarball, but I am assuming it is quite likely the same is true on the trunk. The "issue" is that I am told by configure that "C and C++ compilers are not link compatible". However, it appears I just don't have a C++ compiler at all!! I am

Re: [OMPI devel] RFC: Remove embedded libltdl

2015-02-02 Thread Paul Hargrove
the hopper system at Nersc. > > Do you have any Cray insight here? (see below for the exact issue) > > > > On Feb 1, 2015, at 3:52 AM, Paul Hargrove wrote: > > > > Jeff (off-list), > > > > Original make was with V=1, so I skipped the "make clean"

Re: [OMPI devel] RFC: Remove embedded libltdl

2015-02-02 Thread Paul Hargrove
t 4:44 AM, Jeff Squyres (jsquyres) wrote: > Looks like the lt_interface.c code didn't properly use the lt_dladvise > #if. How did that ever work, I wonder? > > Fixed now. On to your second finding... > > > > On Jan 30, 2015, at 7:42 PM, Paul Hargrove wrote: > > >

Re: [OMPI devel] RFC: Remove embedded libltdl

2015-02-02 Thread Paul Hargrove
On Mon, Feb 2, 2015 at 1:58 PM, Paul Hargrove wrote: > 2b. I am retrying now with all of Cray's environment modules unloaded > except the one for the PGI compiler. Nathan had suggested something like > this to me in the past, but I've never had issues with the default >

Re: [OMPI devel] RFC: Remove embedded libltdl

2015-02-02 Thread Paul Hargrove
had fixed it in my local tree but not yet pushed to my github branch; I > was waiting to see what happened w.r.t. your failure on the NERSC machine. > > I pushed the fix up to my branch now; do you want a new tarball? > > > > On Feb 2, 2015, at 5:56 PM, Paul Hargrove wrote: >

Re: [OMPI devel] RFC: Remove embedded libltdl

2015-02-02 Thread Paul Hargrove
res (jsquyres) wrote: > Paul -- > > If you've got the cycles and it's easy, release the hounds on the tarball > that I just uploaded to: > > http://www.open-mpi.org/~jsquyres/unofficial/ > > Thanks! > > > > On Feb 2, 2015, at 7:19 PM, Paul Hargrove

Re: [OMPI devel] RFC: Remove embedded libltdl

2015-02-02 Thread Paul Hargrove
On Mon, Feb 2, 2015 at 4:13 PM, Paul Hargrove wrote: > HOWEVER - switching from PGI to GNU compilers made the problem go away. > So, I suspect it may be an issue with the installation/configuration of > the PGI compilers. > I've reproduced the problem on a non-Cray system wi

[OMPI devel] Build failure on OpenBSD (deja vu)

2015-02-02 Thread Paul Hargrove
The following comes from testing Jeff's no-embedded-libltdl work, but I suspect the same is true on tru^H^H^Hmaster. The output below, from "make V=1" shows a link failure from trying to use arc4random_addrandom(), which was removed on OpenBSD in late 2013. The part that bugs me is that I thought

Re: [OMPI devel] RFC: Remove embedded libltdl

2015-02-02 Thread Paul Hargrove
On Mon, Feb 2, 2015 at 5:22 PM, Paul Hargrove wrote: > So, the overhead for me is pretty small as long as the number of failures > is kept low. I jinxed it!!! I have, I believe, about 7 different failures now on various systems. All of those appear UNRELATED to the libltdl changes.

[OMPI devel] When libltdl is not your friend

2015-02-02 Thread Paul Hargrove
Below is one example of what happens when you assume that you can trust the libltdl installed at an otherwise very well maintained national center. I think this is another "vote" for continuing to embed (a working) libltdl. -Paul $ mpirun -mca btl sm,self -np 2 examples/ring_c' libibverbs: Warning:

Re: [OMPI devel] When libltdl is not your friend

2015-02-02 Thread Paul Hargrove
Howard, This was seen on NERSC's Carver. -Paul On Mon, Feb 2, 2015 at 6:49 PM, Howard Pritchard wrote: > Hi Paul, > > Thanks for checking in depth into this. Just to help in determining how > to proceed, which national center is this? > > Howard > > > 2015-02-

[OMPI devel] Master build failure on Mac OS 10.8 with --enable-static/--disable-shared

2015-02-02 Thread Paul Hargrove
I have a Mac OSX 10.8 system, where cc is clang. I have no problems with a default build from the current master tarball. However, a static-only build leads to a link failure on opal_wrapper. Configured with --prefix=... --enable-debug CC=cc CXX=c++ --enable-static --disable-shared Failing port

[OMPI devel] Master failure building oshmem java examples

2015-02-02 Thread Paul Hargrove
On a system on which 1.8.4rc5 passed all my tests, I see the following running "make" in the examples directory: [...] make[2]: Leaving directory `/brashear/hargrove/OMPI/openmpi-master-linux-x86_64-java/BLD/examples' make[2]: Entering directory `/brashear/hargrove/OMPI/openmpi-master-linux-x86_64
