Re: [OMPI devel] openmpi-1.8.2rc2 and f08 interface built with PGI-14.7 causes link error

2014-07-30 Thread Gilles Gouaillardet
Paul, from the logs, the only difference i see is about Fortran PROCEDURE. openmpi 1.8 (svn checkout) does not build the usempif08 bindings if PROCEDURE is not supported. from the logs, openmpi 1.8.1 does not check whether PROCEDURE is supported or not. here is the sample program to check PROCED

Re: [OMPI devel] openmpi-1.8.2rc2 and f08 interface built with PGI-14.7 causes link error

2014-07-30 Thread Paul Hargrove
Gilles, If you look more carefully at the output I provided you will see that 1.8.1 *does* test for PROCEDURE support and finds it lacking. BOTH outputs include: checking if Fortran compiler supports PROCEDURE... no However in the 1.8.1 case that is apparently not sufficient to disqualify buildi

Re: [OMPI devel] openmpi-1.8.2rc2 and f08 interface built with PGI-14.7 causes link error

2014-07-30 Thread Paul Hargrove
On Tue, Jul 29, 2014 at 9:09 PM, Gilles Gouaillardet < gilles.gouaillar...@iferc.org> wrote: > openmpi 1.8 (svn checkout) does not build the usempif08 bindings if > PROCEDURE is not supported. > I have just verified that this requirement for PROCEDURE support is a change in behavior between 1.8.

Re: [OMPI devel] openmpi-1.8.2rc2 and f08 interface built with PGI-14.7 causes link error

2014-07-30 Thread Gilles Gouaillardet
Paul, i am sorry i missed that. and you are right, 1.8.1 and 1.8 from svn differ : from svn (config/ompi_setup_mpi_fortran.m4) # Per https://svn.open-mpi.org/trac/ompi/ticket/4590, if the # Fortran compiler doesn't support PROCEDURE in the way we # want/need, disable the mpi_f08 mod

Re: [OMPI devel] openmpi-1.8.2rc2 and f08 interface built with PGI-14.7 causes link error

2014-07-30 Thread Paul Hargrove
On a related topic: I configured with an explicit --enable-mpi-fortran=usempif08. Then configure found PROCEDURE was missing/broken. The result is that the build continued, but without the requested f08 support. If the user has explicitly enabled a given level of Fortran support, but it cannot be

Re: [OMPI devel] trunk compilation errors in jenkins

2014-07-30 Thread Gilles Gouaillardet
George, #4815 is indirectly related to the move : in bcol/basesmuma, we used to compare ompi_process_name_t, and now we (try to) compare an ompi_process_name_t and an opal_process_name_t (which causes a glorious SIGSEGV). i proposed a temporary patch which is both broken and inelegant, could you ple

Re: [OMPI devel] openmpi-1.8.2rc2 and f08 interface built with PGI-14.7 causes link error

2014-07-30 Thread tmishima
Hi Jeff, Sorry for the poor information and late reply. Today, I attended a very very long meeting ... Anyway, I attached compile-output and configure-log. (due to the file size limitation, I am sending them in two parts) I hope you can find the problem. (See attached file: openmpi-1.8-pgi14.7.tar.gz) Regar

Re: [OMPI devel] openmpi-1.8.2rc2 and f08 interface built with PGI-14.7 causes link error

2014-07-30 Thread tmishima
This is another one. (See attached file: openmpi-1.8.2rc2-pgi14.7.tar.gz) Tetsuya > Tetsuya -- > > I am unable to test with the PGI compiler -- I don't have a license. I was hoping that LANL would be able to test today, but I don't think they got to it. > > Can you send more details? > > E.g.

Re: [OMPI devel] openmpi-1.8.2rc2 and f08 interface built with PGI-14.7 causes link error

2014-07-30 Thread tmishima
Hi Paul, thank you for your comment. I don't think my mpi_f08.mod is an older one, because the time stamp is equal to the time when I rebuilt them today. [mishima@manage openmpi-1.8.2rc2-pgi14.7]$ ll lib/mpi* -rwxr-xr-x 1 mishima mishima 315 Jul 30 12:27 lib/mpi_ext.mod -rwxr-xr-x 1 mishima mis

Re: [OMPI devel] openmpi-1.8.2rc2 and f08 interface built with PGI-14.7 causes link error

2014-07-30 Thread Gilles Gouaillardet
Paul, this is a fair point. i committed r32354 in order to abort configure in this case. Cheers, Gilles On 2014/07/30 15:11, Paul Hargrove wrote: > On a related topic: > > I configured with an explicit --enable-mpi-fortran=usempif08. > Then configure found PROCEDURE was missing/broken. > The res

Re: [OMPI devel] trunk compilation errors in jenkins

2014-07-30 Thread Rolf vandeVaart
Just an FYI that my trunk version (r32355) does not work at all anymore if I do not include "--mca coll ^ml". Here is a stack trace from the ibm/pt2pt/send test running on a single node. (gdb) where #0 0x7f6c0d1321d0 in ?? () #1 #2 0x7f6c183abd52 in orte_util_compare_name_fie

Re: [OMPI devel] trunk compilation errors in jenkins

2014-07-30 Thread Gilles GOUAILLARDET
Rolf, r32353 can be seen as a suspect... Even if it is correct, it might have exposed the bug discussed in #4815 even more (e.g. we hit the bug 100% after the fix). does the attached patch to #4815 fix the problem ? If yes, and if you see this issue as a showstopper, feel free to commit it and

Re: [OMPI devel] trunk compilation errors in jenkins

2014-07-30 Thread Ralph Castain
I just fixed this one - all that was required was an ampersand as the name was being passed into the function instead of a pointer to the name r32357 On Jul 30, 2014, at 7:43 AM, Gilles GOUAILLARDET wrote: > Rolf, > > r32353 can be seen as a suspect... > Even if it is correct, it might have
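A minimal C sketch of the class of bug described in this message, where a name is passed by value to a function that expects a pointer to it. The type demo_name_t and the functions demo_compare_names and demo_same_process are hypothetical stand-ins for illustration only; they are not the actual Open MPI types or the code changed in r32357.

```c
#include <stdint.h>

/* Hypothetical 64-bit process identifier (a stand-in, not the real OPAL type). */
typedef uint64_t demo_name_t;

/* The callee expects POINTERS to the two names it compares. */
static int demo_compare_names(const demo_name_t *a, const demo_name_t *b)
{
    if (*a < *b) return -1;
    if (*a > *b) return 1;
    return 0;
}

/* Correct call site: take the address of each name with '&'. */
static int demo_same_process(demo_name_t x, demo_name_t y)
{
    /* Passing x instead of &x would make the callee treat the integer
     * value as an address; dereferencing it typically crashes. */
    return demo_compare_names(&x, &y) == 0;
}
```

Because the identifier here is an integer type, a compiler may accept the mistaken call with only a conversion warning, which is one plausible way such a bug reaches runtime.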

Re: [OMPI devel] openmpi-1.8.2rc2 and f08 interface built with PGI-14.7 causes link error

2014-07-30 Thread Ralph Castain
Umm... this really broke things now. I can't build the fortran bindings at all, and I don't have a PGI compiler. I also didn't specify a level of Fortran support, but just had --enable-mpi-fortran. Maybe we need to revert this commit until we figure out a better solution? On Jul 30, 2014, at 12:

Re: [OMPI devel] MPI_T SEGV on DSO

2014-07-30 Thread Nathan Hjelm
This is odd. The variable in question is registered by the MCA itself. I will take a look and see if I can determine why it isn't being deregistered correctly when the rest of the component's parameters are. -Nathan On Wed, Jul 30, 2014 at 08:17:15AM +0900, KAWASHIMA Takahiro wrote: > Nathan, >

Re: [OMPI devel] openmpi-1.8.2rc2 and f08 interface built with PGI-14.7 causes link error

2014-07-30 Thread Gilles GOUAILLARDET
I will fix this tomorrow. Right now, --enable-mpi-fortran is the same as --enable-mpi-fortran=yes, which is the same as --enable-mpi-fortran=all, so configure aborts if not all bindings can be built. In ompi_configure_options.m4: OMPI_FORTRAN_USER_REQUESTED=0 case "x$enable_mpi_fortran" in x)

Re: [OMPI devel] trunk compilation errors in jenkins

2014-07-30 Thread Rolf vandeVaart
Thanks Ralph and Gilles! All is looking good for me again. I think all tests are passing again. Will check results again tomorrow. From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Ralph Castain Sent: Wednesday, July 30, 2014 10:49 AM To: Open MPI Developers Subject: Re: [OMPI devel

Re: [OMPI devel] MPI_T SEGV on DSO

2014-07-30 Thread KAWASHIMA Takahiro
Nathan, The difference seems to be the flags on registering. Normal MCA variables shmem_sysv_priority etc. have flag MCA_BASE_VAR_FLAG_DWG so that they are deregistered through mca_base_var_group_deregister in mca_base_component_unload. But shmem_sysv_major_version doesn't have the flag. Regard

Re: [OMPI devel] MPI_T SEGV on DSO

2014-07-30 Thread Nathan Hjelm
Yup, just noticed that. All component variables should be registered with mca_base_component_var_register but the versions were registered with the generic register function. The code in question is the oldest part of the MCA rewrite so it probably was missed when the component variable register f

Re: [OMPI devel] trunk compilation errors in jenkins

2014-07-30 Thread Gilles Gouaillardet
Ralph, was it really that simple ? proc_temp->super.proc_name has type opal_process_name_t : typedef opal_identifier_t opal_process_name_t; typedef uint64_t opal_identifier_t; *but* item_ptr->peer has type orte_process_name_t : struct orte_process_name_t { orte_jobid_t jobid; orte_vpid_t

Re: [OMPI devel] trunk compilation errors in jenkins

2014-07-30 Thread George Bosilca
Yes. opal_process_name_t has basically no meaning by itself; it is a 64-bit storage location used by the upper layer to save some local key that can be later used to extract information. Calling the OPAL level compare function might be a better fit there. George. On Wed, Jul 30, 2014 at 11:5

Re: [OMPI devel] trunk compilation errors in jenkins

2014-07-30 Thread Ralph Castain
Yeah, my fix won't work for big endian machines - this is going to be an issue across the code base now, so we'll have to troll and fix it. I was doing the minimal change required to fix the trunk in the meantime. On Jul 30, 2014, at 9:06 AM, George Bosilca wrote: > Yes. opal_process_name_t ha

Re: [OMPI devel] trunk compilation errors in jenkins

2014-07-30 Thread George Bosilca
No, this is not going to be an issue if the opal_identifier_t is used correctly (aka only via the exposed accessors). George. On Wed, Jul 30, 2014 at 12:09 PM, Ralph Castain wrote: > Yeah, my fix won't work for big endian machines - this is going to be an > issue across the code base now, s
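One way to read the "only via the exposed accessors" point is sketched below in C: if the two 32-bit halves of a 64-bit identifier are packed and extracted with shifts rather than by reinterpreting memory, the result does not depend on byte order. Everything here is an assumption made for illustration: the names demo_identifier_t, demo_make_id, demo_id_jobid and demo_id_vpid are hypothetical, and the layout (job id in the high 32 bits, process id in the low 32 bits) is not taken from the Open MPI source.

```c
#include <stdint.h>

/* Hypothetical opaque 64-bit identifier (not the real opal_identifier_t). */
typedef uint64_t demo_identifier_t;

/* Pack two 32-bit fields; ASSUMED layout: jobid high, vpid low. */
static demo_identifier_t demo_make_id(uint32_t jobid, uint32_t vpid)
{
    return ((uint64_t)jobid << 32) | (uint64_t)vpid;
}

/* Accessors use arithmetic on the value, not its in-memory bytes,
 * so they behave the same on little- and big-endian hosts. */
static uint32_t demo_id_jobid(demo_identifier_t id)
{
    return (uint32_t)(id >> 32);
}

static uint32_t demo_id_vpid(demo_identifier_t id)
{
    return (uint32_t)(id & 0xFFFFFFFFu);
}
```

By contrast, casting a pointer to a struct of two 32-bit fields into a pointer to a 64-bit integer yields a value whose halves swap roles between little- and big-endian machines, which is the kind of concern raised earlier in this thread.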

Re: [OMPI devel] trunk compilation errors in jenkins

2014-07-30 Thread Ralph Castain
George - my point was that we regularly tested using the method in that routine, and now we have to do something a little different. So it is an "issue" in that we have to make changes across the code base to ensure we do things the "new" way, that's all On Jul 30, 2014, at 9:17 AM, George Bosi

Re: [OMPI devel] trunk compilation errors in jenkins

2014-07-30 Thread George Bosilca
The underlying structure changed, so a little bit of fiddling is normal. Instead of using a field in the ompi_proc_t you are now using a field down in opal_proc_t, a field that simply cannot have the same type as before (orte_process_name_t). George. On Wed, Jul 30, 2014 at 12:19 PM, Ralph Ca

Re: [OMPI devel] RFC: add atomic compare-and-swap that returns old value

2014-07-30 Thread George Bosilca
Why do you want to add new versions? This will lead to having two, almost identical, sets of atomics that are conceptually equivalent but different in terms of code. And we will have to maintain both! I did a similar change in a fork of OPAL in another project but instead of adding another flavo

Re: [OMPI devel] RFC: add atomic compare-and-swap that returns old value

2014-07-30 Thread Nathan Hjelm
That is what I would prefer. I was trying to not disturb things too much :). Please bring the changes over! -Nathan On Wed, Jul 30, 2014 at 03:18:44PM -0400, George Bosilca wrote: >Why do you want to add new versions? This will lead to having two, almost >identical, sets of atomics that
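For readers unfamiliar with the idea in the RFC subject, here is a small sketch of a compare-and-swap that returns the old value instead of a success flag. It uses standard C11 <stdatomic.h> rather than OPAL's own atomics layer, and the function name demo_cas_fetch_32 is hypothetical; it is not the interface proposed in the RFC.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Compare-and-swap that returns the value observed at *addr.
 * If the returned value equals 'expected', the swap took place. */
static int32_t demo_cas_fetch_32(_Atomic int32_t *addr,
                                 int32_t expected, int32_t desired)
{
    /* On failure, C11 writes the current value of *addr into 'expected'.
     * On success, 'expected' already equals the old value.
     * Either way, 'expected' now holds what was seen at *addr. */
    atomic_compare_exchange_strong(addr, &expected, desired);
    return expected;
}
```

A caller recovers the usual boolean result by comparing the return value with the expected value it passed in, so the boolean form can be layered on top of this one.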

[OMPI devel] mca_PROJECT_FRAMEWORK_COMPONENT_symbol vs. mca_FRAMEWORK_COMPONENT_symbol

2014-07-30 Thread Dave Goodell (dgoodell)
Jeff and I were talking about some namespacing issues that have come up in the recent BTL move from OMPI to OPAL. AFAIK, the current system for namespacing external symbols is to name them "mca_FRAMEWORK_COMPONENT_symbol" (e.g., "mca_btl_tcp_add_procs" in the tcp BTL). Similarly, the DSO for t
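The two naming schemes in the subject line can be compared with a short C sketch that builds a symbol name with or without a project prefix. The helper demo_mca_symbol is hypothetical and exists only to show the string patterns; it is not part of the MCA code, and the "opal" prefix in the usage note is just an example value.

```c
#include <stdio.h>
#include <string.h>

/* Build "mca_FRAMEWORK_COMPONENT_symbol", or
 * "mca_PROJECT_FRAMEWORK_COMPONENT_symbol" when a project is given.
 * Returns 0 on success, -1 if the name does not fit in buf. */
static int demo_mca_symbol(char *buf, size_t len, const char *project,
                           const char *framework, const char *component,
                           const char *symbol)
{
    int n;

    if (project != NULL && strlen(project) > 0) {
        n = snprintf(buf, len, "mca_%s_%s_%s_%s",
                     project, framework, component, symbol);
    } else {
        n = snprintf(buf, len, "mca_%s_%s_%s",
                     framework, component, symbol);
    }
    /* snprintf returns the length it wanted to write; detect truncation. */
    return (n >= 0 && (size_t)n < len) ? 0 : -1;
}
```

With framework "btl", component "tcp" and symbol "add_procs", the unprefixed form gives mca_btl_tcp_add_procs (the example from the message), while passing "opal" as the project gives mca_opal_btl_tcp_add_procs.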

Re: [OMPI devel] mca_PROJECT_FRAMEWORK_COMPONENT_symbol vs. mca_FRAMEWORK_COMPONENT_symbol

2014-07-30 Thread Ralph Castain
We've run into the same problem with frameworks in different projects having overlapping names, let alone symbols. So if you have an easy solution, please go for it. What we need is for not only the symbols, but the mca libs to contain the project names so they don't overlap each other. On Jul

[OMPI devel] RFC: job size info in OPAL

2014-07-30 Thread Jeff Squyres (jsquyres)
WHAT: Should we make the job size (i.e., initial number of procs) available in OPAL? WHY: At least 2 BTLs are using this info (*more below) WHERE: usnic and ugni TIMEOUT: there's already been some inflammatory emails about this; let's discuss next Tuesday on the teleconf: Tue, 5 Aug 2014 MORE

Re: [OMPI devel] openmpi-1.8.2rc2 and f08 interface built with PGI-14.7 causes link error

2014-07-30 Thread Jeff Squyres (jsquyres)
On Jul 30, 2014, at 12:36 AM, Paul Hargrove wrote: > Unfortunately, this (and https://svn.open-mpi.org/trac/ompi/changeset/31588 > that followed) represent a REGRESSION in that between 1.8.1 and 1.8.2rc2 Open > MPI has lost support for F08 with the PGI compilers. Yes, and the answer is for PGI

Re: [OMPI devel] openmpi-1.8.2rc2 and f08 interface built with PGI-14.7 causes link error

2014-07-30 Thread Paul Hargrove
Jeff, I am not "screaming" for a return of support for the PGI compilers. I will also note that "use mpi" works fine; only the F2008 support is lacking. Rather than complain I am offering to help test any solution that might be offered. I will also note that Nathan and Howard both have accounts a

Re: [OMPI devel] openmpi-1.8.2rc2 and f08 interface built with PGI-14.7 causes link error

2014-07-30 Thread Paul Hargrove
Tetsuya, I found that the behavior of pgf90 changed somewhere between versions 13.6 and 14.1. My previous reports were mostly based on my testing of 13.6. So, I have probably been seeing an issue entirely different than yours. I am testing 14.4 now and hope to be able to reproduce the problem you

Re: [OMPI devel] openmpi-1.8.2rc2 and f08 interface built with PGI-14.7 causes link error

2014-07-30 Thread Jeff Squyres (jsquyres)
On Jul 28, 2014, at 11:43 PM, tmish...@jcity.maeda.co.jp wrote: > [mishima@manage work]$ mpif90 test.f -o test.ex > /tmp/pgfortran65ZcUeoncoqT.o: In function `.C1_283': > test.f:(.data+0x6c): undefined reference to `mpi_f08_interfaces_callbacks_' > test.f:(.data+0x74): undefined reference to `mpi_

Re: [OMPI devel] RFC: job size info in OPAL

2014-07-30 Thread George Bosilca
On Jul 30, 2014, at 18:00 , Jeff Squyres (jsquyres) wrote: > WHAT: Should we make the job size (i.e., initial number of procs) available > in OPAL? > > WHY: At least 2 BTLs are using this info (*more below) > > WHERE: usnic and ugni > > TIMEOUT: there's already been some inflammatory emails

Re: [OMPI devel] RFC: job size info in OPAL

2014-07-30 Thread Ralph Castain
On Jul 30, 2014, at 5:25 PM, George Bosilca wrote: > > On Jul 30, 2014, at 18:00 , Jeff Squyres (jsquyres) > wrote: > >> WHAT: Should we make the job size (i.e., initial number of procs) available >> in OPAL? >> >> WHY: At least 2 BTLs are using this info (*more below) >> >> WHERE: usnic

Re: [OMPI devel] mca_PROJECT_FRAMEWORK_COMPONENT_symbol vs. mca_FRAMEWORK_COMPONENT_symbol

2014-07-30 Thread George Bosilca
I can also picture an environment where different projects can supply components that would technically belong to a framework from another project. Let me take an example. Imagine we decide to keep the RML-based connection setup for SM, something that is not currently possible in the OPAL layer. In this

Re: [OMPI devel] RFC: job size info in OPAL

2014-07-30 Thread George Bosilca
On Jul 30, 2014, at 20:37 , Ralph Castain wrote: > > On Jul 30, 2014, at 5:25 PM, George Bosilca wrote: > >> >> On Jul 30, 2014, at 18:00 , Jeff Squyres (jsquyres) >> wrote: >> >>> WHAT: Should we make the job size (i.e., initial number of procs) available >>> in OPAL? >>> >>> WHY: At l

Re: [OMPI devel] openmpi-1.8.2rc2 and f08 interface built with PGI-14.7 causes link error

2014-07-30 Thread tmishima
Paul and Jeff, I additionally installed PGI14.4 and checked the behavior. Then, I confirmed that both versions produce the same results. PGI14.7: [mishima@manage work]$ mpif90 test.f -o test.ex --showme pgfortran test.f -o test.ex -I/home/mishima/opt/mpi/openmpi-1.8.2rc2-pgi14.7/include -I/home/mishim

Re: [OMPI devel] openmpi-1.8.2rc2 and f08 interface built with PGI-14.7 causes link error

2014-07-30 Thread Paul Hargrove
Jeff, I can now reproduce Tetsuya's original problem, using a build of 1.8.2rc2 with PGI 14.4. $ INST/bin/mpifort ../test.f /scratch/scratchdirs/hargrove/pgf90pdegT3bhBmEq.o: In function `.C1_283': test.f:(.data+0x6c): undefined reference to `mpi_f08_interfaces_callbacks_' test.f:(.data+0x74): u

Re: [OMPI devel] openmpi-1.8.2rc2 and f08 interface built with PGI-14.7 causes link error

2014-07-30 Thread Paul Hargrove
On Wed, Jul 30, 2014 at 6:15 PM, wrote: [...] > Strange thing is that openmpi-1.8 with PGI14.7 works fine. > What's the difference with openmpi-1.8 and openmpi-1.8.2rc2? > [...] Tetsuya, Now that I can reproduce the problem you have reported, I am building 1.8.1 with PGI14.4. Then I may be able

Re: [OMPI devel] RFC: job size info in OPAL

2014-07-30 Thread Ralph Castain
On Jul 30, 2014, at 5:49 PM, George Bosilca wrote: > > On Jul 30, 2014, at 20:37 , Ralph Castain wrote: > >> >> On Jul 30, 2014, at 5:25 PM, George Bosilca wrote: >> >>> >>> On Jul 30, 2014, at 18:00 , Jeff Squyres (jsquyres) >>> wrote: >>> WHAT: Should we make the job size (i.e.,

Re: [OMPI devel] openmpi-1.8.2rc2 and f08 interface built with PGI-14.7 causes link error

2014-07-30 Thread Paul Hargrove
On Wed, Jul 30, 2014 at 6:20 PM, Paul Hargrove wrote: > > On Wed, Jul 30, 2014 at 6:15 PM, wrote: > [...] > >> Strange thing is that openmpi-1.8 with PGI14.7 works fine. >> What's the difference with openmpi-1.8 and openmpi-1.8.2rc2? >> > [...] > > Tetsuya, > > Now that I can reproduce the probl