Re: [OMPI devel] orte-dvm / orte-submit race condition
Okay, please try the attached patch. It will cause two messages to be output for each job: one indicating the job has been marked terminated, and the other reporting that the completion message was sent to the requestor. Let's see what that tells us.

Thanks
Ralph

On Wed, Oct 14, 2015 at 3:44 PM, Mark Santcroos wrote:
> Hi Ralph,
>
> > On 15 Oct 2015, at 0:26 , Ralph Castain wrote:
> > Okay, so each orte-submit is reporting job has launched, which means the hang is coming while waiting to hear the job completed. Are you sure that orte-dvm believes the job has completed?
>
> No, I'm not.
>
> > In other words, when you say that you observe the job as completing, are you basing that on some output from orte-dvm, or because the procs have exited, or...?
>
> ... because the tasks have created their output.
>
> > I can send you a patch tonight that would cause orte-dvm to emit a "job completed" message when it determines each job has terminated - might help us take the next step.
>
> Great.
>
> > I'm wondering if orte-dvm thinks the job is still running, and the race condition is in that area (as opposed to being in orte-submit itself)
>
> Do some counts from the output of orte-dvm provide some hints?
>
> $ grep "Releasing job data.*INVALID" dvm_output.txt |wc -l
> 42
>
> $ grep "ORTE_DAEMON_SPAWN_JOB_CMD" dvm_output.txt |wc -l
> 42
>
> $ grep "ORTE_DAEMON_ADD_LOCAL_PROCS" dvm_output.txt |wc -l
> 42
>
> $ grep "sess_dir_finalize" dvm_output.txt |wc -l
> 35
>
> In other words, the "[netbook:] sess_dir_finalize: proc session dir does not exist" message doesn't show up for the hanging ones, which could support your question that the orte-dvm is at fault.
>
> Gr,
>
> Mark
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2015/10/18171.php

diff --git a/orte/mca/state/dvm/state_dvm.c b/orte/mca/state/dvm/state_dvm.c
index 0e7309c..5b1a841 100644
--- a/orte/mca/state/dvm/state_dvm.c
+++ b/orte/mca/state/dvm/state_dvm.c
@@ -267,6 +267,7 @@ void check_complete(int fd, short args, void *cbdata)
         if (jdata->state < ORTE_JOB_STATE_UNTERMINATED) {
             jdata->state = ORTE_JOB_STATE_TERMINATED;
         }
+        opal_output(0, "%s JOB %s HAS TERMINATED", ORTE_NAME_PRINT(ORTE_PROC_MY_NAME), ORTE_JOBID_PRINT(jdata->jobid));
     }

     /* tell the IOF that the job is complete */
diff --git a/orte/tools/orte-dvm/orte-dvm.c b/orte/tools/orte-dvm/orte-dvm.c
index 3cdf585..003f93a 100644
--- a/orte/tools/orte-dvm/orte-dvm.c
+++ b/orte/tools/orte-dvm/orte-dvm.c
@@ -442,18 +442,6 @@ int main(int argc, char *argv[])
     exit(orte_exit_status);
 }

-static void send_callback(int status, orte_process_name_t *peer,
-                          opal_buffer_t* buffer, orte_rml_tag_t tag,
-                          void* cbdata)
-
-{
-    orte_job_t *jdata = (orte_job_t*)cbdata;
-
-    OBJ_RELEASE(buffer);
-    /* cleanup the job object */
-    opal_pointer_array_set_item(orte_job_data, ORTE_LOCAL_JOBID(jdata->jobid), NULL);
-    OBJ_RELEASE(jdata);
-}
 static void notify_requestor(int sd, short args, void *cbdata)
 {
     orte_state_caddy_t *caddy = (orte_state_caddy_t*)cbdata;
@@ -462,6 +450,11 @@ static void notify_requestor(int sd, short args, void *cbdata)
     int ret;
     opal_buffer_t *reply;

+    opal_output(0, "%s NOTIFYING %s OF JOB %s COMPLETION",
+                ORTE_NAME_PRINT(ORTE_PROC_MY_NAME),
+                ORTE_NAME_PRINT(&jdata->originator),
+                ORTE_JOBID_PRINT(jdata->jobid));
+
     /* notify the requestor */
     reply = OBJ_NEW(opal_buffer_t);
     /* see if there was any problem */
@@ -471,11 +464,13 @@ static void notify_requestor(int sd, short args, void *cbdata)
         ret = 0;
     }
     opal_dss.pack(reply, &ret, 1, OPAL_INT);
-    orte_rml.send_buffer_nb(&jdata->originator, reply, ORTE_RML_TAG_TOOL, send_callback, jdata);
+    orte_rml.send_buffer_nb(&jdata->originator, reply, ORTE_RML_TAG_TOOL, orte_rml_send_callback, NULL);

-    /* we cannot cleanup the job object as we might
-     * hit an error during transmission, so clean it
-     * up in the send callback */
+    /* flag that we were notified */
+    jdata->state = ORTE_JOB_STATE_NOTIFIED;
+    /* send us back thru job complete */
+    ORTE_ACTIVATE_JOB_STATE(jdata, ORTE_JOB_STATE_TERMINATED);
+
     OBJ_RELEASE(caddy);
 }
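A side note on the patch above: the job-specific send_callback it removes is replaced by the generic orte_rml_send_callback, and job cleanup is instead triggered by marking the job NOTIFIED and re-activating the TERMINATED state. As a rough, hypothetical sketch (not the actual Open MPI source), a generic send-completion callback of this kind only has to release the outgoing buffer:

    /* illustration only: a minimal RML send-completion callback that frees
     * the buffer and leaves all job cleanup to the ORTE state machine */
    static void generic_send_callback(int status, orte_process_name_t *peer,
                                      opal_buffer_t *buffer, orte_rml_tag_t tag,
                                      void *cbdata)
    {
        (void)status; (void)peer; (void)tag; (void)cbdata;  /* unused here */
        OBJ_RELEASE(buffer);
    }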
Re: [OMPI devel] [OMPI users] fatal error: openmpi-v2.x-dev-415-g5c9b192 and openmpi-dev-2696-gd579a07
Folks,

I made PR #1028 (https://github.com/open-mpi/ompi/pull/1028). It is not 100% clean (so I will not commit it before a review), since opal/mca/pmix/pmix1xx/pmix/configure is now invoked with two CPPFLAGS=... settings on the command line:
- the first one comes from the ompi configure command line
- the second one (the one actually used) is set by opal/mca/pmix/pmix1xx/configure.m4

Cheers,
Gilles

On 10/14/2015 3:37 PM, Gilles Gouaillardet wrote:
Folks, I was able to reproduce the issue by adding CPPFLAGS=-I/tmp to my configure command line. Here is what happens: opal/mca/pmix/pmix1xx/configure.m4 sets the CPPFLAGS environment variable with -I/tmp and the include paths for hwloc and libevent; then opal/mca/pmix/pmix1xx/pmix/configure is invoked with CPPFLAGS=-I/tmp on the command line. The CPPFLAGS environment variable is simply ignored, and only -I/tmp is used, which causes the compilation failure reported by Siegmar. At this stage, I do not know the best way to solve this issue: one option is not to pass CPPFLAGS=-I/tmp to the sub-configure; another option is not to set the CPPFLAGS environment variable but to invoke the sub-configure with "CPPFLAGS=$CPPFLAGS". Note this issue might not be limited to CPPFLAGS handling. Could you please advise on how to move forward? Cheers, Gilles

On Wed, Oct 7, 2015 at 4:42 PM, Siegmar Gross wrote:
Hi,

I tried to build openmpi-v2.x-dev-415-g5c9b192 and openmpi-dev-2696-gd579a07 on my machines (Solaris 10 Sparc, Solaris 10 x86_64, and openSUSE Linux 12.1 x86_64) with gcc-5.1.0 and Sun C 5.13. I got the following error on all platforms with gcc, and with Sun C only on my Linux machine. I've already reported the problem September 8th for the master trunk (at that time I didn't have the problem for the v2.x trunk). I use the following configure command.

../openmpi-dev-2696-gd579a07/configure \
--prefix=/usr/local/openmpi-master_64_gcc \
--libdir=/usr/local/openmpi-master_64_gcc/lib64 \
--with-jdk-bindir=/usr/local/jdk1.8.0/bin \
--with-jdk-headers=/usr/local/jdk1.8.0/include \
JAVA_HOME=/usr/local/jdk1.8.0 \
LDFLAGS="-m64" CC="gcc" CXX="g++" FC="gfortran" \
CFLAGS="-m64" CXXFLAGS="-m64" FCFLAGS="-m64" \
CPP="cpp" CXXCPP="cpp" \
CPPFLAGS="" CXXCPPFLAGS="" \
--enable-mpi-cxx \
--enable-cxx-exceptions \
--enable-mpi-java \
--enable-heterogeneous \
--enable-mpi-thread-multiple \
--with-hwloc=internal \
--without-verbs \
--with-wrapper-cflags="-std=c11 -m64" \
--with-wrapper-cxxflags="-m64" \
--with-wrapper-fcflags="-m64" \
--enable-debug \
|& tee log.configure.$SYSTEM_ENV.$MACHINE_ENV.64_gcc

openmpi-v2.x-dev-415-g5c9b192:
==

linpc1 openmpi-v2.x-dev-415-g5c9b192-Linux.x86_64.64_gcc 135 tail -15 log.make.Linux.x86_64.64_gcc
  CC       src/class/pmix_pointer_array.lo
  CC       src/class/pmix_hash_table.lo
  CC       src/include/pmix_globals.lo
In file included from ../../../../../../openmpi-v2.x-dev-415-g5c9b192/opal/mca/pmix/pmix1xx/pmix/src/include/pmix_globals.c:19:0:
/export2/src/openmpi-2.0.0/openmpi-v2.x-dev-415-g5c9b192/opal/mca/pmix/pmix1xx/pmix/include/private/types.h:43:27: fatal error: opal/mca/event/libevent2022/libevent2022.h: No such file or directory
compilation terminated.
make[4]: *** [src/include/pmix_globals.lo] Error 1 make[4]: Leaving directory `/export2/src/openmpi-2.0.0/openmpi-v2.x-dev-415-g5c9b192-Linux.x86_64.64_gcc/opal/mca/pmix/pmix1xx/pmix' make[3]: *** [all-recursive] Error 1 make[3]: Leaving directory `/export2/src/openmpi-2.0.0/openmpi-v2.x-dev-415-g5c9b192-Linux.x86_64.64_gcc/opal/mca/pmix/pmix1xx/pmix' make[2]: *** [all-recursive] Error 1 make[2]: Leaving directory `/export2/src/openmpi-2.0.0/openmpi-v2.x-dev-415-g5c9b192-Linux.x86_64.64_gcc/opal/mca/pmix/pmix1xx' make[1]: *** [all-recursive] Error 1 make[1]: Leaving directory `/export2/src/openmpi-2.0.0/openmpi-v2.x-dev-415-g5c9b192-Linux.x86_64.64_gcc/opal' make: *** [all-recursive] Error 1 linpc1 openmpi-v2.x-dev-415-g5c9b192-Linux.x86_64.64_gcc 135 openmpi-dev-2696-gd579a07: == linpc1 openmpi-dev-2696-gd579a07-Linux.x86_64.64_gcc 158 tail -15 log.make.Linux.x86_64.64_gcc CC src/class/pmix_pointer_array.lo CC src/class/pmix_hash_table.lo CC src/include/pmix_globals.lo In file included from ../../../../../../openmpi-dev-2696-gd579a07/opal/mca/pmix/pmix1xx/pmix/src/include/pmix_globals.c:19:0: /export2/src/openmpi-master/openmpi-dev-2696-gd579a07/opal/mca/pmix/pmix1xx/pmix/include/private/types.h:43:27: fatal error: opal/mca/event/libevent2022/libevent2022.h: No such file or directory compilation terminated. make[4]: *** [src/include/pmix_globals.lo] Error 1 make[4]: Leaving directory `/export2/src/openmpi-master/openmpi-dev-2696-gd579a07-Linux.x86_64.64_gcc/opal/mca/pmix/pmix1xx/pmix' make[3]: *** [all-recursive] Error 1 make[3]: Leaving directory
[hwloc-devel] Create success (hwloc git 1.11.0-91-g010b4b6)
Creating nightly hwloc snapshot git tarball was a success. Snapshot: hwloc 1.11.0-91-g010b4b6 Start time: Wed Oct 14 21:06:24 EDT 2015 End time: Wed Oct 14 21:08:02 EDT 2015 Your friendly daemon, Cyrador
[hwloc-devel] Create success (hwloc git 1.10.1-71-g48f9ddd)
Creating nightly hwloc snapshot git tarball was a success. Snapshot: hwloc 1.10.1-71-g48f9ddd Start time: Wed Oct 14 21:04:51 EDT 2015 End time: Wed Oct 14 21:06:23 EDT 2015 Your friendly daemon, Cyrador
[hwloc-devel] Create success (hwloc git 1.9.1-66-ga20252d)
Creating nightly hwloc snapshot git tarball was a success. Snapshot: hwloc 1.9.1-66-ga20252d Start time: Wed Oct 14 21:03:05 EDT 2015 End time: Wed Oct 14 21:04:51 EDT 2015 Your friendly daemon, Cyrador
[hwloc-devel] Create success (hwloc git dev-811-gdaaf59f)
Creating nightly hwloc snapshot git tarball was a success. Snapshot: hwloc dev-811-gdaaf59f Start time: Wed Oct 14 21:01:02 EDT 2015 End time: Wed Oct 14 21:02:55 EDT 2015 Your friendly daemon, Cyrador
Re: [OMPI devel] 16 byte real in Fortran
The INTEGER*n, LOGICAL*n, REAL*n, etc., syntax has never been legal Fortran. Fortran originally had only INTEGER, REAL, DOUBLE PRECISION, and COMPLEX numeric types. Fortran 90 added the notion of a KIND of numeric, but left unspecified the mapping of numeric KINDs to processor-specific storage. KIND can be thought of as an opaque identifier. There is no requirement, for example that KIND n means a variable occupies n bytes of storage, though this is commonly done. (As is the association of KIND=1 to REAL and KIND=2 to DOUBLE PRECISION.) Instead, the language provides portable means of specifying the desired behavior of an available KIND, such as digits of precision. Unfortunately, when marshalling data for interchange, bits matter—the number and their meaning. High-level languages don't support such concepts very well. Starting with C99 (Section 7.18.1), C forces the compiler implementation to define macros for supported integer widths (in bits). However, like Fortran, there is no requirement that any exact number of bits be supported (Section 7.18.1.1); the standard only requires integer types with a minimum of 8, 16, 32, and 64 bits (Section 7.18.1.2). Nothing is said at all about floating-point data types and the correspondence with the integer types. This is what APIs like OpenMPI have to struggle with in the real world. Larry Baker US Geological Survey 650-329-5608 ba...@usgs.gov On 14 Oct 2015, at 3:38 PM, Jeff Squyres (jsquyres) wrote: > On Oct 14, 2015, at 5:53 PM, Vladimír Fukawrote: >> >>> As that ticket notes if REAL*16 <> long double Open MPI should be >>> disabling redutions on MPI_REAL16. I can take a look and see if I can >>> determine why that is not working as expected. >> >> Does it really need to be just disabled when the `real(real128)` is >> actually equivalent to c_long_double? Wouldn't making the explicit >> interfaces to MPI_Send and others to accept `real(real128)` make more >> sense? As I wrote in the stackoverflow post, the MPI standard (3.1, >> pages 628 and 674) is not very clear if MPI_REAL16 corresponds to >> real*16 or real(real128) if these differ, but making it correspond to >> real(real128) might be reasonable. > > As I understand it, real*16 is not a real type -- it's a commonly-used type > and supported by many (all?) compilers, but it's not actually defined in the > Fortran spec. > > -- > Jeff Squyres > jsquy...@cisco.com > For corporate legal information go to: > http://www.cisco.com/web/about/doing_business/legal/cri/ > > ___ > devel mailing list > de...@open-mpi.org > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel > Link to this post: > http://www.open-mpi.org/community/lists/devel/2015/10/18170.php
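To make the C99 part of the argument concrete, here is a small, self-contained example (an illustration added here, not taken from the thread) of the exact-width integer types and macros an implementation must provide when such widths exist; C99 mandates nothing comparable for floating-point widths, which is why mapping Fortran KINDs such as REAL*16 onto C storage remains implementation dependent:

    #include <stdint.h>    /* exact-width integer types (C99) */
    #include <inttypes.h>  /* PRId32 / PRId64 print macros */
    #include <stdio.h>

    int main(void)
    {
        int32_t a = INT32_MAX;          /* exactly 32 bits, when the type exists */
        int64_t b = INT64_C(1) << 40;   /* exactly 64 bits */
        printf("a = %" PRId32 ", b = %" PRId64 "\n", a, b);
        /* there is no required float32_t / float128_t counterpart */
        return 0;
    }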
Re: [OMPI devel] orte-dvm / orte-submit race condition
Hi Ralph, > On 15 Oct 2015, at 0:26 , Ralph Castainwrote: > Okay, so each orte-submit is reporting job has launched, which means the hang > is coming while waiting to hear the job completed. Are you sure that orte-dvm > believes the job has completed? No, I'm not. > In other words, when you say that you observe the job as completing, are you > basing that on some output from orte-dvm, or because the procs have exited, > or...? ... because the tasks have created their output. > I can send you a patch tonight that would cause orte-dvm to emit a "job > completed" message when it determines each job has terminated - might help us > take the next step. Great. > I'm wondering if orte-dvm thinks the job is still running, and the race > condition is in that area (as opposed to being in orte-submit itself) Do some counts from the output of orte-dvm provide some hints? $ grep "Releasing job data.*INVALID" dvm_output.txt |wc -l 42 $ grep "ORTE_DAEMON_SPAWN_JOB_CMD" dvm_output.txt |wc -l 42 $ grep "ORTE_DAEMON_ADD_LOCAL_PROCS" dvm_output.txt |wc -l 42 $ grep "sess_dir_finalize" dvm_output.txt |wc -l 35 In other words, the "[netbook:] sess_dir_finalize: proc session dir does not exist" message doesn't show up for the hanging ones, which could support your question that the orte-dvm is at fault. Gr, Mark
Re: [OMPI devel] 16 byte real in Fortran
On Oct 14, 2015, at 5:53 PM, Vladimír Fukawrote: > >> As that ticket notes if REAL*16 <> long double Open MPI should be >> disabling redutions on MPI_REAL16. I can take a look and see if I can >> determine why that is not working as expected. > > Does it really need to be just disabled when the `real(real128)` is > actually equivalent to c_long_double? Wouldn't making the explicit > interfaces to MPI_Send and others to accept `real(real128)` make more > sense? As I wrote in the stackoverflow post, the MPI standard (3.1, > pages 628 and 674) is not very clear if MPI_REAL16 corresponds to > real*16 or real(real128) if these differ, but making it correspond to > real(real128) might be reasonable. As I understand it, real*16 is not a real type -- it's a commonly-used type and supported by many (all?) compilers, but it's not actually defined in the Fortran spec. -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI devel] orte-dvm / orte-submit race condition
Okay, so each orte-submit is reporting job has launched, which means the hang is coming while waiting to hear the job completed. Are you sure that orte-dvm believes the job has completed? In other words, when you say that you observe the job as completing, are you basing that on some output from orte-dvm, or because the procs have exited, or...? I can send you a patch tonight that would cause orte-dvm to emit a "job completed" message when it determines each job has terminated - might help us take the next step. I'm wondering if orte-dvm thinks the job is still running, and the race condition is in that area (as opposed to being in orte-submit itself) On Wed, Oct 14, 2015 at 1:01 PM, Mark Santcrooswrote: > Hi Ralph, > > On 14 Oct 2015, at 21:50 , Ralph Castain wrote: > > I wonder if they might be getting duplicate process names if started > quickly enough. Do you get the "job has launched" message (orte-submit > outputs a message after orte-dvm responds that the job launched)? > > Based on the output below I would say that both columns with IDs are > unique. > > Thanks > > Mark > > $ head orte-log.txt > [netbook:90327] Job [24532,1] has launched > [netbook:90326] Job [24532,2] has launched > [netbook:90331] Job [24532,3] has launched > [netbook:90330] Job [24532,4] has launched > [netbook:90332] Job [24532,5] has launched > [netbook:90328] Job [24532,6] has launched > [netbook:90329] Job [24532,7] has launched > [netbook:90325] Job [24532,8] has launched > [netbook:90335] Job [24532,9] has launched > [netbook:90333] Job [24532,10] has launched > > $ cat orte-log.txt | cut -f1 -d" "| sort | uniq -c | wc -l > 42 > $ cat orte-log.txt | cut -f3 -d" "| sort | uniq -c | wc -l > 42 > > > ___ > devel mailing list > de...@open-mpi.org > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel > Link to this post: > http://www.open-mpi.org/community/lists/devel/2015/10/18167.php >
Re: [OMPI devel] 16 byte real in Fortran
> As that ticket notes if REAL*16 <> long double Open MPI should be > disabling redutions on MPI_REAL16. I can take a look and see if I can > determine why that is not working as expected. Does it really need to be just disabled when the `real(real128)` is actually equivalent to c_long_double? Wouldn't making the explicit interfaces to MPI_Send and others to accept `real(real128)` make more sense? As I wrote in the stackoverflow post, the MPI standard (3.1, pages 628 and 674) is not very clear if MPI_REAL16 corresponds to real*16 or real(real128) if these differ, but making it correspond to real(real128) might be reasonable. Vladimir 2015-10-14 14:40 GMT+01:00 Vladimír Fuka: > Hello, > > I have a problem with using the quadruple (128bit) or extended > (80bit) precision reals in Fortran. I did my tests with gfortran-4.8.5 > and OpenMPI-1.7.2 (preinstalled OpenSuSE 13.2), but others confirmed > this behaviour for more recent versions at > http://stackoverflow.com/questions/33109040/strange-result-of-mpi-allreduce-for-16-byte-real?noredirect=1#comment54060649_33109040 > . > > When I try to use REAL*16 variables (or equivalent kind-based > definition) and MPI_REAL16 the reductions don't give correct results > (see the link for the exact code). I was pointed to this issue ticket > https://github.com/open-mpi/ompi/issues/63. > > I thought, maybe the underlying long double is 80-bit extended > precision then and I tried to use REAL*10 variables and MPI_REAL16. I > actually received a correct answer from the reduction, but when I > tried to use REAL*10 or REAL(10) I am getting > > Error: There is no specific subroutine for the generic 'mpi_recv' at (1) > Error: There is no specific subroutine for the generic 'mpi_ssend' at (1) > > That is strange, because I should be able to use even types and array > ranks which I construct myself in point to point send/receives and > which are unknown to the MPI library, so the explicit interface should > not be required. > > Is there a correct way how to use the extended or quadruple precision > in OpenMPI? My intended usage is mainly checking if differences seen > numerical computations are getting smaller with increasing precision > and can therefore be attributed to rounding errors. If not they could > be a sign of a bug. > >Best regards, > > Vladimir
Re: [OMPI devel] orte-dvm / orte-submit race condition
Hi Ralph, > On 14 Oct 2015, at 21:50 , Ralph Castainwrote: > I wonder if they might be getting duplicate process names if started quickly > enough. Do you get the "job has launched" message (orte-submit outputs a > message after orte-dvm responds that the job launched)? Based on the output below I would say that both columns with IDs are unique. Thanks Mark $ head orte-log.txt [netbook:90327] Job [24532,1] has launched [netbook:90326] Job [24532,2] has launched [netbook:90331] Job [24532,3] has launched [netbook:90330] Job [24532,4] has launched [netbook:90332] Job [24532,5] has launched [netbook:90328] Job [24532,6] has launched [netbook:90329] Job [24532,7] has launched [netbook:90325] Job [24532,8] has launched [netbook:90335] Job [24532,9] has launched [netbook:90333] Job [24532,10] has launched $ cat orte-log.txt | cut -f1 -d" "| sort | uniq -c | wc -l 42 $ cat orte-log.txt | cut -f3 -d" "| sort | uniq -c | wc -l 42
Re: [OMPI devel] orte-dvm / orte-submit race condition
I wonder if they might be getting duplicate process names if started quickly enough. Do you get the "job has launched" message (orte-submit outputs a message after orte-dvm responds that the job launched)? On Wed, Oct 14, 2015 at 12:04 PM, Mark Santcrooswrote: > Hi, > > By hammering on a DVM with orte-submit I can reproducibly make orte-submit > not return, but hang instead. > The task is executed correctly though. > > It can be reproduced using the small snippet below. > Switching from sequential to "concurrent" execution of the orte-submit's > triggers the effect. > > Note that when I ctrl-c the orte-submit, I can re-use the DVM, so my hunch > would be that its a client-side issue. > > What MCA logging parameters might give more insight of whats happening? > > Thanks! > > Mark > > > > $ cat > orte_test.sh > #!/bin/sh > > for i in $(seq 42): > do > # GOOD > #orte-submit --hnp file:dvm_uri -np 1 /bin/date > > # BAD > orte-submit --hnp file:dvm_uri -np 1 /bin/date & > done > wait > ^D > $ chmod +x orte_test.sh > $ orte-dvm --report-uri dvm_uri & > DVM ready > $ ./orte_test.sh > > ___ > devel mailing list > de...@open-mpi.org > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel > Link to this post: > http://www.open-mpi.org/community/lists/devel/2015/10/18165.php >
[OMPI devel] orte-dvm / orte-submit race condition
Hi,

By hammering on a DVM with orte-submit I can reproducibly make orte-submit not return, but hang instead. The task is executed correctly though.

It can be reproduced using the small snippet below. Switching from sequential to "concurrent" execution of the orte-submits triggers the effect.

Note that when I ctrl-c the orte-submit, I can re-use the DVM, so my hunch would be that it's a client-side issue.

What MCA logging parameters might give more insight into what's happening?

Thanks!

Mark

$ cat > orte_test.sh
#!/bin/sh

for i in $(seq 42):
do
# GOOD
#orte-submit --hnp file:dvm_uri -np 1 /bin/date

# BAD
orte-submit --hnp file:dvm_uri -np 1 /bin/date &
done
wait
^D
$ chmod +x orte_test.sh
$ orte-dvm --report-uri dvm_uri &
DVM ready
$ ./orte_test.sh
Re: [OMPI devel] Bad performance (20% bandwidth loss) when compiling with GCC 5.2 instead of 4.x
On Oct 14, 2015, at 12:48 PM, Nathan Hjelmwrote: > > I think this is from a known issue. Try applying this and run again: > > https://github.com/open-mpi/ompi/commit/952d01db70eab4cbe11ff4557434acaa928685a4.patch The good news is that if this fixes your problem, the fix is already included in the upcoming v1.10.1 release. > -Nathan > > On Wed, Oct 14, 2015 at 06:33:07PM +0200, Paul Kapinos wrote: >> Dear Open MPI developer, >> >> We're puzzled by reproducible performance (bandwidth) penalty observed when >> comparing measurements via InfibiBand between two nodes, OpenMPI/1.10.0 >> compiled with *GCC/5.2* instead of GCC 4.8 and Intel compiler. >> >> Take a look at the attached picture of two measurements of NetPIPE >> http://bitspjoule.org/netpipe/ benchmark done with one MPI rank per node, >> communicating via QDR InfiniBand (y axis: Mbps, y axis: sample number) >> >> Up to sample 64 (8195 bytes message size) the achieved performance is >> virtually the same; from sample 65 (12285 bytes, *less* than 12k) the >> version of GCC compiled using GCC 5.2 suffer form 20%+ penalty in bandwidth. >> >> The result is reproducible and independent from nodes and ever linux >> distribution (both Scientific Linux 6 and CentOS 7 have the same results). >> Both C and Fortran benchmarks offer the very same behaviour so it is *not* >> an f08 issue. >> >> The acchieved bandwidth is definitely IB-range (gigabytes per second), the >> communication is running via InfinfiBand in all cases (no failback to IP, >> huh). >> >> The compile line is the same; the output of ompi_info --all and --params is >> the very same (cf. attachments) up to added support for fortran-08 in /5.2 >> version. >> >> We know about existence of 'eager_limit' parameter, which is *not* changed >> and is 12288 in both versions (this is *less* that the first distinguishing >> sample). >> >> Again, for us the *only* difference is usage of other (new) GCC release. >> >> Any idea about this 20%+ bandwidth loss? >> >> Best >> >> Paul Kapinos >> -- >> Dipl.-Inform. 
Paul Kapinos - High Performance Computing, >> RWTH Aachen University, IT Center >> Seffenter Weg 23, D 52074 Aachen (Germany) >> Tel: +49 241/80-24915 > > >> MCA btl: parameter "btl_openib_verbose" (current value: >> "false", data source: default, level: 9 dev/all, type: bool) >> Output some verbose OpenIB BTL information (0 = no >> output, nonzero = output) >> Valid values: 0: f|false|disabled, 1: t|true|enabled >> MCA btl: parameter "btl_openib_warn_no_device_params_found" >> (current value: "true", data source: default, level: 9 dev/all, type: bool, >> synonyms: btl_openib_warn_no_hca_params_found) >> Warn when no device-specific parameters are found >> in the INI file specified by the btl_openib_device_param_files MCA parameter >> (0 = do not warn; any other value = warn) >> Valid values: 0: f|false|disabled, 1: t|true|enabled >> MCA btl: parameter "btl_openib_warn_no_hca_params_found" >> (current value: "true", data source: default, level: 9 dev/all, type: bool, >> deprecated, synonym of: btl_openib_warn_no_device_params_found) >> Warn when no device-specific parameters are found >> in the INI file specified by the btl_openib_device_param_files MCA parameter >> (0 = do not warn; any other value = warn) >> Valid values: 0: f|false|disabled, 1: t|true|enabled >> MCA btl: parameter "btl_openib_warn_default_gid_prefix" >> (current value: "true", data source: default, level: 9 dev/all, type: bool) >> Warn when there is more than one active ports and >> at least one of them connected to the network with only default GID prefix >> configured (0 = do not warn; any other value = warn) >> Valid values: 0: f|false|disabled, 1: t|true|enabled >> MCA btl: parameter "btl_openib_warn_nonexistent_if" (current >> value: "true", data source: default, level: 9 dev/all, type: bool) >> Warn if non-existent devices and/or ports are >> specified in the btl_openib_if_[in|ex]clude MCA parameters (0 = do not warn; >> any other value = warn) >> Valid values: 0: f|false|disabled, 1: t|true|enabled >> MCA btl: parameter "btl_openib_abort_not_enough_reg_mem" >> (current value: "false", data source: default, level: 9 dev/all, type: bool) >> If there is not enough registered memory available >> on the system for Open MPI to function properly, Open MPI will issue a >> warning. If this MCA parameter is set to true, then Open MPI will also >> abort all MPI jobs (0 = warn, but do not abort; any other value = warn and >> abort) >> Valid values: 0: f|false|disabled, 1:
Re: [OMPI devel] Bad performance (20% bandwidth loss) when compiling with GCC 5.2 instead of 4.x
I think this is from a known issue. Try applying this and run again: https://github.com/open-mpi/ompi/commit/952d01db70eab4cbe11ff4557434acaa928685a4.patch -Nathan On Wed, Oct 14, 2015 at 06:33:07PM +0200, Paul Kapinos wrote: > Dear Open MPI developer, > > We're puzzled by reproducible performance (bandwidth) penalty observed when > comparing measurements via InfibiBand between two nodes, OpenMPI/1.10.0 > compiled with *GCC/5.2* instead of GCC 4.8 and Intel compiler. > > Take a look at the attached picture of two measurements of NetPIPE > http://bitspjoule.org/netpipe/ benchmark done with one MPI rank per node, > communicating via QDR InfiniBand (y axis: Mbps, y axis: sample number) > > Up to sample 64 (8195 bytes message size) the achieved performance is > virtually the same; from sample 65 (12285 bytes, *less* than 12k) the > version of GCC compiled using GCC 5.2 suffer form 20%+ penalty in bandwidth. > > The result is reproducible and independent from nodes and ever linux > distribution (both Scientific Linux 6 and CentOS 7 have the same results). > Both C and Fortran benchmarks offer the very same behaviour so it is *not* > an f08 issue. > > The acchieved bandwidth is definitely IB-range (gigabytes per second), the > communication is running via InfinfiBand in all cases (no failback to IP, > huh). > > The compile line is the same; the output of ompi_info --all and --params is > the very same (cf. attachments) up to added support for fortran-08 in /5.2 > version. > > We know about existence of 'eager_limit' parameter, which is *not* changed > and is 12288 in both versions (this is *less* that the first distinguishing > sample). > > Again, for us the *only* difference is usage of other (new) GCC release. > > Any idea about this 20%+ bandwidth loss? > > Best > > Paul Kapinos > -- > Dipl.-Inform. 
Paul Kapinos - High Performance Computing, > RWTH Aachen University, IT Center > Seffenter Weg 23, D 52074 Aachen (Germany) > Tel: +49 241/80-24915 > MCA btl: parameter "btl_openib_verbose" (current value: > "false", data source: default, level: 9 dev/all, type: bool) > Output some verbose OpenIB BTL information (0 = no > output, nonzero = output) > Valid values: 0: f|false|disabled, 1: t|true|enabled > MCA btl: parameter "btl_openib_warn_no_device_params_found" > (current value: "true", data source: default, level: 9 dev/all, type: bool, > synonyms: btl_openib_warn_no_hca_params_found) > Warn when no device-specific parameters are found > in the INI file specified by the btl_openib_device_param_files MCA parameter > (0 = do not warn; any other value = warn) > Valid values: 0: f|false|disabled, 1: t|true|enabled > MCA btl: parameter "btl_openib_warn_no_hca_params_found" > (current value: "true", data source: default, level: 9 dev/all, type: bool, > deprecated, synonym of: btl_openib_warn_no_device_params_found) > Warn when no device-specific parameters are found > in the INI file specified by the btl_openib_device_param_files MCA parameter > (0 = do not warn; any other value = warn) > Valid values: 0: f|false|disabled, 1: t|true|enabled > MCA btl: parameter "btl_openib_warn_default_gid_prefix" > (current value: "true", data source: default, level: 9 dev/all, type: bool) > Warn when there is more than one active ports and > at least one of them connected to the network with only default GID prefix > configured (0 = do not warn; any other value = warn) > Valid values: 0: f|false|disabled, 1: t|true|enabled > MCA btl: parameter "btl_openib_warn_nonexistent_if" (current > value: "true", data source: default, level: 9 dev/all, type: bool) > Warn if non-existent devices and/or ports are > specified in the btl_openib_if_[in|ex]clude MCA parameters (0 = do not warn; > any other value = warn) > Valid values: 0: f|false|disabled, 1: t|true|enabled > MCA btl: parameter "btl_openib_abort_not_enough_reg_mem" > (current value: "false", data source: default, level: 9 dev/all, type: bool) > If there is not enough registered memory available > on the system for Open MPI to function properly, Open MPI will issue a > warning. If this MCA parameter is set to true, then Open MPI will also abort > all MPI jobs (0 = warn, but do not abort; any other value = warn and abort) > Valid values: 0: f|false|disabled, 1: t|true|enabled > MCA btl: parameter "btl_openib_poll_cq_batch" (current > value: "256", data source: default, level: 9 dev/all, type: unsigned) > Retrieve up to poll_cq_batch completions from CQ > MCA btl: parameter
[OMPI devel] Bad performance (20% bandwidth loss) when compiling with GCC 5.2 instead of 4.x
Dear Open MPI developer,

We're puzzled by a reproducible performance (bandwidth) penalty observed when comparing measurements via InfiniBand between two nodes, OpenMPI/1.10.0 compiled with *GCC/5.2* instead of GCC 4.8 and the Intel compiler.

Take a look at the attached picture of two measurements of the NetPIPE http://bitspjoule.org/netpipe/ benchmark done with one MPI rank per node, communicating via QDR InfiniBand (y axis: Mbps, x axis: sample number).

Up to sample 64 (8195 bytes message size) the achieved performance is virtually the same; from sample 65 (12285 bytes, *less* than 12k) the version of Open MPI compiled using GCC 5.2 suffers from a 20%+ penalty in bandwidth.

The result is reproducible and independent of the nodes and even of the Linux distribution (both Scientific Linux 6 and CentOS 7 have the same results). Both the C and Fortran benchmarks show the very same behaviour, so it is *not* an f08 issue.

The achieved bandwidth is definitely in the IB range (gigabytes per second); the communication is running via InfiniBand in all cases (no fallback to IP).

The compile line is the same; the output of ompi_info --all and --params is the very same (cf. attachments) up to the added support for fortran-08 in the /5.2 version.

We know about the existence of the 'eager_limit' parameter, which is *not* changed and is 12288 in both versions (this is *less* than the first distinguishing sample).

Again, for us the *only* difference is the use of the other (newer) GCC release.

Any idea about this 20%+ bandwidth loss?

Best

Paul Kapinos
--
Dipl.-Inform. Paul Kapinos - High Performance Computing,
RWTH Aachen University, IT Center
Seffenter Weg 23, D 52074 Aachen (Germany)
Tel: +49 241/80-24915

MCA btl: parameter "btl_openib_verbose" (current value: "false", data source: default, level: 9 dev/all, type: bool) Output some verbose OpenIB BTL information (0 = no output, nonzero = output) Valid values: 0: f|false|disabled, 1: t|true|enabled MCA btl: parameter "btl_openib_warn_no_device_params_found" (current value: "true", data source: default, level: 9 dev/all, type: bool, synonyms: btl_openib_warn_no_hca_params_found) Warn when no device-specific parameters are found in the INI file specified by the btl_openib_device_param_files MCA parameter (0 = do not warn; any other value = warn) Valid values: 0: f|false|disabled, 1: t|true|enabled MCA btl: parameter "btl_openib_warn_no_hca_params_found" (current value: "true", data source: default, level: 9 dev/all, type: bool, deprecated, synonym of: btl_openib_warn_no_device_params_found) Warn when no device-specific parameters are found in the INI file specified by the btl_openib_device_param_files MCA parameter (0 = do not warn; any other value = warn) Valid values: 0: f|false|disabled, 1: t|true|enabled MCA btl: parameter "btl_openib_warn_default_gid_prefix" (current value: "true", data source: default, level: 9 dev/all, type: bool) Warn when there is more than one active ports and at least one of them connected to the network with only default GID prefix configured (0 = do not warn; any other value = warn) Valid values: 0: f|false|disabled, 1: t|true|enabled MCA btl: parameter "btl_openib_warn_nonexistent_if" (current value: "true", data source: default, level: 9 dev/all, type: bool) Warn if non-existent devices and/or ports are specified in the btl_openib_if_[in|ex]clude MCA parameters (0 = do not warn; any other value = warn) Valid values: 0: f|false|disabled, 1: t|true|enabled MCA btl: parameter "btl_openib_abort_not_enough_reg_mem" (current value: "false", data source: default, level: 9 dev/all, type:
bool) If there is not enough registered memory available on the system for Open MPI to function properly, Open MPI will issue a warning. If this MCA parameter is set to true, then Open MPI will also abort all MPI jobs (0 = warn, but do not abort; any other value = warn and abort) Valid values: 0: f|false|disabled, 1: t|true|enabled MCA btl: parameter "btl_openib_poll_cq_batch" (current value: "256", data source: default, level: 9 dev/all, type: unsigned) Retrieve up to poll_cq_batch completions from CQ MCA btl: parameter "btl_openib_device_param_files" (current value: "/opt/MPI/openmpi-1.10.0/linux/gcc/share/openmpi/mca-btl-openib-device-params.ini", data source: default, level: 9 dev/all, type: string, synonyms: btl_openib_hca_param_files) Colon-delimited list of INI-style files that contain device vendor/part-specific parameters (use semicolon for Windows)
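For anyone who wants a reproducer smaller than NetPIPE, a minimal MPI ping-pong in C that brackets the default 12288-byte openib eager limit could look like the sketch below. The message sizes and repetition count are assumptions made for illustration; run it with exactly two ranks, one per node, and compare the GCC 4.8 and GCC 5.2 builds.

    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        int rank, reps = 1000;
        /* sizes chosen to straddle the default btl_openib eager_limit (12288) */
        int sizes[] = { 8192, 12288, 16384, 65536 };
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        for (int s = 0; s < 4; s++) {
            char *buf = malloc(sizes[s]);
            MPI_Barrier(MPI_COMM_WORLD);
            double t0 = MPI_Wtime();
            for (int i = 0; i < reps; i++) {
                if (rank == 0) {
                    MPI_Send(buf, sizes[s], MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                    MPI_Recv(buf, sizes[s], MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                } else if (rank == 1) {
                    MPI_Recv(buf, sizes[s], MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                    MPI_Send(buf, sizes[s], MPI_CHAR, 0, 0, MPI_COMM_WORLD);
                }
            }
            double t = MPI_Wtime() - t0;
            if (rank == 0)
                printf("%6d bytes: %.1f MB/s\n", sizes[s],
                       2.0 * sizes[s] * reps / t / 1e6);
            free(buf);
        }
        MPI_Finalize();
        return 0;
    }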
Re: [OMPI devel] 16 byte real in Fortran
On Wed, Oct 14, 2015 at 02:40:00PM +0100, Vladimír Fuka wrote:
> Hello,
>
> I have a problem with using the quadruple (128bit) or extended
> (80bit) precision reals in Fortran. I did my tests with gfortran-4.8.5
> and OpenMPI-1.7.2 (preinstalled OpenSuSE 13.2), but others confirmed
> this behaviour for more recent versions at
> http://stackoverflow.com/questions/33109040/strange-result-of-mpi-allreduce-for-16-byte-real?noredirect=1#comment54060649_33109040
>
> When I try to use REAL*16 variables (or equivalent kind-based
> definition) and MPI_REAL16 the reductions don't give correct results
> (see the link for the exact code). I was pointed to this issue ticket
> https://github.com/open-mpi/ompi/issues/63.

As that ticket notes, if REAL*16 <> long double, Open MPI should be disabling reductions on MPI_REAL16. I can take a look and see if I can determine why that is not working as expected.

> Is there a correct way how to use the extended or quadruple precision
> in OpenMPI? My intended usage is mainly checking if differences seen
> numerical computations are getting smaller with increasing precision
> and can therefore be attributed to rounding errors. If not they could
> be a sign of a bug.

Take a look at the following article:

http://dl.acm.org/citation.cfm?id=1988419

You may be able to use the method described to get the enhanced precision you need.

-Nathan
HPC-5, LANL
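A quick way to check whether the REAL*16 <> long double mismatch applies on a given system is to look at the C long double and at what the MPI library reports for MPI_REAL16. The snippet below is a hedged illustration (MPI_REAL16 is an optional datatype, so it is guarded against being unavailable); with gfortran on x86, REAL(16) is typically IEEE binary128 (16 bytes, 113-bit significand), while the C long double is usually the 80-bit extended type.

    #include <mpi.h>
    #include <stdio.h>
    #include <float.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        /* how the C compiler lays out long double on this platform */
        printf("sizeof(long double) = %zu, LDBL_MANT_DIG = %d\n",
               sizeof(long double), (int)LDBL_MANT_DIG);
        /* MPI_REAL16 is optional; guard against it being unavailable */
        if (MPI_REAL16 != MPI_DATATYPE_NULL) {
            int sz;
            MPI_Type_size(MPI_REAL16, &sz);
            printf("MPI_Type_size(MPI_REAL16) = %d bytes\n", sz);
        } else {
            printf("MPI_REAL16 not supported by this MPI library\n");
        }
        MPI_Finalize();
        return 0;
    }

If the two sizes disagree, reductions on MPI_REAL16 cannot be performed correctly by C code operating on long double, which is the situation the GitHub issue describes.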
[OMPI devel] 16 byte real in Fortran
Hello, I have a problem with using the quadruple (128bit) or extended (80bit) precision reals in Fortran. I did my tests with gfortran-4.8.5 and OpenMPI-1.7.2 (preinstalled OpenSuSE 13.2), but others confirmed this behaviour for more recent versions at http://stackoverflow.com/questions/33109040/strange-result-of-mpi-allreduce-for-16-byte-real?noredirect=1#comment54060649_33109040 . When I try to use REAL*16 variables (or equivalent kind-based definition) and MPI_REAL16 the reductions don't give correct results (see the link for the exact code). I was pointed to this issue ticket https://github.com/open-mpi/ompi/issues/63. I thought, maybe the underlying long double is 80-bit extended precision then and I tried to use REAL*10 variables and MPI_REAL16. I actually received a correct answer from the reduction, but when I tried to use REAL*10 or REAL(10) I am getting Error: There is no specific subroutine for the generic 'mpi_recv' at (1) Error: There is no specific subroutine for the generic 'mpi_ssend' at (1) That is strange, because I should be able to use even types and array ranks which I construct myself in point to point send/receives and which are unknown to the MPI library, so the explicit interface should not be required. Is there a correct way how to use the extended or quadruple precision in OpenMPI? My intended usage is mainly checking if differences seen numerical computations are getting smaller with increasing precision and can therefore be attributed to rounding errors. If not they could be a sign of a bug. Best regards, Vladimir
Re: [OMPI devel] [OMPI users] fatal error: openmpi-v2.x-dev-415-g5c9b192 and openmpi-dev-2696-gd579a07
Folks, i was able to reproduce the issue by adding CPPFLAGS=-I/tmp to my configure command line. here is what happens : opal/mca/pmix/pmix1xx/configure.m4 set the CPPFLAGS environment variable with -I/tmp and include paths for hwloc and libevent then opal/mca/pmix/pmix1xx/pmix/configure is invoked with CPPFLAGS=-I/tmp on the command line the CPPFLAGS environment variable is simply ignored, and only -I/tmp is used, which causes the compilation failure reported by Siegmar. at this stage, i do not know the best way to solve this issue : one option is not to pass CPPFLAGS=-I/tmp to the sub configure an other option is not to set the CPPFLAGS environment variable but invoke the sub configure with "CPPFLAGS=$CPPFLAGS" note this issue might not be limited to CPPFLAGS handling could you please advise on how to move forward ? Cheers, Gilles On Wed, Oct 7, 2015 at 4:42 PM, Siegmar Grosswrote: > Hi, > > I tried to build openmpi-v2.x-dev-415-g5c9b192 and > openmpi-dev-2696-gd579a07 on my machines (Solaris 10 Sparc, Solaris 10 > x86_64, and openSUSE Linux 12.1 x86_64) with gcc-5.1.0 and Sun C 5.13. > I got the following error on all platforms with gcc and with Sun C only > on my Linux machine. I've already reported the problem September 8th > for the master trunk (at that time I didn't have the problem for the > v2.x trunk. I use the following configure command. > > ../openmpi-dev-2696-gd579a07/configure \ > --prefix=/usr/local/openmpi-master_64_gcc \ > --libdir=/usr/local/openmpi-master_64_gcc/lib64 \ > --with-jdk-bindir=/usr/local/jdk1.8.0/bin \ > --with-jdk-headers=/usr/local/jdk1.8.0/include \ > JAVA_HOME=/usr/local/jdk1.8.0 \ > LDFLAGS="-m64" CC="gcc" CXX="g++" FC="gfortran" \ > CFLAGS="-m64" CXXFLAGS="-m64" FCFLAGS="-m64" \ > CPP="cpp" CXXCPP="cpp" \ > CPPFLAGS="" CXXCPPFLAGS="" \ > --enable-mpi-cxx \ > --enable-cxx-exceptions \ > --enable-mpi-java \ > --enable-heterogeneous \ > --enable-mpi-thread-multiple \ > --with-hwloc=internal \ > --without-verbs \ > --with-wrapper-cflags="-std=c11 -m64" \ > --with-wrapper-cxxflags="-m64" \ > --with-wrapper-fcflags="-m64" \ > --enable-debug \ > |& tee log.configure.$SYSTEM_ENV.$MACHINE_ENV.64_gcc > > > openmpi-v2.x-dev-415-g5c9b192: > == > > linpc1 openmpi-v2.x-dev-415-g5c9b192-Linux.x86_64.64_gcc 135 tail -15 > log.make.Linux.x86_64.64_gcc > CC src/class/pmix_pointer_array.lo > CC src/class/pmix_hash_table.lo > CC src/include/pmix_globals.lo > In file included from > ../../../../../../openmpi-v2.x-dev-415-g5c9b192/opal/mca/pmix/pmix1xx/pmix/src/include/pmix_globals.c:19:0: > /export2/src/openmpi-2.0.0/openmpi-v2.x-dev-415-g5c9b192/opal/mca/pmix/pmix1xx/pmix/include/private/types.h:43:27: > fatal error: opal/mca/event/libevent2022/libevent2022.h: No such file or > directory > compilation terminated. 
> make[4]: *** [src/include/pmix_globals.lo] Error 1 > make[4]: Leaving directory > `/export2/src/openmpi-2.0.0/openmpi-v2.x-dev-415-g5c9b192-Linux.x86_64.64_gcc/opal/mca/pmix/pmix1xx/pmix' > make[3]: *** [all-recursive] Error 1 > make[3]: Leaving directory > `/export2/src/openmpi-2.0.0/openmpi-v2.x-dev-415-g5c9b192-Linux.x86_64.64_gcc/opal/mca/pmix/pmix1xx/pmix' > make[2]: *** [all-recursive] Error 1 > make[2]: Leaving directory > `/export2/src/openmpi-2.0.0/openmpi-v2.x-dev-415-g5c9b192-Linux.x86_64.64_gcc/opal/mca/pmix/pmix1xx' > make[1]: *** [all-recursive] Error 1 > make[1]: Leaving directory > `/export2/src/openmpi-2.0.0/openmpi-v2.x-dev-415-g5c9b192-Linux.x86_64.64_gcc/opal' > make: *** [all-recursive] Error 1 > linpc1 openmpi-v2.x-dev-415-g5c9b192-Linux.x86_64.64_gcc 135 > > > openmpi-dev-2696-gd579a07: > == > > linpc1 openmpi-dev-2696-gd579a07-Linux.x86_64.64_gcc 158 tail -15 > log.make.Linux.x86_64.64_gcc > CC src/class/pmix_pointer_array.lo > CC src/class/pmix_hash_table.lo > CC src/include/pmix_globals.lo > In file included from > ../../../../../../openmpi-dev-2696-gd579a07/opal/mca/pmix/pmix1xx/pmix/src/include/pmix_globals.c:19:0: > /export2/src/openmpi-master/openmpi-dev-2696-gd579a07/opal/mca/pmix/pmix1xx/pmix/include/private/types.h:43:27: > fatal error: opal/mca/event/libevent2022/libevent2022.h: No such file or > directory > compilation terminated. > make[4]: *** [src/include/pmix_globals.lo] Error 1 > make[4]: Leaving directory > `/export2/src/openmpi-master/openmpi-dev-2696-gd579a07-Linux.x86_64.64_gcc/opal/mca/pmix/pmix1xx/pmix' > make[3]: *** [all-recursive] Error 1 > make[3]: Leaving directory > `/export2/src/openmpi-master/openmpi-dev-2696-gd579a07-Linux.x86_64.64_gcc/opal/mca/pmix/pmix1xx/pmix' > make[2]: *** [all-recursive] Error 1 > make[2]: Leaving directory > `/export2/src/openmpi-master/openmpi-dev-2696-gd579a07-Linux.x86_64.64_gcc/opal/mca/pmix/pmix1xx' > make[1]: *** [all-recursive] Error 1 > make[1]: Leaving directory >