[hwloc-devel] hwloc nightly build: SUCCESS

2017-06-29 Thread mpiteam
Successful builds: ['v1.11', 'master']
Skipped builds: []
Failed builds: []

=== Build output ===

Branches: ['v1.11', 'master']

Starting build for v1.11
Found new revision 62e1d71
v1.11 build of revision 62e1d71 completed successfully

Starting build for master
Found new revision 0327ecf
Successfully submitted Coverity build
master build of revision 0327ecf completed successfully

Your friendly daemon,
Cyrador
___
hwloc-devel mailing list
hwloc-devel@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-devel


Re: [OMPI devel] Open MPI 3.0.0 first release candidate posted

2017-06-29 Thread Howard Pritchard
Brian,

Things look much better with this patch.  We need it for the 3.0.0 release.
The patch from 3794 applied cleanly from master.

Howard


2017-06-29 16:51 GMT-06:00 r...@open-mpi.org :

> I tracked down a possible source of the oob/tcp error - this should
> address it, I think: https://github.com/open-mpi/ompi/pull/3794
>
> On Jun 29, 2017, at 3:14 PM, Howard Pritchard  wrote:
>
> Hi Brian,
>
> I tested this rc using both srun native launch and mpirun on the following
> systems:
> - LANL CTS-1 systems (haswell + Intel OPA/PSM2)
> - LANL network testbed system (haswell  + connectX5/UCX and OB1)
> - LANL Cray XC
>
> I am finding some problems with mpirun on the network testbed system.
>
> For example, for spawn_with_env_vars from IBM tests:
>
> *** Error in `mpirun': corrupted double-linked list: 0x006e75b0 ***
>
> === Backtrace: =
>
> /usr/lib64/libc.so.6(+0x7bea2)[0x76597ea2]
>
> /usr/lib64/libc.so.6(+0x7cec6)[0x76598ec6]
>
> /home/hpp/openmpi_3.0.0rc1_install/lib/libopen-pal.so.40(opal_proc_table_remove_all+0x91)[0x77855851]
>
> /home/hpp/openmpi_3.0.0rc1_install/lib/openmpi/mca_oob_ud.so(+0x5e09)[0x73cc0e09]
>
> /home/hpp/openmpi_3.0.0rc1_install/lib/openmpi/mca_oob_ud.so(+0x5952)[0x73cc0952]
>
> /home/hpp/openmpi_3.0.0rc1_install/lib/libopen-rte.so.40(+0x6b032)[0x77b94032]
>
> /home/hpp/openmpi_3.0.0rc1_install/lib/libopen-pal.so.40(mca_base_framework_close+0x7d)[0x7788592d]
>
> /home/hpp/openmpi_3.0.0rc1_install/lib/openmpi/mca_ess_hnp.so(+0x3e4d)[0x75b04e4d]
>
> /home/hpp/openmpi_3.0.0rc1_install/lib/libopen-rte.so.40(orte_finalize+0x79)[0x77b43bf9]
>
> mpirun[0x4014f1]
>
> mpirun[0x401018]
>
> /usr/lib64/libc.so.6(__libc_start_main+0xf5)[0x7653db15]
>
> mpirun[0x400f29]
>
> and another like
>
> [hpp@hi-master dynamic (master *)]$mpirun -np 1 ./spawn_with_env_vars
>
> Spawning...
>
> Spawned
>
> Child got foo and baz env variables -- yay!
>
> *** Error in `mpirun': corrupted double-linked list: 0x006eb350 ***
>
> === Backtrace: =
>
> /usr/lib64/libc.so.6(+0x7b184)[0x76597184]
>
> /usr/lib64/libc.so.6(+0x7d1ec)[0x765991ec]
>
> /home/hpp/openmpi_3.0.0rc1_install/lib/openmpi/mca_oob_tcp.so(+0x57a2)[0x732297a2]
>
> /home/hpp/openmpi_3.0.0rc1_install/lib/openmpi/mca_oob_tcp.so(+0x5a87)[0x73229a87]
>
> /home/hpp/openmpi_3.0.0rc1_install/lib/libopen-rte.so.40(+0x6b032)[0x77b94032]
>
> /home/hpp/openmpi_3.0.0rc1_install/lib/libopen-pal.so.40(mca_base_framework_close+0x7d)[0x7788592d]
>
> /home/hpp/openmpi_3.0.0rc1_install/lib/openmpi/mca_ess_hnp.so(+0x3e4d)[0x75b04e4d]
>
> /home/hpp/openmpi_3.0.0rc1_install/lib/libopen-rte.so.40(orte_finalize+0x79)[0x77b43bf9]
>
> mpirun[0x4014f1]
>
> mpirun[0x401018]
>
> /usr/lib64/libc.so.6(__libc_start_main+0xf5)[0x7653db15]
>
> mpirun[0x400f29]
> It doesn't happen on every run though.
>
> I'll do some more investigating, but probably not till next week.
>
> Howard
>
>
> 2017-06-28 11:50 GMT-06:00 Barrett, Brian via devel <
> devel@lists.open-mpi.org>:
>
>> The first release candidate of Open MPI 3.0.0 is now available (
>> https://www.open-mpi.org/software/ompi/v3.0/).  We expect to have at
>> least one more release candidate, as there are still outstanding MPI-layer
>> issues to be resolved (particularly around one-sided).  We are posting
>> 3.0.0rc1 to get feedback on run-time stability, as one of the big features
>> of Open MPI 3.0 is the update to the PMIx 2 runtime environment.  We would
>> appreciate any and all testing you can do around run-time behaviors.
>>
>> Thank you,
>>
>> Brian & Howard
>> ___
>> devel mailing list
>> devel@lists.open-mpi.org
>> https://rfd.newmexicoconsortium.org/mailman/listinfo/devel
>>
>
> ___
> devel mailing list
> devel@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/devel
>
___
devel mailing list
devel@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/devel

Re: [OMPI devel] Open MPI 3.0.0 first release candidate posted

2017-06-29 Thread Howard Pritchard
Hi Brian,

I tested this rc using both srun native launch and mpirun on the following
systems:
- LANL CTS-1 systems (haswell + Intel OPA/PSM2)
- LANL network testbed system (haswell  + connectX5/UCX and OB1)
- LANL Cray XC

I am finding some problems with mpirun on the network testbed system.

For example, for spawn_with_env_vars from IBM tests:

*** Error in `mpirun': corrupted double-linked list: 0x006e75b0 ***

=== Backtrace: =

/usr/lib64/libc.so.6(+0x7bea2)[0x76597ea2]

/usr/lib64/libc.so.6(+0x7cec6)[0x76598ec6]

/home/hpp/openmpi_3.0.0rc1_install/lib/libopen-pal.so.40(opal_proc_table_remove_all+0x91)[0x77855851]

/home/hpp/openmpi_3.0.0rc1_install/lib/openmpi/mca_oob_ud.so(+0x5e09)[0x73cc0e09]

/home/hpp/openmpi_3.0.0rc1_install/lib/openmpi/mca_oob_ud.so(+0x5952)[0x73cc0952]

/home/hpp/openmpi_3.0.0rc1_install/lib/libopen-rte.so.40(+0x6b032)[0x77b94032]

/home/hpp/openmpi_3.0.0rc1_install/lib/libopen-pal.so.40(mca_base_framework_close+0x7d)[0x7788592d]

/home/hpp/openmpi_3.0.0rc1_install/lib/openmpi/mca_ess_hnp.so(+0x3e4d)[0x75b04e4d]

/home/hpp/openmpi_3.0.0rc1_install/lib/libopen-rte.so.40(orte_finalize+0x79)[0x77b43bf9]

mpirun[0x4014f1]

mpirun[0x401018]

/usr/lib64/libc.so.6(__libc_start_main+0xf5)[0x7653db15]

mpirun[0x400f29]

and another like

[hpp@hi-master dynamic (master *)]$mpirun -np 1 ./spawn_with_env_vars

Spawning...

Spawned

Child got foo and baz env variables -- yay!

*** Error in `mpirun': corrupted double-linked list: 0x006eb350 ***

=== Backtrace: =

/usr/lib64/libc.so.6(+0x7b184)[0x76597184]

/usr/lib64/libc.so.6(+0x7d1ec)[0x765991ec]

/home/hpp/openmpi_3.0.0rc1_install/lib/openmpi/mca_oob_tcp.so(+0x57a2)[0x732297a2]

/home/hpp/openmpi_3.0.0rc1_install/lib/openmpi/mca_oob_tcp.so(+0x5a87)[0x73229a87]

/home/hpp/openmpi_3.0.0rc1_install/lib/libopen-rte.so.40(+0x6b032)[0x77b94032]

/home/hpp/openmpi_3.0.0rc1_install/lib/libopen-pal.so.40(mca_base_framework_close+0x7d)[0x7788592d]

/home/hpp/openmpi_3.0.0rc1_install/lib/openmpi/mca_ess_hnp.so(+0x3e4d)[0x75b04e4d]

/home/hpp/openmpi_3.0.0rc1_install/lib/libopen-rte.so.40(orte_finalize+0x79)[0x77b43bf9]

mpirun[0x4014f1]

mpirun[0x401018]

/usr/lib64/libc.so.6(__libc_start_main+0xf5)[0x7653db15]

mpirun[0x400f29]
It doesn't happen on every run though.
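
For reference, a minimal sketch of the kind of parent/child program a test like spawn_with_env_vars exercises is given below. This is not the IBM test itself; it assumes the parent exports foo and baz into its own environment before calling MPI_Comm_spawn and that mpirun forwards them to the spawned child, which is how the output above reads.

    /* Minimal sketch (not the actual IBM test): the parent spawns one copy of
     * itself and the child checks for the foo/baz environment variables.
     * Whether the variables reach the child depends on how mpirun forwards
     * the parent's environment to spawned processes. */
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        MPI_Comm parent, intercomm;

        MPI_Init(&argc, &argv);
        MPI_Comm_get_parent(&parent);

        if (parent == MPI_COMM_NULL) {
            /* Parent: set the variables, then spawn one child. */
            setenv("foo", "bar", 1);
            setenv("baz", "qux", 1);
            printf("Spawning...\n");
            MPI_Comm_spawn(argv[0], MPI_ARGV_NULL, 1, MPI_INFO_NULL, 0,
                           MPI_COMM_SELF, &intercomm, MPI_ERRCODES_IGNORE);
            printf("Spawned\n");
            MPI_Comm_disconnect(&intercomm);
        } else {
            /* Child: report whether the variables made it across the spawn. */
            if (getenv("foo") != NULL && getenv("baz") != NULL) {
                printf("Child got foo and baz env variables -- yay!\n");
            } else {
                printf("Child did not get foo and baz env variables\n");
            }
            MPI_Comm_disconnect(&parent);
        }

        MPI_Finalize();
        return 0;
    }

Note that the abort reported above is in mpirun itself, under orte_finalize, after the test has already printed its output, which looks like a teardown problem in the runtime rather than in the test program.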

I'll do some more investigating, but probably not till next week.

Howard


2017-06-28 11:50 GMT-06:00 Barrett, Brian via devel <
devel@lists.open-mpi.org>:

> The first release candidate of Open MPI 3.0.0 is now available (
> https://www.open-mpi.org/software/ompi/v3.0/).  We expect to have at
> least one more release candidate, as there are still outstanding MPI-layer
> issues to be resolved (particularly around one-sided).  We are posting
> 3.0.0rc1 to get feedback on run-time stability, as one of the big features
> of Open MPI 3.0 is the update to the PMIx 2 runtime environment.  We would
> appreciate any and all testing you can do around run-time behaviors.
>
> Thank you,
>
> Brian & Howard
> ___
> devel mailing list
> devel@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/devel
>
___
devel mailing list
devel@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/devel

Re: [OMPI devel] Open MPI 3.0.0 first release candidate posted

2017-06-29 Thread r...@open-mpi.org
I tracked down a possible source of the oob/tcp error - this should address it, 
I think: https://github.com/open-mpi/ompi/pull/3794 
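
For what it's worth, glibc's "corrupted double-linked list" abort means the allocator found its internal free-list pointers clobbered, which typically points at a double free or an out-of-bounds write during teardown rather than at the frame on top of the backtrace. The sketch below is only a generic illustration of that failure class, assuming a hypothetical cleanup routine reachable from two shutdown paths; it is not the code touched by the pull request above.

    /* Generic illustration only -- not the code changed in the PR.  A cleanup
     * routine that frees table entries is reachable from two shutdown paths;
     * the second pass frees chunks glibc has already put back on its free
     * lists, and the allocator aborts with a message such as "double free or
     * corruption" or "corrupted double-linked list". */
    #include <stdlib.h>

    #define NENTRIES 4

    struct entry { void *payload; };

    static struct entry *table[NENTRIES];

    static void table_cleanup(void)
    {
        for (int i = 0; i < NENTRIES; i++) {
            free(table[i]);           /* second call: double free */
            /* Fix: table[i] = NULL; would make a repeated cleanup harmless. */
        }
    }

    int main(void)
    {
        for (int i = 0; i < NENTRIES; i++) {
            table[i] = malloc(sizeof(struct entry));
        }

        table_cleanup();   /* e.g. component finalize */
        table_cleanup();   /* e.g. framework close tearing down the same state */
        return 0;
    }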


> On Jun 29, 2017, at 3:14 PM, Howard Pritchard  wrote:
> 
> Hi Brian,
> 
> I tested this rc using both srun native launch and mpirun on the following 
> systems:
> - LANL CTS-1 systems (haswell + Intel OPA/PSM2)
> - LANL network testbed system (haswell  + connectX5/UCX and OB1)
> - LANL Cray XC
> 
> I am finding some problems with mpirun on the network testbed system.  
> 
> For example, for spawn_with_env_vars from IBM tests:
> 
> *** Error in `mpirun': corrupted double-linked list: 0x006e75b0 ***
> 
> === Backtrace: =
> 
> /usr/lib64/libc.so.6(+0x7bea2)[0x76597ea2]
> 
> /usr/lib64/libc.so.6(+0x7cec6)[0x76598ec6]
> 
> /home/hpp/openmpi_3.0.0rc1_install/lib/libopen-pal.so.40(opal_proc_table_remove_all+0x91)[0x77855851]
> 
> /home/hpp/openmpi_3.0.0rc1_install/lib/openmpi/mca_oob_ud.so(+0x5e09)[0x73cc0e09]
> 
> /home/hpp/openmpi_3.0.0rc1_install/lib/openmpi/mca_oob_ud.so(+0x5952)[0x73cc0952]
> 
> /home/hpp/openmpi_3.0.0rc1_install/lib/libopen-rte.so.40(+0x6b032)[0x77b94032]
> 
> /home/hpp/openmpi_3.0.0rc1_install/lib/libopen-pal.so.40(mca_base_framework_close+0x7d)[0x7788592d]
> 
> /home/hpp/openmpi_3.0.0rc1_install/lib/openmpi/mca_ess_hnp.so(+0x3e4d)[0x75b04e4d]
> 
> /home/hpp/openmpi_3.0.0rc1_install/lib/libopen-rte.so.40(orte_finalize+0x79)[0x77b43bf9]
> 
> mpirun[0x4014f1]
> 
> mpirun[0x401018]
> 
> /usr/lib64/libc.so.6(__libc_start_main+0xf5)[0x7653db15]
> 
> mpirun[0x400f29]
> 
> 
> and another like
> 
> [hpp@hi-master dynamic (master *)]$mpirun -np 1 ./spawn_with_env_vars
> 
> Spawning...
> 
> Spawned
> 
> Child got foo and baz env variables -- yay!
> 
> *** Error in `mpirun': corrupted double-linked list: 0x006eb350 ***
> 
> === Backtrace: =
> 
> /usr/lib64/libc.so.6(+0x7b184)[0x76597184]
> 
> /usr/lib64/libc.so.6(+0x7d1ec)[0x765991ec]
> 
> /home/hpp/openmpi_3.0.0rc1_install/lib/openmpi/mca_oob_tcp.so(+0x57a2)[0x732297a2]
> 
> /home/hpp/openmpi_3.0.0rc1_install/lib/openmpi/mca_oob_tcp.so(+0x5a87)[0x73229a87]
> 
> /home/hpp/openmpi_3.0.0rc1_install/lib/libopen-rte.so.40(+0x6b032)[0x77b94032]
> 
> /home/hpp/openmpi_3.0.0rc1_install/lib/libopen-pal.so.40(mca_base_framework_close+0x7d)[0x7788592d]
> 
> /home/hpp/openmpi_3.0.0rc1_install/lib/openmpi/mca_ess_hnp.so(+0x3e4d)[0x75b04e4d]
> 
> /home/hpp/openmpi_3.0.0rc1_install/lib/libopen-rte.so.40(orte_finalize+0x79)[0x77b43bf9]
> 
> mpirun[0x4014f1]
> 
> mpirun[0x401018]
> 
> /usr/lib64/libc.so.6(__libc_start_main+0xf5)[0x7653db15]
> 
> mpirun[0x400f29]
> 
> It doesn't happen on every run though.
> 
> I'll do some more investigating, but probably not till next week.
> 
> Howard
> 
> 
> 2017-06-28 11:50 GMT-06:00 Barrett, Brian via devel <devel@lists.open-mpi.org>:
> The first release candidate of Open MPI 3.0.0 is now available
> (https://www.open-mpi.org/software/ompi/v3.0/).  We expect to have at least
> one more release candidate, as there are still outstanding MPI-layer issues
> to be resolved (particularly around one-sided).  We are posting 3.0.0rc1 to
> get feedback on run-time stability, as one of the big features of Open MPI
> 3.0 is the update to the PMIx 2 runtime environment.  We would appreciate any
> and all testing you can do around run-time behaviors.
> 
> Thank you,
> 
> Brian & Howard
> ___
> devel mailing list
> devel@lists.open-mpi.org 
> https://rfd.newmexicoconsortium.org/mailman/listinfo/devel 
> 

___
devel mailing list
devel@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/devel