[OMPI devel] 3.1.0 NEWS updates

2017-12-11 Thread Barrett, Brian via devel
All -

We’re preparing to start the 3.1.0 release process.  There have been a number 
of updates since the v3.0.x branch was created, and we haven’t necessarily been 
great at updating the NEWS file.  I took a stab at the update; can everyone 
have a look and see what I missed?

  https://github.com/open-mpi/ompi/pull/4609

Thanks,

Brian
___
devel mailing list
devel@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/devel

[OMPI devel] 3.0.1 NEWS updates

2017-12-11 Thread Barrett, Brian via devel
All -

We’re preparing to start the 3.0.1 release process.  There have been a number 
of updates since 3.0.0, and we haven’t necessarily been great at updating the 
NEWS file.  I took a stab at the update; can everyone have a look and see what 
I missed?

  https://github.com/open-mpi/ompi/pull/4607/files

Thanks,

Brian
___
devel mailing list
devel@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/devel

Re: [OMPI devel] btl/vader: osu_bibw hangs when the number of execution loops is increased

2017-12-11 Thread DERBEY, NADIA
Hi Nathan,

Actually, PR #3846 was enough to fix my hang: that fix went into 2.0.4 and we 
are still based on Open MPI 2.0.2...

FYI, I also tested intra-node osu/pt2pt and osu/collective with PR #4569 on top 
of PR #3846: I didn't notice any regression.

Regards

On 12/07/2017 08:03 AM, Nadia DERBEY wrote:

Thanks Nathan, will keep you informed.

Regards

On 12/05/2017 11:32 PM, Nathan Hjelm wrote:
Should be fixed by PR #4569 (https://github.com/open-mpi/ompi/pull/4569). 
Please test and let me know.

-Nathan

On Dec 1, 2017, at 7:37 AM, DERBEY, NADIA wrote:

Hi,

Our validation team detected a hang when running the osu_bibw
micro-benchmark from the OMB 5.3 suite on Open MPI 2.0.2 (note that the
same hang appears with Open MPI 3.0).
The hang occurs when running osu_bibw on a single node (vader btl) with
the options "-x 100 -i 1000".
The -x option changes the warm-up loop size.
The -i option changes the measured loop size.

For each exchanged message size, osu_bibw loops over the following
sequence on both ranks (a rough MPI sketch of this pattern follows the list):
   . posts 64 non-blocking sends
   . posts 64 non-blocking receives
   . waits for all the send requests to complete
   . waits for all the receive requests to complete
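
A rough, self-contained C sketch of this pattern (my own illustration, not
the OMB sources: it assumes exactly two ranks, and the window size, message
size and loop counts are arbitrary):

  #include <mpi.h>
  #include <stdlib.h>

  #define WINDOW 64

  /* One osu_bibw-style iteration: post a window of non-blocking sends and
   * receives, then wait for each batch to complete. */
  static void bibw_iteration(char *sbuf, char *rbuf, int size, int peer)
  {
      MPI_Request sreq[WINDOW], rreq[WINDOW];

      for (int j = 0; j < WINDOW; j++)
          MPI_Isend(sbuf, size, MPI_CHAR, peer, 100, MPI_COMM_WORLD, &sreq[j]);
      for (int j = 0; j < WINDOW; j++)
          MPI_Irecv(rbuf, size, MPI_CHAR, peer, 100, MPI_COMM_WORLD, &rreq[j]);

      MPI_Waitall(WINDOW, sreq, MPI_STATUSES_IGNORE);  /* all sends done */
      MPI_Waitall(WINDOW, rreq, MPI_STATUSES_IGNORE);  /* all recvs done */
  }

  int main(int argc, char **argv)
  {
      int rank, size = 8192;                      /* one message size only */
      char *sbuf = malloc(size), *rbuf = malloc(size);

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      for (int i = 0; i < 100 + 1000; i++)        /* skip + loop, as with -x/-i */
          bibw_iteration(sbuf, rbuf, size, 1 - rank);  /* peer = the other rank */
      MPI_Finalize();
      free(sbuf); free(rbuf);
      return 0;
  }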

The loop size is the sum of
   . options.skip (the warm-up phase; can be changed with the -x option)
   . options.loop (the actually measured loop; can be changed with the -i option).

The default values are the following:

+==============+======+======+
| message size | skip | loop |
+==============+======+======+
| <= 8K        |   10 |  100 |
| >  8K        |    2 |   20 |
+==============+======+======+

As said above, the test hangs when moving to more aggressive loop
values: 100 for skip and 1000 for loop.

mca_btl_vader_frag_alloc() calls opal_free_list_get() to get a fragment
from the appropriate free list.
If there are no free fragments left, opal_free_list_get() calls
opal_free_list_grow(), which in turn calls mca_btl_vader_frag_init()
(the initialization routine for the vader btl fragments).
This routine checks whether there is enough space left in the mapped memory
segment for the requested fragment size (current offset + fragment size
should be <= segment size), and it makes opal_free_list_grow() fail if the
shared memory segment is exhausted.
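
To make the failure mode concrete, here is a small stand-alone toy model of
that check (plain C, not the vader btl source; segment and fragment sizes
are arbitrary): once offset + fragment size would exceed the segment size,
allocation fails, which is what makes opal_free_list_grow() - and therefore
opal_free_list_get() - fail.

  #include <stddef.h>
  #include <stdio.h>

  static char   segment[4096];          /* stand-in for the mapped segment */
  static size_t segment_offset = 0;

  static void *frag_alloc_model(size_t frag_size)
  {
      if (segment_offset + frag_size > sizeof(segment))
          return NULL;                  /* segment exhausted: "grow" fails */
      void *frag = segment + segment_offset;
      segment_offset += frag_size;
      return frag;
  }

  int main(void)
  {
      size_t n = 0;
      while (frag_alloc_model(256) != NULL)
          n++;
      printf("allocated %zu fragments before exhausting the segment\n", n);
      return 0;
  }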

As soon as we begin exhausting memory, the two ranks become unsynchronized
and the test quickly hangs. To avoid this hang, I found two possible
solutions:

1) change the vader btl segment size: I set it to 4 GB - in order to be
able to do this, I had to change the type parameter in the parameter
registration to MCA_BASE_VAR_TYPE_SIZE_T (a registration sketch follows
option #2 below).

2) replace the call to opal_free_list_get() with a call to
opal_free_list_wait() in mca_btl_vader_frag_alloc() (see the second sketch
below). This also makes the micro-benchmark run to completion.
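
For option #1, the registration change could look roughly like the sketch
below. This is only my sketch: the storage field
mca_btl_vader_component.segment_size, the info level and the scope are
guesses on my part, not copied from the tree; the point is just the type
change to MCA_BASE_VAR_TYPE_SIZE_T.

  /* Sketch only: register the segment size as a size_t-typed MCA variable
   * so that values above INT_MAX (e.g. 4 GB) are accepted. */
  (void) mca_base_component_var_register(&mca_btl_vader_component.super.btl_version,
                                         "segment_size",
                                         "Size of the shared-memory backing segment",
                                         MCA_BASE_VAR_TYPE_SIZE_T, NULL, 0, 0,
                                         OPAL_INFO_LVL_3,
                                         MCA_BASE_VAR_SCOPE_READONLY,
                                         &mca_btl_vader_component.segment_size);

With such a change, a 4 GB segment could then be requested with something
like "mpirun --mca btl_vader_segment_size 4294967296 ...".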
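
For option #2, my understanding of the semantic difference is roughly the
following (an approximation, not a copy of opal/class/opal_free_list.h):
opal_free_list_get() returns NULL as soon as the list cannot be grown,
whereas opal_free_list_wait() keeps retrying and drives progress in between,
so fragments returned by the peer eventually become available again.

  static inline opal_free_list_item_t *free_list_wait_sketch(opal_free_list_t *fl)
  {
      opal_free_list_item_t *item = opal_free_list_get(fl);
      while (NULL == item) {
          opal_progress();                /* let pending sends/recvs drain  */
          item = opal_free_list_get(fl);  /* retry once fragments are freed */
      }
      return item;
  }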

So my question is: which would be the best approach (#1 or #2)? And the
question behind this is: what is the reason for favoring
opal_free_list_get() over opal_free_list_wait()?

Thanks

--
Nadia Derbey - B1-387
HPC R - MPI
Tel: +33 4 76 29 77 62
nadia.der...@atos.net
1 Rue de Provence BP 208
38130 Echirolles Cedex, France
www.atos.com
___
devel mailing list
devel@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/devel

Re: [OMPI devel] btl/vader: osu_bibw hangs when the number of execution loops is increased

2017-12-11 Thread DERBEY, NADIA
Thanks Nathan, will keep you informed.

Regards

--
Nadia Derbey - B1-387
HPC R - MPI
Tel: +33 4 76 29 77 62
nadia.der...@atos.net
1 Rue de Provence BP 208
38130 Echirolles Cedex, France
www.atos.com
___
devel mailing list
devel@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/devel