Re: [OMPI users] MPI_type_free question

2020-12-15 Thread Patrick Bégou via users
Issue #8290 reported.
Thanks all for your help and the workaround provided.

Patrick

On 14/12/2020 at 17:40, Jeff Squyres (jsquyres) wrote:
> Yes, opening an issue would be great -- thanks!
>
>
>> On Dec 14, 2020, at 11:32 AM, Patrick Bégou via users
>> <users@lists.open-mpi.org> wrote:
>>
>> OK, Thanks Gilles.
>> Does it still require that I open an issue for tracking ?
>>
>> Patrick
>>
>> On 14/12/2020 at 14:56, Gilles Gouaillardet via users wrote:
>>> Hi Patrick,
>>>
>>> Glad to hear you are now able to move forward.
>>>
>>> Please keep in mind this is not a fix but a temporary workaround.
>>> At first glance, I did not spot any issue in the current code.
>>> It turned out that the memory leak disappeared when doing things
>>> differently
>>>
>>> Cheers,
>>>
>>> Gilles
>>>
>>> On Mon, Dec 14, 2020 at 7:11 PM Patrick Bégou via users
>>> <users@lists.open-mpi.org> wrote:
>>>
>>> Hi Gilles,
>>>
>>> you catch the bug! With this patch, on a single node, the memory
>>> leak disappear. The cluster is actualy overloaded, as soon as
>>> possible I will launch a multinode test.
>>> Below the memory used by rank 0 before (blue) and after (red)
>>> the patch.
>>>
>>> Thanks
>>>
>>> Patrick
>>>
>>> 
>>>
>>> On 10/12/2020 at 10:15, Gilles Gouaillardet via users wrote:
 Patrick,


 First, thank you very much for sharing the reproducer.


 Yes, please open a github issue so we can track this.


 I cannot fully understand where the leak is coming from, but so
 far

  - the code fails on master built with --enable-debug (the data
 engine reports an error) but not with the v3.1.x branch

   (this suggests there could be an error in the latest Open MPI
 ... or in the code)

  - the attached patch seems to have a positive effect, can you
 please give it a try?


 Cheers,


 Gilles



 On 12/7/2020 6:15 PM, Patrick Bégou via users wrote:
> Hi,
>
> I've written a small piece of code to show the problem. Based
> on my application but 2D and using integers arrays for testing.
> The  figure below shows the max RSS size of rank 0 process on
> 2 iterations on 8 and 16 cores, with openib and tcp drivers.
> The more processes I have, the larger the memory leak.  I use
> the same binaries for the 4 runs and OpenMPI 3.1 (same
> behavior with 4.0.5).
> The code is in attachment. I'll try to check type deallocation
> as soon as possible.
>
> Patrick
>
>
>
>
> On 04/12/2020 at 01:34, Gilles Gouaillardet via users wrote:
>> Patrick,
>>
>>
>> based on George's idea, a simpler check is to retrieve the
>> Fortran index via the (standard) MPI_Type_c2() function
>>
>> after you create a derived datatype.
>>
>>
>> If the index keeps growing forever even after you
>> MPI_Type_free(), then this clearly indicates a leak.
>>
>> Unfortunately, this simple test cannot be used to definitely
>> rule out any memory leak.
>>
>>
>> Note you can also
>>
>> mpirun --mca pml ob1 --mca btl tcp,self ...
>>
>> in order to force communications over TCP/IP and hence rule
>> out any memory leak that could be triggered by your fast
>> interconnect.
>>
>>
>>
>> In any case, a reproducer will greatly help us debugging this
>> issue.
>>
>>
>> Cheers,
>>
>>
>> Gilles
>>
>>
>>
>> On 12/4/2020 7:20 AM, George Bosilca via users wrote:
>>> Patrick,
>>>
>>> I'm afraid there is no simple way to check this. The main
>>> reason being that OMPI use handles for MPI objects, and
>>> these handles are not tracked by the library, they are
>>> supposed to be provided by the user for each call. In
>>> your case, as you already called MPI_Type_free on the
>>> datatype, you cannot produce a valid handle.
>>>
>>> There might be a trick. If the datatype is manipulated with
>>> any Fortran MPI functions, then we convert the handle (which
>>> in fact is a pointer) to an index into a pointer array
>>> structure. Thus, the index will remain used, and can
>>> therefore be used to convert back into a valid datatype
>>> pointer, until OMPI completely releases the datatype. Look
>>> into the ompi_datatype_f_to_c_table table to see the
>>> datatypes that exist and get their pointers, and then use
>>> these pointers as arguments to ompi_datatype_dump() to see
>>> if any of these existing datatypes are the ones you define.
>>>
>>> George.
>>>
>>>
>>>

Re: [OMPI users] MPI_type_free question

2020-12-14 Thread Jeff Squyres (jsquyres) via users
Yes, opening an issue would be great -- thanks!


On Dec 14, 2020, at 11:32 AM, Patrick Bégou via users 
<users@lists.open-mpi.org> wrote:

OK, Thanks Gilles.
Does it still require that I open an issue for tracking ?

Patrick

On 14/12/2020 at 14:56, Gilles Gouaillardet via users wrote:
Hi Patrick,

Glad to hear you are now able to move forward.

Please keep in mind this is not a fix but a temporary workaround.
At first glance, I did not spot any issue in the current code.
It turned out that the memory leak disappeared when doing things differently

Cheers,

Gilles

On Mon, Dec 14, 2020 at 7:11 PM Patrick Bégou via users 
<users@lists.open-mpi.org> wrote:
Hi Gilles,

you caught the bug! With this patch, on a single node, the memory leak 
disappears. The cluster is actually overloaded; as soon as possible I will launch 
a multi-node test.
Below the memory used by rank 0 before (blue) and after (red) the patch.

Thanks

Patrick



On 10/12/2020 at 10:15, Gilles Gouaillardet via users wrote:
Patrick,


First, thank you very much for sharing the reproducer.


Yes, please open a github issue so we can track this.


I cannot fully understand where the leak is coming from, but so far

 - the code fails on master built with --enable-debug (the data engine reports 
an error) but not with the v3.1.x branch

  (this suggests there could be an error in the latest Open MPI ... or in the 
code)

 - the attached patch seems to have a positive effect, can you please give it a 
try?


Cheers,


Gilles



On 12/7/2020 6:15 PM, Patrick Bégou via users wrote:
Hi,

I've written a small piece of code to show the problem. Based on my application 
but 2D and using integers arrays for testing.
The  figure below shows the max RSS size of rank 0 process on 2 iterations 
on 8 and 16 cores, with openib and tcp drivers.
The more processes I have, the larger the memory leak.  I use the same binaries 
for the 4 runs and OpenMPI 3.1 (same behavior with 4.0.5).
The code is in attachment. I'll try to check type deallocation as soon as 
possible.

Patrick




On 04/12/2020 at 01:34, Gilles Gouaillardet via users wrote:
Patrick,


based on George's idea, a simpler check is to retrieve the Fortran index via 
the (standard) MPI_Type_c2() function

after you create a derived datatype.


If the index keeps growing forever even after you MPI_Type_free(), then this 
clearly indicates a leak.

Unfortunately, this simple test cannot be used to definitely rule out any 
memory leak.


Note you can also

mpirun --mca pml ob1 --mca btl tcp,self ...

in order to force communications over TCP/IP and hence rule out any memory leak 
that could be triggered by your fast interconnect.



In any case, a reproducer will greatly help us debugging this issue.


Cheers,


Gilles



On 12/4/2020 7:20 AM, George Bosilca via users wrote:
Patrick,

I'm afraid there is no simple way to check this. The main reason being that 
OMPI use handles for MPI objects, and these handles are not tracked by the 
library, they are supposed to be provided by the user for each call. In your 
case, as you already called MPI_Type_free on the datatype, you cannot produce a 
valid handle.

There might be a trick. If the datatype is manipulated with any Fortran MPI 
functions, then we convert the handle (which in fact is a pointer) to an index 
into a pointer array structure. Thus, the index will remain used, and can 
therefore be used to convert back into a valid datatype pointer, until OMPI 
completely releases the datatype. Look into the ompi_datatype_f_to_c_table 
table to see the datatypes that exist and get their pointers, and then use 
these pointers as arguments to ompi_datatype_dump() to see if any of these 
existing datatypes are the ones you define.

George.




On Thu, Dec 3, 2020 at 4:44 PM Patrick Bégou via users 
<users@lists.open-mpi.org> wrote:

Hi,

I'm trying to solve a memory leak since my new implementation of
communications based on MPI_AllToAllW and MPI_type_Create_SubArray
calls.  Arrays of SubArray types are created/destroyed at each
time step and used for communications.

On my laptop the code runs fine (running for 15000 temporal
iterations on 32 processes with oversubscription) but on our
cluster the memory used by the code increases until the OOMkiller stops
the job. On the cluster we use IB QDR for communications.

Same Gcc/Gfortran 7.3 (built from sources), same sources of
OpenMPI (3.1 or 4.0.5 tested), same sources of the fortran code on
the laptop and on the cluster.

Using Gcc/Gfortran 4.8 and OpenMPI 1.7.3 on the cluster does not
show the problem (resident memory does not increase and we ran
10 temporal iterations).

The MPI_Type_free manual says that it "/Marks the datatype object
associated with datatype for deallocation/". But how can I check
that the deallocation is really done?

Thanks for 

Re: [OMPI users] MPI_type_free question

2020-12-14 Thread Patrick Bégou via users
OK, Thanks Gilles.
Does it still require that I open an issue for tracking ?

Patrick

On 14/12/2020 at 14:56, Gilles Gouaillardet via users wrote:
> Hi Patrick,
>
> Glad to hear you are now able to move forward.
>
> Please keep in mind this is not a fix but a temporary workaround.
> At first glance, I did not spot any issue in the current code.
> It turned out that the memory leak disappeared when doing things
> differently
>
> Cheers,
>
> Gilles
>
> On Mon, Dec 14, 2020 at 7:11 PM Patrick Bégou via users
> <users@lists.open-mpi.org> wrote:
>
> Hi Gilles,
>
> you catch the bug! With this patch, on a single node, the memory
> leak disappear. The cluster is actualy overloaded, as soon as
> possible I will launch a multinode test.
> Below the memory used by rank 0 before (blue) and after (red) the
> patch.
>
> Thanks
>
> Patrick
>
>
> On 10/12/2020 at 10:15, Gilles Gouaillardet via users wrote:
>> Patrick,
>>
>>
>> First, thank you very much for sharing the reproducer.
>>
>>
>> Yes, please open a github issue so we can track this.
>>
>>
>> I cannot fully understand where the leak is coming from, but so far
>>
>>  - the code fails on master built with --enable-debug (the data
>> engine reports an error) but not with the v3.1.x branch
>>
>>   (this suggests there could be an error in the latest Open MPI
>> ... or in the code)
>>
>>  - the attached patch seems to have a positive effect, can you
>> please give it a try?
>>
>>
>> Cheers,
>>
>>
>> Gilles
>>
>>
>>
>> On 12/7/2020 6:15 PM, Patrick Bégou via users wrote:
>>> Hi,
>>>
>>> I've written a small piece of code to show the problem. Based on
>>> my application but 2D and using integers arrays for testing.
>>> The  figure below shows the max RSS size of rank 0 process on
>>> 2 iterations on 8 and 16 cores, with openib and tcp drivers.
>>> The more processes I have, the larger the memory leak.  I use
>>> the same binaries for the 4 runs and OpenMPI 3.1 (same behavior
>>> with 4.0.5).
>>> The code is in attachment. I'll try to check type deallocation
>>> as soon as possible.
>>>
>>> Patrick
>>>
>>>
>>>
>>>
>>> On 04/12/2020 at 01:34, Gilles Gouaillardet via users wrote:
 Patrick,


 based on George's idea, a simpler check is to retrieve the
 Fortran index via the (standard) MPI_Type_c2() function

 after you create a derived datatype.


 If the index keeps growing forever even after you
 MPI_Type_free(), then this clearly indicates a leak.

 Unfortunately, this simple test cannot be used to definitely
 rule out any memory leak.


 Note you can also

 mpirun --mca pml ob1 --mca btl tcp,self ...

 in order to force communications over TCP/IP and hence rule out
 any memory leak that could be triggered by your fast interconnect.



 In any case, a reproducer will greatly help us debugging this
 issue.


 Cheers,


 Gilles



 On 12/4/2020 7:20 AM, George Bosilca via users wrote:
> Patrick,
>
> I'm afraid there is no simple way to check this. The main
> reason being that OMPI use handles for MPI objects, and these
> handles are not tracked by the library, they are supposed to
> be provided by the user for each call. In your case, as you
> already called MPI_Type_free on the datatype, you cannot
> produce a valid handle.
>
> There might be a trick. If the datatype is manipulated with
> any Fortran MPI functions, then we convert the handle (which
> in fact is a pointer) to an index into a pointer array
> structure. Thus, the index will remain used, and can therefore
> be used to convert back into a valid datatype pointer, until
> OMPI completely releases the datatype. Look into
> the ompi_datatype_f_to_c_table table to see the datatypes that
> exist and get their pointers, and then use these pointers as
> arguments to ompi_datatype_dump() to see if any of these
> existing datatypes are the ones you define.
>
> George.
>
>
>
>
> On Thu, Dec 3, 2020 at 4:44 PM Patrick Bégou via users
> <users@lists.open-mpi.org> wrote:
>
>     Hi,
>
>     I'm trying to solve a memory leak since my new
> implementation of
>     communications based on MPI_AllToAllW and
> MPI_type_Create_SubArray
>     calls.  Arrays of SubArray types are created/destroyed at
> each
>     time step and used for communications.
>
>     On my laptop the code runs fine (running for 

Re: [OMPI users] MPI_type_free question

2020-12-14 Thread Gilles Gouaillardet via users
Hi Patrick,

Glad to hear you are now able to move forward.

Please keep in mind this is not a fix but a temporary workaround.
At first glance, I did not spot any issue in the current code.
It turned out that the memory leak disappeared when doing things differently

Cheers,

Gilles

On Mon, Dec 14, 2020 at 7:11 PM Patrick Bégou via users <
users@lists.open-mpi.org> wrote:

> Hi Gilles,
>
> you catch the bug! With this patch, on a single node, the memory leak
> disappear. The cluster is actualy overloaded, as soon as possible I will
> launch a multinode test.
> Below the memory used by rank 0 before (blue) and after (red) the patch.
>
> Thanks
>
> Patrick
>
>
> On 10/12/2020 at 10:15, Gilles Gouaillardet via users wrote:
>
> Patrick,
>
>
> First, thank you very much for sharing the reproducer.
>
>
> Yes, please open a github issue so we can track this.
>
>
> I cannot fully understand where the leak is coming from, but so far
>
>  - the code fails on master built with --enable-debug (the data engine
> reports an error) but not with the v3.1.x branch
>
>   (this suggests there could be an error in the latest Open MPI ... or in
> the code)
>
>  - the attached patch seems to have a positive effect, can you please give
> it a try?
>
>
> Cheers,
>
>
> Gilles
>
>
>
> On 12/7/2020 6:15 PM, Patrick Bégou via users wrote:
>
> Hi,
>
> I've written a small piece of code to show the problem. Based on my
> application but 2D and using integers arrays for testing.
> The  figure below shows the max RSS size of rank 0 process on 2
> iterations on 8 and 16 cores, with openib and tcp drivers.
> The more processes I have, the larger the memory leak.  I use the same
> binaries for the 4 runs and OpenMPI 3.1 (same behavior with 4.0.5).
> The code is in attachment. I'll try to check type deallocation as soon as
> possible.
>
> Patrick
>
>
>
>
> On 04/12/2020 at 01:34, Gilles Gouaillardet via users wrote:
>
> Patrick,
>
>
> based on George's idea, a simpler check is to retrieve the Fortran index
> via the (standard) MPI_Type_c2() function
>
> after you create a derived datatype.
>
>
> If the index keeps growing forever even after you MPI_Type_free(), then
> this clearly indicates a leak.
>
> Unfortunately, this simple test cannot be used to definitely rule out any
> memory leak.
>
>
> Note you can also
>
> mpirun --mca pml ob1 --mca btl tcp,self ...
>
> in order to force communications over TCP/IP and hence rule out any memory
> leak that could be triggered by your fast interconnect.
>
>
>
> In any case, a reproducer will greatly help us debugging this issue.
>
>
> Cheers,
>
>
> Gilles
>
>
>
> On 12/4/2020 7:20 AM, George Bosilca via users wrote:
>
> Patrick,
>
> I'm afraid there is no simple way to check this. The main reason being
> that OMPI use handles for MPI objects, and these handles are not tracked by
> the library, they are supposed to be provided by the user for each call. In
> your case, as you already called MPI_Type_free on the datatype, you cannot
> produce a valid handle.
>
> There might be a trick. If the datatype is manipulated with any Fortran
> MPI functions, then we convert the handle (which in fact is a pointer) to
> an index into a pointer array structure. Thus, the index will remain used,
> and can therefore be used to convert back into a valid datatype pointer,
> until OMPI completely releases the datatype. Look into
> the ompi_datatype_f_to_c_table table to see the datatypes that exist and
> get their pointers, and then use these pointers as arguments to
> ompi_datatype_dump() to see if any of these existing datatypes are the ones
> you define.
>
> George.
>
>
>
>
> On Thu, Dec 3, 2020 at 4:44 PM Patrick Bégou via users <
> users@lists.open-mpi.org> wrote:
>
> Hi,
>
> I'm trying to solve a memory leak since my new implementation of
> communications based on MPI_AllToAllW and MPI_type_Create_SubArray
> calls.  Arrays of SubArray types are created/destroyed at each
> time step and used for communications.
>
> On my laptop the code runs fine (running for 15000 temporal
> itérations on 32 processes with oversubscription) but on our
> cluster memory used by the code increase until the OOMkiller stop
> the job. On the cluster we use IB QDR for communications.
>
> Same Gcc/Gfortran 7.3 (built from sources), same sources of
> OpenMPI (3.1 or 4.0.5 tested), same sources of the fortran code on
> the laptop and on the cluster.
>
> Using Gcc/Gfortran 4.8 and OpenMPI 1.7.3 on the cluster do not
> show the problem (resident memory do not increase and we ran
> 10 temporal iterations)
>
> MPI_type_free manual says that it "/Marks the datatype object
> associated with datatype for deallocation/". But  how can I check
> that the deallocation is really done ?
>
> Thanks for ant suggestions.
>
> Patrick
>
>
>
>


Re: [OMPI users] MPI_type_free question

2020-12-14 Thread Patrick Bégou via users
Hi Gilles,

you caught the bug! With this patch, on a single node, the memory leak
disappears. The cluster is actually overloaded; as soon as possible I will
launch a multi-node test.
Below is the memory used by rank 0 before (blue) and after (red) the patch.

Thanks

Patrick


On 10/12/2020 at 10:15, Gilles Gouaillardet via users wrote:
> Patrick,
>
>
> First, thank you very much for sharing the reproducer.
>
>
> Yes, please open a github issue so we can track this.
>
>
> I cannot fully understand where the leak is coming from, but so far
>
>  - the code fails on master built with --enable-debug (the data engine
> reports an error) but not with the v3.1.x branch
>
>   (this suggests there could be an error in the latest Open MPI ... or
> in the code)
>
>  - the attached patch seems to have a positive effect, can you please
> give it a try?
>
>
> Cheers,
>
>
> Gilles
>
>
>
> On 12/7/2020 6:15 PM, Patrick Bégou via users wrote:
>> Hi,
>>
>> I've written a small piece of code to show the problem. Based on my
>> application but 2D and using integers arrays for testing.
>> The  figure below shows the max RSS size of rank 0 process on 2
>> iterations on 8 and 16 cores, with openib and tcp drivers.
>> The more processes I have, the larger the memory leak.  I use the
>> same binaries for the 4 runs and OpenMPI 3.1 (same behavior with 4.0.5).
>> The code is in attachment. I'll try to check type deallocation as
>> soon as possible.
>>
>> Patrick
>>
>>
>>
>>
>> On 04/12/2020 at 01:34, Gilles Gouaillardet via users wrote:
>>> Patrick,
>>>
>>>
>>> based on George's idea, a simpler check is to retrieve the Fortran
>>> index via the (standard) MPI_Type_c2() function
>>>
>>> after you create a derived datatype.
>>>
>>>
>>> If the index keeps growing forever even after you MPI_Type_free(),
>>> then this clearly indicates a leak.
>>>
>>> Unfortunately, this simple test cannot be used to definitely rule
>>> out any memory leak.
>>>
>>>
>>> Note you can also
>>>
>>> mpirun --mca pml ob1 --mca btl tcp,self ...
>>>
>>> in order to force communications over TCP/IP and hence rule out any
>>> memory leak that could be triggered by your fast interconnect.
>>>
>>>
>>>
>>> In any case, a reproducer will greatly help us debugging this issue.
>>>
>>>
>>> Cheers,
>>>
>>>
>>> Gilles
>>>
>>>
>>>
>>> On 12/4/2020 7:20 AM, George Bosilca via users wrote:
 Patrick,

 I'm afraid there is no simple way to check this. The main reason
 being that OMPI use handles for MPI objects, and these handles are
 not tracked by the library, they are supposed to be provided by the
 user for each call. In your case, as you already called
 MPI_Type_free on the datatype, you cannot produce a valid handle.

 There might be a trick. If the datatype is manipulated with any
 Fortran MPI functions, then we convert the handle (which in fact is
 a pointer) to an index into a pointer array structure. Thus, the
 index will remain used, and can therefore be used to convert back
 into a valid datatype pointer, until OMPI completely releases the
 datatype. Look into the ompi_datatype_f_to_c_table table to see the
 datatypes that exist and get their pointers, and then use these
 pointers as arguments to ompi_datatype_dump() to see if any of
 these existing datatypes are the ones you define.

 George.




 On Thu, Dec 3, 2020 at 4:44 PM Patrick Bégou via users
 <users@lists.open-mpi.org> wrote:

     Hi,

     I'm trying to solve a memory leak since my new implementation of
     communications based on MPI_AllToAllW and MPI_type_Create_SubArray
     calls.  Arrays of SubArray types are created/destroyed at each
     time step and used for communications.

     On my laptop the code runs fine (running for 15000 temporal
     itérations on 32 processes with oversubscription) but on our
     cluster memory used by the code increase until the OOMkiller stop
     the job. On the cluster we use IB QDR for communications.

     Same Gcc/Gfortran 7.3 (built from sources), same sources of
     OpenMPI (3.1 or 4.0.5 tested), same sources of the fortran code on
     the laptop and on the cluster.

     Using Gcc/Gfortran 4.8 and OpenMPI 1.7.3 on the cluster do not
     show the problem (resident memory do not increase and we ran
     10 temporal iterations)

     MPI_type_free manual says that it "/Marks the datatype object
     associated with datatype for deallocation/". But  how can I check
     that the deallocation is really done ?

     Thanks for ant suggestions.

     Patrick

>>



Re: [OMPI users] MPI_type_free question

2020-12-10 Thread Gilles Gouaillardet via users

Patrick,


First, thank you very much for sharing the reproducer.


Yes, please open a github issue so we can track this.


I cannot fully understand where the leak is coming from, but so far

 - the code fails on master built with --enable-debug (the data engine 
reports an error) but not with the v3.1.x branch


  (this suggests there could be an error in the latest Open MPI ... or 
in the code)


 - the attached patch seems to have a positive effect, can you please 
give it a try?



Cheers,


Gilles



On 12/7/2020 6:15 PM, Patrick Bégou via users wrote:

Hi,

I've written a small piece of code to show the problem. Based on my 
application but 2D and using integers arrays for testing.
The  figure below shows the max RSS size of rank 0 process on 2 
iterations on 8 and 16 cores, with openib and tcp drivers.
The more processes I have, the larger the memory leak.  I use the same 
binaries for the 4 runs and OpenMPI 3.1 (same behavior with 4.0.5).
The code is in attachment. I'll try to check type deallocation as soon 
as possible.


Patrick




On 04/12/2020 at 01:34, Gilles Gouaillardet via users wrote:

Patrick,


based on George's idea, a simpler check is to retrieve the Fortran 
index via the (standard) MPI_Type_c2() function


after you create a derived datatype.


If the index keeps growing forever even after you MPI_Type_free(), 
then this clearly indicates a leak.


Unfortunately, this simple test cannot be used to definitely rule out 
any memory leak.



Note you can also

mpirun --mca pml ob1 --mca btl tcp,self ...

in order to force communications over TCP/IP and hence rule out any 
memory leak that could be triggered by your fast interconnect.




In any case, a reproducer will greatly help us debugging this issue.


Cheers,


Gilles



On 12/4/2020 7:20 AM, George Bosilca via users wrote:

Patrick,

I'm afraid there is no simple way to check this. The main reason 
being that OMPI use handles for MPI objects, and these handles are 
not tracked by the library, they are supposed to be provided by the 
user for each call. In your case, as you already called 
MPI_Type_free on the datatype, you cannot produce a valid handle.


There might be a trick. If the datatype is manipulated with any 
Fortran MPI functions, then we convert the handle (which in fact is 
a pointer) to an index into a pointer array structure. Thus, the 
index will remain used, and can therefore be used to convert back 
into a valid datatype pointer, until OMPI completely releases the 
datatype. Look into the ompi_datatype_f_to_c_table table to see the 
datatypes that exist and get their pointers, and then use these 
pointers as arguments to ompi_datatype_dump() to see if any of these 
existing datatypes are the ones you define.


George.




On Thu, Dec 3, 2020 at 4:44 PM Patrick Bégou via users 
<users@lists.open-mpi.org> wrote:


    Hi,

    I'm trying to solve a memory leak since my new implementation of
    communications based on MPI_AllToAllW and MPI_type_Create_SubArray
    calls.  Arrays of SubArray types are created/destroyed at each
    time step and used for communications.

    On my laptop the code runs fine (running for 15000 temporal
    itérations on 32 processes with oversubscription) but on our
    cluster memory used by the code increase until the OOMkiller stop
    the job. On the cluster we use IB QDR for communications.

    Same Gcc/Gfortran 7.3 (built from sources), same sources of
    OpenMPI (3.1 or 4.0.5 tested), same sources of the fortran code on
    the laptop and on the cluster.

    Using Gcc/Gfortran 4.8 and OpenMPI 1.7.3 on the cluster do not
    show the problem (resident memory do not increase and we ran
    10 temporal iterations)

    MPI_type_free manual says that it "/Marks the datatype object
    associated with datatype for deallocation/". But  how can I check
    that the deallocation is really done ?

    Thanks for ant suggestions.

    Patrick



diff --git a/ompi/mca/coll/basic/coll_basic_alltoallw.c b/ompi/mca/coll/basic/coll_basic_alltoallw.c
index 93fa880..5aca2c2 100644
--- a/ompi/mca/coll/basic/coll_basic_alltoallw.c
+++ b/ompi/mca/coll/basic/coll_basic_alltoallw.c
@@ -194,7 +194,7 @@ mca_coll_basic_alltoallw_intra(const void *sbuf, const int *scounts, const int *
 continue;
 
 prcv = ((char *) rbuf) + rdisps[i];
-err = MCA_PML_CALL(irecv_init(prcv, rcounts[i], rdtypes[i],
+err = MCA_PML_CALL(irecv(prcv, rcounts[i], rdtypes[i],
   i, MCA_COLL_BASE_TAG_ALLTOALLW, comm,
   preq++));
 ++nreqs;
@@ -215,21 +215,15 @@ mca_coll_basic_alltoallw_intra(const void *sbuf, const int *scounts, const int *
 continue;
 
 psnd = ((char *) sbuf) + sdisps[i];
-err = MCA_PML_CALL(isend_init(psnd, scounts[i], sdtypes[i],
+err = MCA_PML_CALL(send(psnd, scounts[i], sdtypes[i],
   i, 

Re: [OMPI users] MPI_type_free question

2020-12-10 Thread Patrick Bégou via users
Hi OpenMPI developers,

it looks difficult for me to track this memory problem in the OpenMPI 3.x
and 4.x implementations. Should I open an issue about this?
Or is openib definitely an old strategy that will not evolve (and whose bugs
go untracked)?

Thanks

Patrick



On 07/12/2020 at 10:15, Patrick Bégou via users wrote:
> Hi,
>
> I've written a small piece of code to show the problem. Based on my
> application but 2D and using integers arrays for testing.
> The  figure below shows the max RSS size of rank 0 process on 2
> iterations on 8 and 16 cores, with openib and tcp drivers.
> The more processes I have, the larger the memory leak.  I use the same
> binaries for the 4 runs and OpenMPI 3.1 (same behavior with 4.0.5).
> The code is in attachment. I'll try to check type deallocation as soon
> as possible.
>
> Patrick
>
>
>
>
> On 04/12/2020 at 01:34, Gilles Gouaillardet via users wrote:
>> Patrick,
>>
>>
>> based on George's idea, a simpler check is to retrieve the Fortran
>> index via the (standard) MPI_Type_c2() function
>>
>> after you create a derived datatype.
>>
>>
>> If the index keeps growing forever even after you MPI_Type_free(),
>> then this clearly indicates a leak.
>>
>> Unfortunately, this simple test cannot be used to definitely rule out
>> any memory leak.
>>
>>
>> Note you can also
>>
>> mpirun --mca pml ob1 --mca btl tcp,self ...
>>
>> in order to force communications over TCP/IP and hence rule out any
>> memory leak that could be triggered by your fast interconnect.
>>
>>
>>
>> In any case, a reproducer will greatly help us debugging this issue.
>>
>>
>> Cheers,
>>
>>
>> Gilles
>>
>>
>>
>> On 12/4/2020 7:20 AM, George Bosilca via users wrote:
>>> Patrick,
>>>
>>> I'm afraid there is no simple way to check this. The main reason
>>> being that OMPI use handles for MPI objects, and these handles are
>>> not tracked by the library, they are supposed to be provided by the
>>> user for each call. In your case, as you already called
>>> MPI_Type_free on the datatype, you cannot produce a valid handle.
>>>
>>> There might be a trick. If the datatype is manipulated with any
>>> Fortran MPI functions, then we convert the handle (which in fact is
>>> a pointer) to an index into a pointer array structure. Thus, the
>>> index will remain used, and can therefore be used to convert back
>>> into a valid datatype pointer, until OMPI completely releases the
>>> datatype. Look into the ompi_datatype_f_to_c_table table to see the
>>> datatypes that exist and get their pointers, and then use these
>>> pointers as arguments to ompi_datatype_dump() to see if any of these
>>> existing datatypes are the ones you define.
>>>
>>> George.
>>>
>>>
>>>
>>>
>>> On Thu, Dec 3, 2020 at 4:44 PM Patrick Bégou via users
>>> <users@lists.open-mpi.org> wrote:
>>>
>>>     Hi,
>>>
>>>     I'm trying to solve a memory leak since my new implementation of
>>>     communications based on MPI_AllToAllW and MPI_type_Create_SubArray
>>>     calls.  Arrays of SubArray types are created/destroyed at each
>>>     time step and used for communications.
>>>
>>>     On my laptop the code runs fine (running for 15000 temporal
>>>     itérations on 32 processes with oversubscription) but on our
>>>     cluster memory used by the code increase until the OOMkiller stop
>>>     the job. On the cluster we use IB QDR for communications.
>>>
>>>     Same Gcc/Gfortran 7.3 (built from sources), same sources of
>>>     OpenMPI (3.1 or 4.0.5 tested), same sources of the fortran code on
>>>     the laptop and on the cluster.
>>>
>>>     Using Gcc/Gfortran 4.8 and OpenMPI 1.7.3 on the cluster do not
>>>     show the problem (resident memory do not increase and we ran
>>>     10 temporal iterations)
>>>
>>>     MPI_type_free manual says that it "/Marks the datatype object
>>>     associated with datatype for deallocation/". But  how can I check
>>>     that the deallocation is really done ?
>>>
>>>     Thanks for ant suggestions.
>>>
>>>     Patrick
>>>
>



Re: [OMPI users] MPI_type_free question

2020-12-07 Thread Patrick Bégou via users
Hi George,

I've implemented a call to MPI_Type_f2c using Fortran C_BINDING and it
works. Data types are always reported as deallocated (I've checked the
reverse by commenting out the calls to MPI_Type_free(...) to be sure that it
reports "Not deallocated" in my code in this case).

Then I ran the code with the tcp and openib drivers, but keeping the
deallocation commented out, to see how the memory consumption evolves:

The global slopes of the curves are quite similar with tcp and openib over
1000 iterations even if they look different. So it really looks like a
subarray type deallocation problem, but deeper in the code I think.

Patrick




On 04/12/2020 at 19:20, George Bosilca wrote:
> On Fri, Dec 4, 2020 at 2:33 AM Patrick Bégou via users
> <users@lists.open-mpi.org> wrote:
>
> Hi George and Gilles,
>
> Thanks George for your suggestion. Is it valuable for 4.05 and 3.1
> OpenMPI Versions ? I will have a look today at these tables. May
> be writing a small piece of code juste creating and freeing
> subarray datatype.
>
>
> Patrick,
>
> If you use Gilles' suggestion to go through the type_f2c function when
> listing the datatypes should give you a portable datatype iterator
> across all versions of OMPI. The call to dump a datatype content,
> ompi_datatype_dump, has been there for a very long time, so the
> combination of the two should work everywhere.
>
> Thinking a little more about this, you don't necessarily have to dump
> the content of the datatype, you only need to check if they are
> different from MPI_DATATYPE_NULL. Thus, you can have a solution using
> only the MPI API.
>
>   George.
>  
>
>
> Thanks Gilles for suggesting disabling the interconnect. it is a
> good fast test and yes, *with "mpirun --mca pml ob1 --mca btl
> tcp,self" I have no memory leak*. So this explain the differences
> between my laptop and the cluster.
> The implementation of type management is so different from 1.7.3  ?
>
> A PhD student tells me he has also some trouble with this code on
> a cluster Omnipath based. I will have to investigate too but not
> sure it is the same problem.
>
> Patrick
>
> On 04/12/2020 at 01:34, Gilles Gouaillardet via users wrote:
>> Patrick,
>>
>>
>> based on George's idea, a simpler check is to retrieve the
>> Fortran index via the (standard) MPI_Type_c2() function
>>
>> after you create a derived datatype.
>>
>>
>> If the index keeps growing forever even after you
>> MPI_Type_free(), then this clearly indicates a leak.
>>
>> Unfortunately, this simple test cannot be used to definitely rule
>> out any memory leak.
>>
>>
>> Note you can also
>>
>> mpirun --mca pml ob1 --mca btl tcp,self ...
>>
>> in order to force communications over TCP/IP and hence rule out
>> any memory leak that could be triggered by your fast interconnect.
>>
>>
>>
>> In any case, a reproducer will greatly help us debugging this issue.
>>
>>
>> Cheers,
>>
>>
>> Gilles
>>
>>
>>
>> On 12/4/2020 7:20 AM, George Bosilca via users wrote:
>>> Patrick,
>>>
>>> I'm afraid there is no simple way to check this. The main reason
>>> being that OMPI use handles for MPI objects, and these handles
>>> are not tracked by the library, they are supposed to be provided
>>> by the user for each call. In your case, as you already called
>>> MPI_Type_free on the datatype, you cannot produce a valid handle.
>>>
>>> There might be a trick. If the datatype is manipulated with any
>>> Fortran MPI functions, then we convert the handle (which in fact
>>> is a pointer) to an index into a pointer array structure. Thus,
>>> the index will remain used, and can therefore be used to convert
>>> back into a valid datatype pointer, until OMPI completely
>>> releases the datatype. Look into the ompi_datatype_f_to_c_table
>>> table to see the datatypes that exist and get their pointers,
>>> and then use these pointers as arguments to ompi_datatype_dump()
>>> to see if any of these existing datatypes are the ones you define.
>>>
>>> George.
>>>
>>>
>>>
>>>
>>> On Thu, Dec 3, 2020 at 4:44 PM Patrick Bégou via users
>>> <users@lists.open-mpi.org> wrote:
>>>
>>>     Hi,
>>>
>>>     I'm trying to solve a memory leak since my new
>>> implementation of
>>>     communications based on MPI_AllToAllW and
>>> MPI_type_Create_SubArray
>>>     calls.  Arrays of SubArray types are created/destroyed at each
>>>     time step and used for communications.
>>>
>>>     On my laptop the code runs fine (running for 15000 temporal
>>>     itérations on 32 processes with oversubscription) but on our
>>>     cluster memory used by the code increase until the OOMkiller
>>> stop
>>>     the job. On the cluster we use IB QDR for communications.
>>>
>>>     

Re: [OMPI users] MPI_type_free question

2020-12-07 Thread Patrick Bégou via users
Hi,

I've written a small piece of code to show the problem. It is based on my
application, but 2D and using integer arrays for testing.
The figure below shows the max RSS size of the rank 0 process over 2
iterations on 8 and 16 cores, with the openib and tcp drivers.
The more processes I have, the larger the memory leak. I use the same
binaries for the 4 runs and OpenMPI 3.1 (same behavior with 4.0.5).
The code is attached. I'll try to check type deallocation as soon
as possible.

Patrick
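
For illustration only, here is a schematic C sketch of the pattern being
discussed (subarray types created, used in MPI_Alltoallw and freed at every
time step). The actual reproducer is the attached Fortran code; all sizes,
names and the block decomposition below are invented for the sketch.

#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int nprocs, rank;
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int n = 64 * nprocs;                     /* global 2D array is n x n */
    int sizes[2] = {n, n};
    int *sendbuf = calloc((size_t)n * n, sizeof(int));
    int *recvbuf = calloc((size_t)n * n, sizeof(int));
    int *counts  = malloc(nprocs * sizeof(int));
    int *displs  = malloc(nprocs * sizeof(int));
    MPI_Datatype *stypes = malloc(nprocs * sizeof(MPI_Datatype));
    MPI_Datatype *rtypes = malloc(nprocs * sizeof(MPI_Datatype));

    for (int step = 0; step < 1000; step++) {
        /* Subarray types are rebuilt at every time step, as in the application. */
        for (int p = 0; p < nprocs; p++) {
            int subsizes[2] = {n / nprocs, n};
            int starts[2]   = {p * (n / nprocs), 0};
            MPI_Type_create_subarray(2, sizes, subsizes, starts,
                                     MPI_ORDER_C, MPI_INT, &stypes[p]);
            MPI_Type_create_subarray(2, sizes, subsizes, starts,
                                     MPI_ORDER_C, MPI_INT, &rtypes[p]);
            MPI_Type_commit(&stypes[p]);
            MPI_Type_commit(&rtypes[p]);
            counts[p] = 1;
            displs[p] = 0;    /* offsets are encoded in the subarray types */
        }

        MPI_Alltoallw(sendbuf, counts, displs, stypes,
                      recvbuf, counts, displs, rtypes, MPI_COMM_WORLD);

        /* ... and freed again every step; this is the pattern for which the
         * RSS keeps growing in the reported runs. */
        for (int p = 0; p < nprocs; p++) {
            MPI_Type_free(&stypes[p]);
            MPI_Type_free(&rtypes[p]);
        }
    }

    free(sendbuf); free(recvbuf); free(counts); free(displs);
    free(stypes); free(rtypes);
    MPI_Finalize();
    return 0;
}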




On 04/12/2020 at 01:34, Gilles Gouaillardet via users wrote:
> Patrick,
>
>
> based on George's idea, a simpler check is to retrieve the Fortran
> index via the (standard) MPI_Type_c2() function
>
> after you create a derived datatype.
>
>
> If the index keeps growing forever even after you MPI_Type_free(),
> then this clearly indicates a leak.
>
> Unfortunately, this simple test cannot be used to definitely rule out
> any memory leak.
>
>
> Note you can also
>
> mpirun --mca pml ob1 --mca btl tcp,self ...
>
> in order to force communications over TCP/IP and hence rule out any
> memory leak that could be triggered by your fast interconnect.
>
>
>
> In any case, a reproducer will greatly help us debugging this issue.
>
>
> Cheers,
>
>
> Gilles
>
>
>
> On 12/4/2020 7:20 AM, George Bosilca via users wrote:
>> Patrick,
>>
>> I'm afraid there is no simple way to check this. The main reason
>> being that OMPI use handles for MPI objects, and these handles are
>> not tracked by the library, they are supposed to be provided by the
>> user for each call. In your case, as you already called MPI_Type_free
>> on the datatype, you cannot produce a valid handle.
>>
>> There might be a trick. If the datatype is manipulated with any
>> Fortran MPI functions, then we convert the handle (which in fact is a
>> pointer) to an index into a pointer array structure. Thus, the index
>> will remain used, and can therefore be used to convert back into a
>> valid datatype pointer, until OMPI completely releases the datatype.
>> Look into the ompi_datatype_f_to_c_table table to see the datatypes
>> that exist and get their pointers, and then use these pointers as
>> arguments to ompi_datatype_dump() to see if any of these existing
>> datatypes are the ones you define.
>>
>> George.
>>
>>
>>
>>
>> On Thu, Dec 3, 2020 at 4:44 PM Patrick Bégou via users
>> <users@lists.open-mpi.org> wrote:
>>
>>     Hi,
>>
>>     I'm trying to solve a memory leak since my new implementation of
>>     communications based on MPI_AllToAllW and MPI_type_Create_SubArray
>>     calls.  Arrays of SubArray types are created/destroyed at each
>>     time step and used for communications.
>>
>>     On my laptop the code runs fine (running for 15000 temporal
>>     itérations on 32 processes with oversubscription) but on our
>>     cluster memory used by the code increase until the OOMkiller stop
>>     the job. On the cluster we use IB QDR for communications.
>>
>>     Same Gcc/Gfortran 7.3 (built from sources), same sources of
>>     OpenMPI (3.1 or 4.0.5 tested), same sources of the fortran code on
>>     the laptop and on the cluster.
>>
>>     Using Gcc/Gfortran 4.8 and OpenMPI 1.7.3 on the cluster do not
>>     show the problem (resident memory do not increase and we ran
>>     10 temporal iterations)
>>
>>     MPI_type_free manual says that it "/Marks the datatype object
>>     associated with datatype for deallocation/". But  how can I check
>>     that the deallocation is really done ?
>>
>>     Thanks for ant suggestions.
>>
>>     Patrick
>>



test_layout_array.tgz
Description: application/compressed-tar


Re: [OMPI users] MPI_type_free question

2020-12-04 Thread George Bosilca via users
On Fri, Dec 4, 2020 at 2:33 AM Patrick Bégou via users <
users@lists.open-mpi.org> wrote:

> Hi George and Gilles,
>
> Thanks George for your suggestion. Is it valuable for 4.05 and 3.1 OpenMPI
> Versions ? I will have a look today at these tables. May be writing a small
> piece of code juste creating and freeing subarray datatype.
>

Patrick,

If you use Gilles' suggestion to go through the type_f2c function when
listing the datatypes, it should give you a portable datatype iterator across
all versions of OMPI. The call to dump a datatype's content,
ompi_datatype_dump, has been there for a very long time, so the combination
of the two should work everywhere.

Thinking a little more about this, you don't necessarily have to dump the
content of the datatypes; you only need to check whether they are different from
MPI_DATATYPE_NULL. Thus, you can have a solution using only the MPI API.

  George.
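
As a rough sketch of that MPI-API-only check, in C for brevity (the subarray
shape is arbitrary, and whether a released slot comes back as MPI_DATATYPE_NULL
or as a plain null handle is Open MPI behaviour as discussed in this thread,
not a portable guarantee):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int sizes[2] = {128, 128}, subsizes[2] = {64, 64}, starts[2] = {0, 0};
    MPI_Datatype sub;

    MPI_Type_create_subarray(2, sizes, subsizes, starts,
                             MPI_ORDER_FORTRAN, MPI_INT, &sub);
    MPI_Type_commit(&sub);

    MPI_Fint idx = MPI_Type_c2f(sub);   /* remember the Fortran index */
    MPI_Type_free(&sub);                /* marks the type for deallocation */

    /* Convert the remembered index back: once the library has really
     * released the type, the slot should no longer map to a live datatype. */
    MPI_Datatype dt = MPI_Type_f2c(idx);
    if (dt == MPI_DATATYPE_NULL || dt == (MPI_Datatype)0)
        printf("slot %d: released\n", (int)idx);
    else
        printf("slot %d: still alive\n", (int)idx);

    MPI_Finalize();
    return 0;
}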


>
> Thanks Gilles for suggesting disabling the interconnect. it is a good fast
> test and yes, *with "mpirun --mca pml ob1 --mca btl tcp,self" I have no
> memory leak*. So this explain the differences between my laptop and the
> cluster.
> The implementation of type management is so different from 1.7.3  ?
>
> A PhD student tells me he has also some trouble with this code on a
> cluster Omnipath based. I will have to investigate too but not sure it is
> the same problem.
>
> Patrick
>
> Le 04/12/2020 à 01:34, Gilles Gouaillardet via users a écrit :
>
> Patrick,
>
>
> based on George's idea, a simpler check is to retrieve the Fortran index
> via the (standard) MPI_Type_c2() function
>
> after you create a derived datatype.
>
>
> If the index keeps growing forever even after you MPI_Type_free(), then
> this clearly indicates a leak.
>
> Unfortunately, this simple test cannot be used to definitely rule out any
> memory leak.
>
>
> Note you can also
>
> mpirun --mca pml ob1 --mca btl tcp,self ...
>
> in order to force communications over TCP/IP and hence rule out any memory
> leak that could be triggered by your fast interconnect.
>
>
>
> In any case, a reproducer will greatly help us debugging this issue.
>
>
> Cheers,
>
>
> Gilles
>
>
>
> On 12/4/2020 7:20 AM, George Bosilca via users wrote:
>
> Patrick,
>
> I'm afraid there is no simple way to check this. The main reason being
> that OMPI use handles for MPI objects, and these handles are not tracked by
> the library, they are supposed to be provided by the user for each call. In
> your case, as you already called MPI_Type_free on the datatype, you cannot
> produce a valid handle.
>
> There might be a trick. If the datatype is manipulated with any Fortran
> MPI functions, then we convert the handle (which in fact is a pointer) to
> an index into a pointer array structure. Thus, the index will remain used,
> and can therefore be used to convert back into a valid datatype pointer,
> until OMPI completely releases the datatype. Look into
> the ompi_datatype_f_to_c_table table to see the datatypes that exist and
> get their pointers, and then use these pointers as arguments to
> ompi_datatype_dump() to see if any of these existing datatypes are the ones
> you define.
>
> George.
>
>
>
>
> On Thu, Dec 3, 2020 at 4:44 PM Patrick Bégou via users <
> users@lists.open-mpi.org> wrote:
>
> Hi,
>
> I'm trying to solve a memory leak since my new implementation of
> communications based on MPI_AllToAllW and MPI_type_Create_SubArray
> calls.  Arrays of SubArray types are created/destroyed at each
> time step and used for communications.
>
> On my laptop the code runs fine (running for 15000 temporal
> itérations on 32 processes with oversubscription) but on our
> cluster memory used by the code increase until the OOMkiller stop
> the job. On the cluster we use IB QDR for communications.
>
> Same Gcc/Gfortran 7.3 (built from sources), same sources of
> OpenMPI (3.1 or 4.0.5 tested), same sources of the fortran code on
> the laptop and on the cluster.
>
> Using Gcc/Gfortran 4.8 and OpenMPI 1.7.3 on the cluster do not
> show the problem (resident memory do not increase and we ran
> 10 temporal iterations)
>
> MPI_type_free manual says that it "/Marks the datatype object
> associated with datatype for deallocation/". But  how can I check
> that the deallocation is really done ?
>
> Thanks for ant suggestions.
>
> Patrick
>
>
>


Re: [OMPI users] MPI_type_free question

2020-12-04 Thread Patrick Bégou via users
Hi Gilles,

The interconnect is QLogic InfiniBand QDR. I was unable to compile the
latest UCX release on this CentOS6-based Rocks Cluster distribution when
deploying OpenMPI 4.0.5, but I did not search a lot: I'm currently
deploying a new cluster on CentOS8 and this old cluster will move to
CentOS8 too as soon as the new one is in production (using gcc10,
OpenMPI 4.0.5 and UCX).

Attached are the ompi_info output for OpenMPI 3.1 (the version in production) and
the requested dump, obtained on the kareline cluster using 16 processes.

Patrick

On 04/12/2020 at 08:57, Gilles Gouaillardet via users wrote:
> Patrick,
>
>
> the test points to a leak in the way the interconnect component
> (pml/ucx ? pml/cm? mtl/psm2? btl/openib?) handles the datatype rather
> than the datatype engine itself.
>
>
> What interconnect is available on your cluster and which component(s)
> are used?
>
>
> mpirun --mca pml_base_verbose 10 --mca mtl_base_verbose 10 --mca
> btl_base_verbose 10 ...
>
> will point you to the component(s) used.
>
> The output is pretty verbose, so feel free to compress and post it if
> you cannot decipher it
>
>
> Cheers,
>
>
> Gilles
>
> On 12/4/2020 4:32 PM, Patrick Bégou via users wrote:
>> Hi George and Gilles,
>>
>> Thanks George for your suggestion. Is it valuable for 4.05 and 3.1
>> OpenMPI Versions ? I will have a look today at these tables. May be
>> writing a small piece of code juste creating and freeing subarray
>> datatype.
>>
>> Thanks Gilles for suggesting disabling the interconnect. it is a good
>> fast test and yes, *with "mpirun --mca pml ob1 --mca btl tcp,self" I
>> have no memory leak*. So this explain the differences between my
>> laptop and the cluster.
>> The implementation of type management is so different from 1.7.3  ?
>>
>> A PhD student tells me he has also some trouble with this code on a
>> cluster Omnipath based. I will have to investigate too but not sure
>> it is the same problem.
>>
>> Patrick
>>
>> On 04/12/2020 at 01:34, Gilles Gouaillardet via users wrote:
>>> Patrick,
>>>
>>>
>>> based on George's idea, a simpler check is to retrieve the Fortran
>>> index via the (standard) MPI_Type_c2() function
>>>
>>> after you create a derived datatype.
>>>
>>>
>>> If the index keeps growing forever even after you MPI_Type_free(),
>>> then this clearly indicates a leak.
>>>
>>> Unfortunately, this simple test cannot be used to definitely rule
>>> out any memory leak.
>>>
>>>
>>> Note you can also
>>>
>>> mpirun --mca pml ob1 --mca btl tcp,self ...
>>>
>>> in order to force communications over TCP/IP and hence rule out any
>>> memory leak that could be triggered by your fast interconnect.
>>>
>>>
>>>
>>> In any case, a reproducer will greatly help us debugging this issue.
>>>
>>>
>>> Cheers,
>>>
>>>
>>> Gilles
>>>
>>>
>>>
>>> On 12/4/2020 7:20 AM, George Bosilca via users wrote:
 Patrick,

 I'm afraid there is no simple way to check this. The main reason
 being that OMPI use handles for MPI objects, and these handles are
 not tracked by the library, they are supposed to be provided by the
 user for each call. In your case, as you already called
 MPI_Type_free on the datatype, you cannot produce a valid handle.

 There might be a trick. If the datatype is manipulated with any
 Fortran MPI functions, then we convert the handle (which in fact is
 a pointer) to an index into a pointer array structure. Thus, the
 index will remain used, and can therefore be used to convert back
 into a valid datatype pointer, until OMPI completely releases the
 datatype. Look into the ompi_datatype_f_to_c_table table to see the
 datatypes that exist and get their pointers, and then use these
 pointers as arguments to ompi_datatype_dump() to see if any of
 these existing datatypes are the ones you define.

 George.




 On Thu, Dec 3, 2020 at 4:44 PM Patrick Bégou via users
 <users@lists.open-mpi.org> wrote:

     Hi,

     I'm trying to solve a memory leak since my new implementation of
     communications based on MPI_AllToAllW and MPI_type_Create_SubArray
     calls.  Arrays of SubArray types are created/destroyed at each
     time step and used for communications.

     On my laptop the code runs fine (running for 15000 temporal
     itérations on 32 processes with oversubscription) but on our
     cluster memory used by the code increase until the OOMkiller stop
     the job. On the cluster we use IB QDR for communications.

     Same Gcc/Gfortran 7.3 (built from sources), same sources of
     OpenMPI (3.1 or 4.0.5 tested), same sources of the fortran code on
     the laptop and on the cluster.

     Using Gcc/Gfortran 4.8 and OpenMPI 1.7.3 on the cluster do not
     show the problem (resident memory do not increase and we ran
     10 temporal iterations)

     MPI_type_free manual says that it "/Marks the datatype 

Re: [OMPI users] MPI_type_free question

2020-12-04 Thread Gilles Gouaillardet via users

Patrick,


the test points to a leak in the way the interconnect component (pml/ucx?
pml/cm? mtl/psm2? btl/openib?) handles the datatype rather than in the
datatype engine itself.



What interconnect is available on your cluster and which component(s) 
are used?



mpirun --mca pml_base_verbose 10 --mca mtl_base_verbose 10 --mca 
btl_base_verbose 10 ...


will point you to the component(s) used.

The output is pretty verbose, so feel free to compress and post it if 
you cannot decipher it



Cheers,


Gilles

On 12/4/2020 4:32 PM, Patrick Bégou via users wrote:

Hi George and Gilles,

Thanks George for your suggestion. Is it valuable for 4.05 and 3.1 
OpenMPI Versions ? I will have a look today at these tables. May be 
writing a small piece of code juste creating and freeing subarray 
datatype.


Thanks Gilles for suggesting disabling the interconnect. it is a good 
fast test and yes, *with "mpirun --mca pml ob1 --mca btl tcp,self" I 
have no memory leak*. So this explain the differences between my 
laptop and the cluster.

The implementation of type management is so different from 1.7.3  ?

A PhD student tells me he has also some trouble with this code on a 
cluster Omnipath based. I will have to investigate too but not sure it 
is the same problem.


Patrick

On 04/12/2020 at 01:34, Gilles Gouaillardet via users wrote:

Patrick,


based on George's idea, a simpler check is to retrieve the Fortran 
index via the (standard) MPI_Type_c2() function


after you create a derived datatype.


If the index keeps growing forever even after you MPI_Type_free(), 
then this clearly indicates a leak.


Unfortunately, this simple test cannot be used to definitely rule out 
any memory leak.



Note you can also

mpirun --mca pml ob1 --mca btl tcp,self ...

in order to force communications over TCP/IP and hence rule out any 
memory leak that could be triggered by your fast interconnect.




In any case, a reproducer will greatly help us debugging this issue.


Cheers,


Gilles



On 12/4/2020 7:20 AM, George Bosilca via users wrote:

Patrick,

I'm afraid there is no simple way to check this. The main reason 
being that OMPI use handles for MPI objects, and these handles are 
not tracked by the library, they are supposed to be provided by the 
user for each call. In your case, as you already called 
MPI_Type_free on the datatype, you cannot produce a valid handle.


There might be a trick. If the datatype is manipulated with any 
Fortran MPI functions, then we convert the handle (which in fact is 
a pointer) to an index into a pointer array structure. Thus, the 
index will remain used, and can therefore be used to convert back 
into a valid datatype pointer, until OMPI completely releases the 
datatype. Look into the ompi_datatype_f_to_c_table table to see the 
datatypes that exist and get their pointers, and then use these 
pointers as arguments to ompi_datatype_dump() to see if any of these 
existing datatypes are the ones you define.


George.




On Thu, Dec 3, 2020 at 4:44 PM Patrick Bégou via users 
<users@lists.open-mpi.org> wrote:


    Hi,

    I'm trying to solve a memory leak since my new implementation of
    communications based on MPI_AllToAllW and MPI_type_Create_SubArray
    calls.  Arrays of SubArray types are created/destroyed at each
    time step and used for communications.

    On my laptop the code runs fine (running for 15000 temporal
    itérations on 32 processes with oversubscription) but on our
    cluster memory used by the code increase until the OOMkiller stop
    the job. On the cluster we use IB QDR for communications.

    Same Gcc/Gfortran 7.3 (built from sources), same sources of
    OpenMPI (3.1 or 4.0.5 tested), same sources of the fortran code on
    the laptop and on the cluster.

    Using Gcc/Gfortran 4.8 and OpenMPI 1.7.3 on the cluster do not
    show the problem (resident memory do not increase and we ran
    10 temporal iterations)

    MPI_type_free manual says that it "/Marks the datatype object
    associated with datatype for deallocation/". But  how can I check
    that the deallocation is really done ?

    Thanks for ant suggestions.

    Patrick





Re: [OMPI users] MPI_type_free question

2020-12-03 Thread Patrick Bégou via users
Hi George and Gilles,

Thanks George for your suggestion. Is it applicable to the 4.0.5 and 3.1
OpenMPI versions? I will have a look today at these tables. Maybe by
writing a small piece of code just creating and freeing subarray datatypes.

Thanks Gilles for suggesting disabling the interconnect. It is a good
fast test and yes, *with "mpirun --mca pml ob1 --mca btl tcp,self" I
have no memory leak*. So this explains the difference between my laptop
and the cluster.
Is the implementation of type management so different from 1.7.3?

A PhD student tells me he also has some trouble with this code on an
Omnipath-based cluster. I will have to investigate too, but I am not sure it
is the same problem.

Patrick

On 04/12/2020 at 01:34, Gilles Gouaillardet via users wrote:
> Patrick,
>
>
> based on George's idea, a simpler check is to retrieve the Fortran
> index via the (standard) MPI_Type_c2() function
>
> after you create a derived datatype.
>
>
> If the index keeps growing forever even after you MPI_Type_free(),
> then this clearly indicates a leak.
>
> Unfortunately, this simple test cannot be used to definitely rule out
> any memory leak.
>
>
> Note you can also
>
> mpirun --mca pml ob1 --mca btl tcp,self ...
>
> in order to force communications over TCP/IP and hence rule out any
> memory leak that could be triggered by your fast interconnect.
>
>
>
> In any case, a reproducer will greatly help us debugging this issue.
>
>
> Cheers,
>
>
> Gilles
>
>
>
> On 12/4/2020 7:20 AM, George Bosilca via users wrote:
>> Patrick,
>>
>> I'm afraid there is no simple way to check this. The main reason
>> being that OMPI use handles for MPI objects, and these handles are
>> not tracked by the library, they are supposed to be provided by the
>> user for each call. In your case, as you already called MPI_Type_free
>> on the datatype, you cannot produce a valid handle.
>>
>> There might be a trick. If the datatype is manipulated with any
>> Fortran MPI functions, then we convert the handle (which in fact is a
>> pointer) to an index into a pointer array structure. Thus, the index
>> will remain used, and can therefore be used to convert back into a
>> valid datatype pointer, until OMPI completely releases the datatype.
>> Look into the ompi_datatype_f_to_c_table table to see the datatypes
>> that exist and get their pointers, and then use these pointers as
>> arguments to ompi_datatype_dump() to see if any of these existing
>> datatypes are the ones you define.
>>
>> George.
>>
>>
>>
>>
>> On Thu, Dec 3, 2020 at 4:44 PM Patrick Bégou via users
>> <users@lists.open-mpi.org> wrote:
>>
>>     Hi,
>>
>>     I'm trying to solve a memory leak since my new implementation of
>>     communications based on MPI_Alltoallw and MPI_Type_create_subarray
>>     calls. Arrays of subarray types are created/destroyed at each
>>     time step and used for communications.
>>
>>     On my laptop the code runs fine (running for 15000 temporal
>>     iterations on 32 processes with oversubscription) but on our
>>     cluster the memory used by the code increases until the OOM
>>     killer stops the job. On the cluster we use IB QDR for
>>     communications.
>>
>>     Same GCC/gfortran 7.3 (built from sources), same OpenMPI sources
>>     (3.1 and 4.0.5 tested), and the same Fortran sources on the
>>     laptop and on the cluster.
>>
>>     Using GCC/gfortran 4.8 and OpenMPI 1.7.3 on the cluster does not
>>     show the problem (resident memory does not increase and we ran
>>     10 temporal iterations).
>>
>>     The MPI_Type_free manual says that it "Marks the datatype object
>>     associated with datatype for deallocation". But how can I check
>>     that the deallocation is really done?
>>
>>     Thanks for any suggestions.
>>
>>     Patrick
>>



Re: [OMPI users] MPI_type_free question

2020-12-03 Thread Gilles Gouaillardet via users

Patrick,


based on George's idea, a simpler check is to retrieve the Fortran index
via the (standard) MPI_Type_c2f() function after you create a derived
datatype.


If the index keeps growing forever even after you MPI_Type_free(), then 
this clearly indicates a leak.


Unfortunately, this simple test cannot be used to definitively rule out
any memory leak.
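
For illustration, a minimal sketch of that check in C (the subarray
parameters below are made up) could be:

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int sizes[2]    = { 64, 64 };
        int subsizes[2] = { 16, 16 };
        int starts[2]   = {  0,  0 };

        for (int it = 0; it < 20; it++) {
            MPI_Datatype sub;
            MPI_Type_create_subarray(2, sizes, subsizes, starts,
                                     MPI_ORDER_FORTRAN, MPI_INT, &sub);
            MPI_Type_commit(&sub);

            /* Fortran index of the freshly created datatype: if this
               value keeps growing from one iteration to the next even
               though the type is freed below, handles are leaking. */
            printf("iteration %2d: Fortran index = %d\n",
                   it, (int)MPI_Type_c2f(sub));

            MPI_Type_free(&sub);
        }

        MPI_Finalize();
        return 0;
    }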



Note you can also

mpirun --mca pml ob1 --mca btl tcp,self ...

in order to force communications over TCP/IP and hence rule out any 
memory leak that could be triggered by your fast interconnect.




In any case, a reproducer will greatly help us debug this issue.


Cheers,


Gilles



On 12/4/2020 7:20 AM, George Bosilca via users wrote:

Patrick,

I'm afraid there is no simple way to check this. The main reason is
that OMPI uses handles for MPI objects, and these handles are not
tracked by the library; they are supposed to be provided by the user
for each call. In your case, as you already called MPI_Type_free on
the datatype, you cannot produce a valid handle.


There might be a trick. If the datatype is manipulated with any 
Fortran MPI functions, then we convert the handle (which in fact is a 
pointer) to an index into a pointer array structure. Thus, the index 
will remain used, and can therefore be used to convert back into a 
valid datatype pointer, until OMPI completely releases the datatype. 
Look into the ompi_datatype_f_to_c_table table to see the datatypes 
that exist and get their pointers, and then use these pointers as 
arguments to ompi_datatype_dump() to see if any of these existing 
datatypes are the ones you define.


George.




On Thu, Dec 3, 2020 at 4:44 PM Patrick Bégou via users 
<users@lists.open-mpi.org> wrote:


Hi,

I'm trying to solve a memory leak since my new implementation of
communications based on MPI_Alltoallw and MPI_Type_create_subarray
calls. Arrays of subarray types are created/destroyed at each
time step and used for communications.

On my laptop the code runs fine (running for 15000 temporal
iterations on 32 processes with oversubscription) but on our
cluster the memory used by the code increases until the OOM killer
stops the job. On the cluster we use IB QDR for communications.

Same GCC/gfortran 7.3 (built from sources), same OpenMPI sources
(3.1 and 4.0.5 tested), and the same Fortran sources on the laptop
and on the cluster.

Using GCC/gfortran 4.8 and OpenMPI 1.7.3 on the cluster does not
show the problem (resident memory does not increase and we ran
10 temporal iterations).

The MPI_Type_free manual says that it "Marks the datatype object
associated with datatype for deallocation". But how can I check
that the deallocation is really done?

Thanks for any suggestions.

Patrick



Re: [OMPI users] MPI_type_free question

2020-12-03 Thread George Bosilca via users
Patrick,

I'm afraid there is no simple way to check this. The main reason is that
OMPI uses handles for MPI objects, and these handles are not tracked by the
library; they are supposed to be provided by the user for each call. In
your case, as you already called MPI_Type_free on the datatype, you cannot
produce a valid handle.

There might be a trick. If the datatype is manipulated with any Fortran MPI
functions, then we convert the handle (which in fact is a pointer) to an
index into a pointer array structure. Thus, the index will remain used, and
can therefore be used to convert back into a valid datatype pointer, until
OMPI completely releases the datatype. Look into
the ompi_datatype_f_to_c_table table to see the datatypes that exist and
get their pointers, and then use these pointers as arguments to
ompi_datatype_dump() to see if any of these existing datatypes are the ones
you define.
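
As a rough illustration only (this pokes at Open MPI internals, not at
the MPI standard, and it assumes ompi_datatype_f_to_c_table is declared
as an opal_pointer_array_t with the usual opal_pointer_array_get_size()
and opal_pointer_array_get_item() accessors; exact declarations may
differ between releases), the walk could look like:

    /* Sketch: relies on Open MPI internal headers and symbols. */
    #include "ompi/datatype/ompi_datatype.h"
    #include "opal/class/opal_pointer_array.h"

    void dump_known_datatypes(void)
    {
        int n = opal_pointer_array_get_size(&ompi_datatype_f_to_c_table);
        for (int i = 0; i < n; i++) {
            ompi_datatype_t *type = (ompi_datatype_t *)
                opal_pointer_array_get_item(&ompi_datatype_f_to_c_table, i);
            if (NULL != type) {
                ompi_datatype_dump(type);  /* prints the type description */
            }
        }
    }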

George.




On Thu, Dec 3, 2020 at 4:44 PM Patrick Bégou via users <
users@lists.open-mpi.org> wrote:

> Hi,
>
> I'm trying to solve a memory leak since my new implementation of
> communications based on MPI_Alltoallw and MPI_Type_create_subarray calls.
> Arrays of subarray types are created/destroyed at each time step and used
> for communications.
>
> On my laptop the code runs fine (running for 15000 temporal iterations on
> 32 processes with oversubscription) but on our cluster the memory used by
> the code increases until the OOM killer stops the job. On the cluster we
> use IB QDR for communications.
>
> Same GCC/gfortran 7.3 (built from sources), same OpenMPI sources (3.1 and
> 4.0.5 tested), and the same Fortran sources on the laptop and on the
> cluster.
>
> Using GCC/gfortran 4.8 and OpenMPI 1.7.3 on the cluster does not show the
> problem (resident memory does not increase and we ran 10 temporal
> iterations).
>
> The MPI_Type_free manual says that it "Marks the datatype object
> associated with datatype for deallocation". But how can I check that the
> deallocation is really done?
>
> Thanks for any suggestions.
>
> Patrick
>


[OMPI users] MPI_type_free question

2020-12-03 Thread Patrick Bégou via users
Hi,

I'm trying to solve a memory leak since my new implementation of
communications based on MPI_Alltoallw and MPI_Type_create_subarray
calls. Arrays of subarray types are created/destroyed at each time step
and used for communications.
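
For reference, a rough C sketch of this pattern (the real code is
Fortran; the slab decomposition, sizes and buffers below are made up
for illustration) is:

    #include <mpi.h>
    #include <stdlib.h>

    /* One exchange step: build one subarray type per peer, use them in
       MPI_Alltoallw, then free them again. */
    static void exchange_step(MPI_Comm comm, int *sendbuf, int *recvbuf,
                              int sizes[2])
    {
        int nprocs;
        MPI_Comm_size(comm, &nprocs);

        int *counts = calloc(nprocs, sizeof(int));
        int *displs = calloc(nprocs, sizeof(int));
        MPI_Datatype *types = malloc(nprocs * sizeof(MPI_Datatype));

        for (int p = 0; p < nprocs; p++) {
            int subsizes[2] = { sizes[0] / nprocs, sizes[1] };
            int starts[2]   = { p * subsizes[0], 0 };
            MPI_Type_create_subarray(2, sizes, subsizes, starts,
                                     MPI_ORDER_FORTRAN, MPI_INT, &types[p]);
            MPI_Type_commit(&types[p]);
            counts[p] = 1;   /* one block per peer */
            displs[p] = 0;   /* the offset is encoded in the datatype */
        }

        MPI_Alltoallw(sendbuf, counts, displs, types,
                      recvbuf, counts, displs, types, comm);

        /* Freed after every step: the question is whether the library
           really releases the associated memory afterwards. */
        for (int p = 0; p < nprocs; p++)
            MPI_Type_free(&types[p]);

        free(types); free(displs); free(counts);
    }

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int nprocs;
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

        int sizes[2] = { 8 * nprocs, 16 };   /* divisible by nprocs */
        int n = sizes[0] * sizes[1];
        int *sbuf = calloc(n, sizeof(int));
        int *rbuf = calloc(n, sizeof(int));

        for (int step = 0; step < 1000; step++)   /* watch RSS meanwhile */
            exchange_step(MPI_COMM_WORLD, sbuf, rbuf, sizes);

        free(sbuf); free(rbuf);
        MPI_Finalize();
        return 0;
    }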

On my laptop the code runs fine (running for 15000 temporal iterations
on 32 processes with oversubscription) but on our cluster the memory
used by the code increases until the OOM killer stops the job. On the
cluster we use IB QDR for communications.

Same GCC/gfortran 7.3 (built from sources), same OpenMPI sources (3.1
and 4.0.5 tested), and the same Fortran sources on the laptop and on
the cluster.

Using GCC/gfortran 4.8 and OpenMPI 1.7.3 on the cluster does not show
the problem (resident memory does not increase and we ran 10 temporal
iterations).

The MPI_Type_free manual says that it "Marks the datatype object
associated with datatype for deallocation". But how can I check that
the deallocation is really done?

Thanks for any suggestions.

Patrick