Patrick,

First, thank you very much for sharing the reproducer.


Yes, please open a GitHub issue so we can track this.


I do not yet fully understand where the leak is coming from, but so far:

 - the code fails on master built with --enable-debug (the data engine reports an error) but not with the v3.1.x branch

  (this suggests there could be an error in the latest Open MPI ... or in the code)

 - the attached patch seems to have a positive effect; can you please give it a try?


Cheers,


Gilles



On 12/7/2020 6:15 PM, Patrick Bégou via users wrote:
Hi,

I've written a small piece of code to show the problem. It is based on my application but 2D and uses integer arrays for testing. The figure below shows the max RSS size of the rank 0 process over 20000 iterations on 8 and 16 cores, with the openib and tcp drivers. The more processes I have, the larger the memory leak. I use the same binaries for the 4 runs and OpenMPI 3.1 (same behavior with 4.0.5). The code is attached. I'll try to check type deallocation as soon as possible.
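
As a side note, here is a minimal sketch (C, hypothetical helper, not the attached reproducer) of how the peak RSS can be sampled from inside the iteration loop; on Linux ru_maxrss is reported in kilobytes:

#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper: print the peak resident set size of this rank
 * at a given iteration (Linux reports ru_maxrss in kilobytes). */
static void print_max_rss(int rank, int iteration)
{
    struct rusage usage;
    if (0 == getrusage(RUSAGE_SELF, &usage)) {
        printf("rank %d iter %d max RSS = %ld kB\n",
               rank, iteration, usage.ru_maxrss);
    }
}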

Patrick




On 12/4/2020 1:34 AM, Gilles Gouaillardet via users wrote:
Patrick,


based on George's idea, a simpler check is to retrieve the Fortran index via the (standard) MPI_Type_c2f() function after you create a derived datatype.


If the index keeps growing forever even after you MPI_Type_free(), then this clearly indicates a leak.

Unfortunately, this simple test cannot be used to definitively rule out any memory leak.
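
For illustration, a minimal C sketch of that check (arbitrary sizes and loop count; build with mpicc): if the printed index grows at every iteration, freed datatypes are not being recycled.

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int sizes[2]    = {16, 16};
    int subsizes[2] = {8, 8};
    int starts[2]   = {0, 0};

    /* Create, commit, and free a subarray datatype repeatedly while
     * printing its Fortran index: an index that keeps growing even
     * though MPI_Type_free() is called indicates a leak. */
    for (int it = 0; it < 10; it++) {
        MPI_Datatype t;
        MPI_Type_create_subarray(2, sizes, subsizes, starts,
                                 MPI_ORDER_FORTRAN, MPI_INT, &t);
        MPI_Type_commit(&t);
        printf("iteration %d: Fortran index = %d\n",
               it, (int)MPI_Type_c2f(t));
        MPI_Type_free(&t);
    }

    MPI_Finalize();
    return 0;
}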


Note you can also

mpirun --mca pml ob1 --mca btl tcp,self ...

in order to force communications over TCP/IP and hence rule out any memory leak that could be triggered by your fast interconnect.



In any case, a reproducer will greatly help us debug this issue.


Cheers,


Gilles



On 12/4/2020 7:20 AM, George Bosilca via users wrote:
Patrick,

I'm afraid there is no simple way to check this. The main reason is that OMPI uses handles for MPI objects, and these handles are not tracked by the library; they are supposed to be provided by the user for each call. In your case, as you have already called MPI_Type_free on the datatype, you cannot produce a valid handle.

There might be a trick. If the datatype is manipulated with any Fortran MPI functions, then we convert the handle (which is in fact a pointer) to an index into a pointer array structure. That index remains in use, and can therefore be converted back into a valid datatype pointer, until OMPI completely releases the datatype. Look into the ompi_datatype_f_to_c_table table to see the datatypes that still exist and get their pointers, and then use these pointers as arguments to ompi_datatype_dump() to see if any of these existing datatypes are the ones you defined.
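
For what it's worth, a rough sketch of that inspection, assuming it is compiled inside the Open MPI source tree (ompi_datatype_f_to_c_table, ompi_datatype_dump() and the opal_pointer_array accessors are internal, version-dependent symbols, so header paths and names may differ):

/* Debug-only sketch: walk the Fortran-index table and dump every
 * datatype still registered there.  Uses OMPI-internal headers, not
 * the public MPI API. */
#include "ompi/datatype/ompi_datatype.h"
#include "opal/class/opal_pointer_array.h"

static void dump_live_datatypes(void)
{
    int size = opal_pointer_array_get_size(&ompi_datatype_f_to_c_table);
    for (int i = 0; i < size; i++) {
        ompi_datatype_t *dt = (ompi_datatype_t *)
            opal_pointer_array_get_item(&ompi_datatype_f_to_c_table, i);
        if (NULL != dt) {
            ompi_datatype_dump(dt);   /* prints the datatype description */
        }
    }
}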

George.




On Thu, Dec 3, 2020 at 4:44 PM Patrick Bégou via users <users@lists.open-mpi.org> wrote:

    Hi,

    I'm trying to solve a memory leak that appeared with my new
    implementation of communications based on MPI_Alltoallw and
    MPI_Type_create_subarray calls. Arrays of subarray types are
    created/destroyed at each time step and used for the communications.
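
    A minimal sketch of that pattern (C for illustration, hypothetical
    sizes, one subarray type per peer, assuming the second dimension
    divides evenly by the number of processes):

    #include <stdlib.h>
    #include <mpi.h>

    /* One communication step: build per-peer subarray types, exchange
     * with MPI_Alltoallw, then free the types again (this is done at
     * every time step). */
    static void exchange_step(MPI_Comm comm, int *sendbuf, int *recvbuf)
    {
        int nprocs;
        MPI_Comm_size(comm, &nprocs);

        int *counts = calloc(nprocs, sizeof(int));
        int *disps  = calloc(nprocs, sizeof(int));
        MPI_Datatype *stypes = malloc(nprocs * sizeof(MPI_Datatype));
        MPI_Datatype *rtypes = malloc(nprocs * sizeof(MPI_Datatype));

        int sizes[2]    = {64, 64};
        int subsizes[2] = {64, 64 / nprocs};
        int starts[2]   = {0, 0};

        for (int i = 0; i < nprocs; i++) {
            counts[i] = 1;                      /* one subarray per peer */
            starts[1] = i * (64 / nprocs);
            MPI_Type_create_subarray(2, sizes, subsizes, starts,
                                     MPI_ORDER_FORTRAN, MPI_INT, &stypes[i]);
            MPI_Type_commit(&stypes[i]);
            MPI_Type_create_subarray(2, sizes, subsizes, starts,
                                     MPI_ORDER_FORTRAN, MPI_INT, &rtypes[i]);
            MPI_Type_commit(&rtypes[i]);
        }

        MPI_Alltoallw(sendbuf, counts, disps, stypes,
                      recvbuf, counts, disps, rtypes, comm);

        for (int i = 0; i < nprocs; i++) {
            MPI_Type_free(&stypes[i]);
            MPI_Type_free(&rtypes[i]);
        }
        free(counts); free(disps); free(stypes); free(rtypes);
    }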

    On my laptop the code runs fine (running 15000 temporal
    iterations on 32 processes with oversubscription), but on our
    cluster the memory used by the code increases until the OOM killer
    stops the job. On the cluster we use IB QDR for communications.

    Same Gcc/Gfortran 7.3 (built from sources), same sources of
    OpenMPI (3.1 or 4.0.5 tested), same sources of the fortran code on
    the laptop and on the cluster.

    Using Gcc/Gfortran 4.8 and OpenMPI 1.7.3 on the cluster does not
    show the problem (resident memory does not increase and we ran
    100000 temporal iterations).

    The MPI_Type_free manual says that it "Marks the datatype object
    associated with datatype for deallocation". But how can I check
    that the deallocation is really done?

    Thanks for any suggestions.

    Patrick


diff --git a/ompi/mca/coll/basic/coll_basic_alltoallw.c b/ompi/mca/coll/basic/coll_basic_alltoallw.c
index 93fa880..5aca2c2 100644
--- a/ompi/mca/coll/basic/coll_basic_alltoallw.c
+++ b/ompi/mca/coll/basic/coll_basic_alltoallw.c
@@ -194,7 +194,7 @@ mca_coll_basic_alltoallw_intra(const void *sbuf, const int *scounts, const int *
             continue;
 
         prcv = ((char *) rbuf) + rdisps[i];
-        err = MCA_PML_CALL(irecv_init(prcv, rcounts[i], rdtypes[i],
+        err = MCA_PML_CALL(irecv(prcv, rcounts[i], rdtypes[i],
                                       i, MCA_COLL_BASE_TAG_ALLTOALLW, comm,
                                       preq++));
         ++nreqs;
@@ -215,21 +215,15 @@ mca_coll_basic_alltoallw_intra(const void *sbuf, const int *scounts, const int *
             continue;
 
         psnd = ((char *) sbuf) + sdisps[i];
-        err = MCA_PML_CALL(isend_init(psnd, scounts[i], sdtypes[i],
+        err = MCA_PML_CALL(send(psnd, scounts[i], sdtypes[i],
                                       i, MCA_COLL_BASE_TAG_ALLTOALLW,
-                                      MCA_PML_BASE_SEND_STANDARD, comm,
-                                      preq++));
-        ++nreqs;
+                                      MCA_PML_BASE_SEND_STANDARD, comm));
         if (MPI_SUCCESS != err) {
             ompi_coll_base_free_reqs(reqs, nreqs);
             return err;
         }
     }
 
-    /* Start your engines.  This will never return an error. */
-
-    MCA_PML_CALL(start(nreqs, reqs));
-
     /* Wait for them all.  If there's an error, note that we don't care
      * what the error was -- just that there *was* an error.  The PML
      * will finish all requests, even if one or more of them fail.
@@ -238,8 +232,6 @@ mca_coll_basic_alltoallw_intra(const void *sbuf, const int *scounts, const int *
      * error after we free everything. */
 
     err = ompi_request_wait_all(nreqs, reqs, MPI_STATUSES_IGNORE);
-    /* Free the requests in all cases as they are persistent */
-    ompi_coll_base_free_reqs(reqs, nreqs);
 
     /* All done */
     return err;
