Hi,

Several years ago, I developed a Massif patch that lets the client code select which part of the code should be profiled with Massif. This is particularly useful for scientific simulation codes, which often have two phases during a run:

  1) initialisation phase
  2) main computational loop (heavy CPU load)

During phase 1, the code allocates a lot of memory (reading meshes, creating variables, ...), but the critical part is the main loop: a buggy code slightly increases its memory consumption at each iteration, and that increase is tiny compared to the total amount of memory allocated before the main loop. Since the memory is freed after the loop, this is not a leak, but the code can still run out of memory because there are thousands and thousands of iterations.

To detect the problem with Massif, one has to let the code run many iterations of the loop under Valgrind, which is far too slow. The patch I've made eases the detection of such a problem within just a few iterations. It consists of two new command line options and three client requests:

--record-from-start=yes/no : when set to no, heap profiling stays disabled until Massif meets the client request that tells it to start profiling the heap (see below).

--disable-auto-snapshots=yes/no : when set to yes, this disables all snapshots except the ones explicitly asked for in a client request.

VALGRIND_START_MEM_RECORDING : this request tells Massif to start heap profiling.

VALGRIND_STOP_MEM_RECORDING : this request tells Massif to stop heap profiling.

VALGRIND_TAKE_DETAILED_SNAPSHOT : this request tells Massif to take a detailed snapshot.

So, by using VALGRIND_START_MEM_RECORDING just before the main loop and VALGRIND_TAKE_DETAILED_SNAPSHOT at the beginning of each iteration, Massif can report exactly what is going on during the main loop. As an example, I made a regression test which simulates that problem (a big initial allocation and then tiny allocations in a loop).
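To make the placement of the requests concrete, here is a minimal sketch in C. It is not the regression test from the patch, only an illustration under assumptions: the name and location of the header that defines the three requests are not given in this mail, so the sketch falls back to no-op macros when they are not defined. The function name run_simulation and the sizes (INIT_BYTES, STEP_BYTES, MAX_ITERS) are hypothetical and chosen arbitrarily.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* ASSUMPTION: a Massif header from the patch would define these three
   client requests. The no-op fallbacks below only keep this sketch
   compilable when that header is not available. */
#ifndef VALGRIND_START_MEM_RECORDING
#define VALGRIND_START_MEM_RECORDING    ((void)0)
#define VALGRIND_STOP_MEM_RECORDING     ((void)0)
#define VALGRIND_TAKE_DETAILED_SNAPSHOT ((void)0)
#endif

/* Illustrative sizes only (hypothetical values). */
enum { INIT_BYTES = 1000000, STEP_BYTES = 24, MAX_ITERS = 1024 };

/* Hypothetical stand-in for a simulation code: one big allocation during
   initialisation, then a main loop that keeps one small block per
   iteration until the final cleanup. Returns the number of bytes that
   accumulated inside the loop. */
size_t run_simulation(int iterations)
{
    void *kept[MAX_ITERS];
    size_t grown = 0;
    int n = 0;

    if (iterations > MAX_ITERS)
        iterations = MAX_ITERS;

    /* Phase 1: initialisation (not profiled when recording starts later). */
    char *mesh = malloc(INIT_BYTES);
    if (mesh == NULL)
        return 0;
    memset(mesh, 0, INIT_BYTES);

    /* Phase 2: start heap profiling just before the main loop. */
    VALGRIND_START_MEM_RECORDING;
    for (int i = 0; i < iterations; i++) {
        /* One detailed snapshot at the beginning of each iteration. */
        VALGRIND_TAKE_DETAILED_SNAPSHOT;

        void *p = malloc(STEP_BYTES);   /* the tiny per-iteration growth */
        if (p == NULL)
            break;
        kept[n++] = p;
        grown += STEP_BYTES;
    }
    VALGRIND_STOP_MEM_RECORDING;

    /* Everything is freed at the end, so this is not a leak. */
    while (n > 0)
        free(kept[--n]);
    free(mesh);
    return grown;
}
```

Called from main with a handful of iterations and run under the patched Massif with --record-from-start=no and --disable-auto-snapshots=yes, a program like this should only show the per-iteration growth, since the large initial allocation happens before recording starts.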

Massif report without the patch:

    MB
1.001^ :
     | #::::::::::::::::::::::::::::::::::::::::::::::::::@:::::::::
     |          #:::::::::::: :::: :::: ::: ::::::::::: :::: ::: ::@
     |          #:::::::::::: :::: :::: ::: ::::::::::: :::: ::: ::@
     |          #:::::::::::: :::: :::: ::: ::::::::::: :::: ::: ::@
     |          #:::::::::::: :::: :::: ::: ::::::::::: :::: ::: ::@
     |          #:::::::::::: :::: :::: ::: ::::::::::: :::: ::: ::@
     |          #:::::::::::: :::: :::: ::: ::::::::::: :::: ::: ::@
     |          #:::::::::::: :::: :::: ::: ::::::::::: :::: ::: ::@
     |          #:::::::::::: :::: :::: ::: ::::::::::: :::: ::: ::@
     |          #:::::::::::: :::: :::: ::: ::::::::::: :::: ::: ::@
     |          #:::::::::::: :::: :::: ::: ::::::::::: :::: ::: ::@
     |          #:::::::::::: :::: :::: ::: ::::::::::: :::: ::: ::@
     |          #:::::::::::: :::: :::: ::: ::::::::::: :::: ::: ::@
     |          #:::::::::::: :::: :::: ::: ::::::::::: :::: ::: ::@
     |          #:::::::::::: :::: :::: ::: ::::::::::: :::: ::: ::@
     |          #:::::::::::: :::: :::: ::: ::::::::::: :::: ::: ::@
     |          #:::::::::::: :::: :::: ::: ::::::::::: :::: ::: ::@
     |          #:::::::::::: :::: :::: ::: ::::::::::: :::: ::: ::@
     |          #:::::::::::: :::: :::: ::: ::::::::::: :::: ::: ::@
   0 +----------------------------------------------------------------------->MB
     0                                                                   7.056

Massif report with the patch:

     B
  216^ @
     | @
     | @
     | @@@@@@@@
     | @      @
     | @@@@@@@@@      @
     |                                                        @ @      @
     | @@@@@@@@@       @      @
     | @       @       @      @
     | @@@@@@@@@       @       @      @
     |                                        @ @       @       @      @
     |                                        @ @       @       @      @
     |                                @@@@@@@@@ @       @       @      @
     |                                @       @ @       @       @      @
     |                        @@@@@@@@@       @ @       @       @      @
     |                        @       @       @ @       @       @      @
     |                @@@@@@@@@       @       @ @       @       @      @
     |                @       @       @       @ @       @       @      @
     |        @@@@@@@@@       @       @       @ @       @       @      @
     |        @       @       @       @       @ @       @       @      @
   0 +----------------------------------------------------------------------->MB
     0                                                                   4.550

I've developed the patch in a Git branch that I've just rebased on master. I've updated the docs (NEWS, manual, ...) and created a regression test for it in massif/tests. I think this feature could interest other developers of scientific codes (even if, I think, DHAT can now help with such issues), so if you (Valgrind developers) think it would be worth taking a look at it, let me know: I can send you the patch or open a pull request.

Thanks,
Loïc




_______________________________________________
Valgrind-users mailing list
Valgrind-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/valgrind-users
