Re: Memory Leaks with Trilinos

2016-10-11 Thread Michael Waters

Hi Trevor,

My mystery deepens. Today, I've compiled and tried the following 
combinations of Trilinos/Swig:


Trilinos 12.8.1 / SWIG 3.0.2
Trilinos 11.14.3 / SWIG 3.0.2
Trilinos 11.12.1 / SWIG 3.0.2
Trilinos 11.10.1 / SWIG 2.0.8
Trilinos 11.2.5 / SWIG 2.0.8

All of these combinations leak memory at about the same rate. I am using 
Boost 1.54, Open MPI 1.6.5, and GCC 4.8.4. Are you using similar versions?


Thanks,

-Mike


On 03/30/2016 05:06 PM, Keller, Trevor (Fed) wrote:

Mike & Jon,

Running 25 steps on Debian wheezy using PyTrilinos 11.10.2 and swig 2.0.8, 
monitoring with memory-profiler, I do not see a memory leak (see attachment): 
the simulation holds fairly steady at 8 GB of RAM. The same code using 
PyTrilinos 12.1.0 ramps up to 16 GB in the same simulation time.
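
For anyone wanting to reproduce that monitoring, memory-profiler's Python API 
does it in a few lines. A minimal sketch (not necessarily my exact invocation; 
run_simulation stands in for the benchmark's driver):

from memory_profiler import memory_usage

def run_simulation():
    ...  # build the mesh and equations, then sweep for 25 steps

samples = memory_usage((run_simulation, (), {}), interval=1.0)  # RSS samples in MiB
print("peak %.1f MiB, net growth %.1f MiB" % (max(samples), samples[-1] - samples[0]))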

Later revisions of Trilinos 12 remain defiant (though compiling swig-3.0.8 has 
improved matters). If one of them compiles and works, I will let you know. If 
not, Mike, could you test the next incremental version (11.10.2), if possible?

Trevor



From: fipy-boun...@nist.gov  on behalf of Michael Waters 

Sent: Wednesday, March 30, 2016 4:36 PM
To: FIPY
Subject: Re: Memory Leaks with Trilinos

Hi Jon,

I just compiled an old version of swig (2.0.8) and built Trilinos
(11.10.1) against it. Sadly, I am still having the leak.

I am out of ideas for the day... and should be looking for a postdoc
anyway.

Thanks,
-mike


On 3/30/16 3:32 PM, Guyer, Jonathan E. Dr. (Fed) wrote:

No worries. If building trilinos doesn't blindside you with something 
unexpected and unpleasant, you're not doing it right.

I have a conda recipe at 
https://github.com/guyer/conda-recipes/tree/trilinos_upgrade_11_10_2/trilinos 
that has worked for me to build 11.10.2 on both OS X and Docker (Debian?). I 
haven't tried to adjust it to 12.x, yet.


On Mar 30, 2016, at 2:42 PM, Michael Waters  wrote:


Hi Jon,

I was just reviewing my version of Trilinos 11.10 and discovered that there is 
no way that I compiled it last night after exercising. It has unsatisfied 
dependencies on my machine. So I must apologize, I must have been more tired 
than I thought.

Sorry for the error!
-Mike Waters

On 3/30/16 11:52 AM, Guyer, Jonathan E. Dr. (Fed) wrote:

It looked to me like steps and accuracy were the way to do it, but my runs 
finish in one step, so I was confused. When I change to accuracy = 10.0**-6, it 
takes 15 steps, but still no leak (note: the hiccup in RSS and in ELAPSED time 
is because I put my laptop to sleep for a while, but VSIZE is rock-steady).

The fact that things never (or slowly) converge for you and Trevor, in addition 
to the leak, makes me wonder if Trilinos seriously broke something between 11.x 
and 12.x. Trevor's been struggling to build 12.4. I'll try to find time to do 
the same.

In case it matters, I'm running on OS X. What's your system?

- Jon

On Mar 29, 2016, at 3:59 PM, Michael Waters  wrote:


When I did my testing and made those graphs, I ran Trilinos in serial;
Syrupy didn't seem to track the other processes' memory. I watched in
real time as the parallel version ate all my RAM, though.

To make the program run longer without changing the memory footprint:

steps = 100  # increase this (limits the number of self-consistent iterations)
accuracy = 10.0**-5  # make this smaller (relative energy-eigenvalue change
                     # required to be considered converged)
initial_solver_iterations_per_step = 7  # reduce this to 1 (number of solver
                     # iterations per self-consistent iteration; too small and
                     # it's slow, too high and the solutions are not stable)

I did those tests on a machine with 128 GB of RAM, so I wasn't expecting
any swapping.

Thanks,
-mike


On 3/29/16 3:38 PM, Guyer, Jonathan E. Dr. (Fed) wrote:

I guess I spoke too soon. FWIW, I'm running Trilinos version: 11.10.2.


On Mar 29, 2016, at 3:34 PM, Guyer, Jonathan E. Dr. (Fed) 
 wrote:


I'm not seeing a leak. The results below are for Trilinos. VSIZE grows to about 
11 MiB and saturates, and RSS saturates at around 5 MiB. VSIZE is more relevant 
for tracking leaks, as RSS is deeply tied to your system's swapping architecture 
and to whatever else is running; either way, neither seems to be leaking, but 
this problem does use a lot of memory.

What do I need to do to get it to run longer?



On Mar 25, 2016, at 7:16 PM, Michael Waters  wrote:


Hello,

I still have a large memory leak when using Trilinos. I am not sure where to 
start looking, so I made an example that reproduces my problem in the hope that 
someone can help me.

But! my example is cool. I implemented Density Functional Theory in FiPy!

My code is slow, but it runs in parallel and is simple (relative to most DFT 
codes). The example I have attached is just a lithium atom and a hydrogen atom. 
The electrostatic boundary conditions are goofy but work well enough for 
demonstration purposes. If you set use_trilinos to True, the code will slowly 
use more memory.
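
In case it helps anyone reading along: the flag just switches which solver 
suite FiPy uses, roughly like the sketch below (the exact wiring in the 
attached script may differ):

use_trilinos = True
if use_trilinos:
    from fipy.solvers.trilinos import LinearGMRESSolver as Solver
else:
    from fipy.solvers.scipy import LinearGMRESSolver as Solver
solver = Solver(tolerance=1e-10, iterations=1000)
# ... later passed in as eqn.sweep(var=phi, dt=dt, solver=solver)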

RE: Memory Leakage & Object Build-up with FiPy Sweeps

2016-10-11 Thread Campbell, Ian
Dear Jonathan, Daniel,

Thank you for your responses. Just yesterday, I discovered and solved this 
problem (though another remains). It wasn't a result of the calls to .sweep. The 
time-varying boundary condition for one of the PDEs was being re-defined within 
the time-stepping loop using the .faceGrad.constrain() method of that PDE's 
solution variable. This led to a net creation of objects with every timestep, 
irrespective of how often the garbage collector was called, and that in turn 
caused the slow-down of the simulation.

The solution was instead to update the value of the boundary condition with the 
.setValue() method within the time-stepping loop, as below.

# Outside the loop, declare a FaceVariable to hold the value of the BC:
species_flux_neg_particle_surf = FaceVariable(mesh=p2d_mesh)
# Apply that variable, once, as the constraint on the top faces of the mesh:
Cs_p2d.faceGrad.constrain(species_flux_neg_particle_surf, where=p2d_mesh.facesTop)
# ...
# Within the time-stepping loop, update the boundary condition using .setValue:
species_flux_neg_particle_surf.setValue(my_new_BC_value)
# Enjoy not creating new objects
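
For context, this is roughly how it sits in the time-stepping loop (a sketch; 
compute_flux, my_eqn, n_steps and dt are stand-ins, not the names from my 
actual script):

for step in range(n_steps):
    # Update only the value; the constraint created before the loop still
    # references this FaceVariable, so no new FiPy objects are built here.
    species_flux_neg_particle_surf.setValue(compute_flux(step))
    res, sweeps = 1e10, 0
    while res > 1e-4 and sweeps < 100:
        res = my_eqn.sweep(var=Cs_p2d, dt=dt)   # one of the six equations shown
        sweeps += 1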

I haven't yet had time to finish producing a new vprof memory consumption plot 
for comparison. However, it's clear from Pympler’s 
SummaryTracker().print_diff() function that this change to the way the BC is 
updated solved the memory leak issue.
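
The check itself is only a few lines. A sketch (everything except the Pympler 
calls is a placeholder for the real loop body):

from pympler import tracker

object_tracker = tracker.SummaryTracker()
for step in range(number_of_timesteps):
    # ... update the BC with .setValue() and sweep the six PDEs ...
    object_tracker.print_diff()   # objects created/destroyed since the last call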

For comparison, here are the number of objects and the CPU time per timestep, 
plotted over three seconds of simulation time, first using the 
.faceGrad.constrain() method and then using the .setValue() method for the 
update. The now-stable number of objects illustrates the fixed leak.
faceGrad.constrain, leaking: https://goo.gl/3LqSm7
setValue, memory leak fixed: https://goo.gl/6kQMjH

There is a new issue which also slows the simulation to an unusable level, 
described below.

With the memory leak solved, I was able to run the simulation well beyond three 
seconds, and discovered that the number of sweeps required per timestep begins 
to increase exponentially after around 120 s of simulation time. It seems that 
this, in turn, pulls up the CPU time required per timestep. The plot at the 
following link illustrates the problem: https://goo.gl/G9DD5r

I do not know why this happens. It's clear that a memory leak is no longer the 
cause: the number of objects is relatively constant (varying only slightly 
between garbage-collector cycles). Plotting the residuals returned by the .sweep 
function for each of the six PDEs, at 1 s and at 150 s into the simulation, 
provides some insight into the stability of convergence. Each subplot in the 
figure corresponds to one of the six PDEs: https://goo.gl/Nnm7Si
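
(For reference, those residual traces were collected with a loop of roughly 
this shape; a sketch, where eqns, solution_vars, dt and max_sweeps stand in 
for the actual objects in my script:)

sweep_history = []                       # one list of six residuals per sweep
residuals = [1.0] * len(eqns)
sweeps = 0
while max(residuals) > 1e-4 and sweeps < max_sweeps:
    residuals = [eqn.sweep(var=v, dt=dt) for eqn, v in zip(eqns, solution_vars)]
    sweep_history.append(residuals)
    sweeps += 1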

At 150 s, the residuals are still decreasing with sweeping towards the tolerance 
(1e-4), but at a much slower rate. Do you know why this might be happening?

With best regards,

 - Ian

-----Original Message-----
From: fipy-boun...@nist.gov [mailto:fipy-boun...@nist.gov] On Behalf Of Guyer, 
Jonathan E. Dr. (Fed)
Sent: 11 October 2016 16:37
To: FIPY 
Subject: Re: Memory Leakage & Object Build-up with FiPy Sweeps

I have access to their code. Ian, please provide an explicit recipe for 
demonstrating the leak with the code in your github repo.

- Jon

> On Oct 11, 2016, at 11:15 AM, Daniel Wheeler  
> wrote:
> 
> Hi Ian,
> 
> Could you possibly post your code or a version of the code that demonstrates 
> the problem? Also, do you have the same issue with different solver suites?
> 
> Cheers,
> 
> Daniel

Re: Memory Leakage & Object Build-up with FiPy Sweeps

2016-10-11 Thread Michael Waters
Hi all,

Could Ian's issue be related to the issues I came across at the end of 
March? Trevor Keller seems to have isolated my memory leak to versions 
of Trilinos newer than 12.0.

Trevor, it seems that I never got around to testing an older version of 
Trilinos; I'll do that now.

-Mike


On 10/11/2016 10:36 AM, Guyer, Jonathan E. Dr. (Fed) wrote:
> I have access to their code. Ian, please provide an explicit recipe for 
> demonstrating the leak with the code in your github repo.
>
> - Jon


Re: Memory Leakage & Object Build-up with FiPy Sweeps

2016-10-11 Thread Guyer, Jonathan E. Dr. (Fed)
I have access to their code. Ian, please provide an explicit recipe for 
demonstrating the leak with the code in your github repo.

- Jon

> On Oct 11, 2016, at 11:15 AM, Daniel Wheeler  
> wrote:
> 
> Hi Ian,
> 
> Could you possibly post your code or a version of the code that demonstrates 
> the problem? Also, do you have the same issue with different solver suites?
> 
> Cheers,
> 
> Daniel


Re: Memory Leakage & Object Build-up with FiPy Sweeps

2016-10-11 Thread Daniel Wheeler
Hi Ian,

Could you possibly post your code or a version of the code that
demonstrates the problem? Also, do you have the same issue with different
solver suites?

Cheers,

Daniel



On Fri, Sep 30, 2016 at 12:41 PM, Campbell, Ian  wrote:

> Hi All,
>
>
>
> We are sweeping six PDEs in a time-stepping loop. We’ve noticed that as
> CPU time progresses, the duration of each time-step increases, although the
> sweep count remains constant. This is illustrated in the Excel file of data
> logged from the simulation, which is available at the first hyperlink below.
>
>
>
> Hence, we suspected a memory leak may be occurring. After conducting
> memory-focused line-profiling with the vprof tool, we observed a linear
> increase in total memory consumption at a rate of approximately 3 MB per
> timestep loop. This is evident in the graph at the second link below, which
> illustrates the memory increase over three seconds of simulation.
>
>
>
> As a further step, we used Pympler to investigate the source of the RAM
> consumption increase for each timestep. The table below is an output from
> Pympler’s SummaryTracker().print_diff(), which describes the additional
> objects created within every time-step. Clearly, ~3.2 MB of additional data
> are generated with every loop – this correlates perfectly with the total
> rate of increase of memory consumption reported by vprof. Although we are
> not yet sure, we suspect that the increasing time spent per loop is the
> result of this apparent memory leak.
>
>
>
> We suspect this is the result of the calls to .sweep, since we are not
> explicitly creating these objects. Can the origin of these objects be
> traced, and furthermore, is there a way to avoid re-creating them and
> consuming more memory with every loop?  Without some method of unloading or
> preventing this object build-up, it isn’t feasible to run our simulation
> for long durations.
>
> Type                            # objects   Total size
> dict                                 2684    927.95 KB
> type                                 1716    757.45 KB
> tuple                                9504    351.31 KB
> list                                 4781    227.09 KB
> str                                  2582     210.7 KB
> numpy.ndarray                         396    146.78 KB
> cell                                 3916    107.08 KB
> property                             2288     98.31 KB
> weakref                              2287     98.27 KB
> function (getName)                   1144     67.03 KB
> function (getRank)                   1144     67.03 KB
> function (_calcValue_)               1144     67.03 KB
> function (__init__)                  1144     67.03 KB
> function (_getRepresentation)        1012      59.3 KB
> function (__setitem__)                572     33.52 KB
> SUM                                          3285.88 KB
>
>
>
>
>
> https://imperialcollegelondon.box.com/s/zp9jj67du3mxdcfgbc4el8cqpxwnv0y4
>
>
>
> https://imperialcollegelondon.box.com/s/ict9tnswqk9z57ovx8r3ll5po5ccrib9
>
>
>
> With best regards,
>
>
>
> -  Ian & Krishna
>
>
>
> P.S. Daniel, thank you very much for the excellent example solution you
> provided in response to our question on obtaining the sharp discontinuity.
>
>
>
> Ian Campbell | PhD Candidate
>
> Electrochemical Science & Engineering Group
>
> Imperial College London, SW7 2AZ, United Kingdom


-- 
Daniel Wheeler
___
fipy mailing list
fipy@nist.gov
http://www.ctcms.nist.gov/fipy
  [ NIST internal ONLY: https://email.nist.gov/mailman/listinfo/fipy ]