On 2014-01-15 22:36, Mikael Mortensen wrote:
I have had a few test runs and compared my memory usage to a
Navier-Stokes solver from OpenFOAM. Some results thus far for a box of
size 40**3 are (using /proc/pid/statm to measure memory usage, see
https://github.com/mikaem/fenicstools/blob/master/fem/getMemoryUsage.cpp
[2]):
1 CPU
OpenFOAM: 324 MB RSS (443 MB VM)
Fenics: 442 MB RSS (mesh 16 MB) + 60 MB for importing dolfin (1.0 GB VM)
2 CPUs
OpenFOAM: 447 MB (1.6 GB VM)
Fenics: 520 MB (mesh 200 MB) + 120 MB for importing dolfin (1.7 GB VM)
4 CPUs
OpenFOAM: 540 MB RSS (2.0 GB VM)
Fenics: 624 MB (mesh 215 MB) + 250 MB for importing dolfin (3.0 GB VM)
Looks pretty good.
This is after Garth's removal of the d--0--d connectivity, and I'm not
attempting to clean out topology or anything after the mesh has been
created. I'm actually very happy with these results :-) More CPUs
require more memory, but it increases more for OpenFOAM than for
Fenics, and the difference is not all that much. Importing dolfin is a
considerable part of the total cost, but that cost does not increase
with mesh size, so it's not of much concern (also, this depends on how
many dependencies you carry around). I added virtual memory in
parentheses, but I'm not sure what to make of it.
Some other things I have learned (very) recently:
1) Clearing out memory during runtime is apparently not
straightforward because, in an effort to optimize speed (allocation is
slow), the system may choose to keep your previously allocated memory
close (in RAM). Even with the swap trick the operating system may
actually choose to keep the allocated memory available and not free it
(see, e.g.,
http://stackoverflow.com/questions/6526983/free-memory-from-a-long-vector
[3]). Apparently there is an environment variable, GLIBCXX_FORCE_NEW,
that is intended to disable this caching of memory, but I have not been
able to make it work. With the swap trick memory is definitely freed
quite efficiently, but it is frustrating trying to figure out where
memory goes when we are not in total control of allocation and
deallocation.
One example: I use project to initialize a Function. During project,
RSS memory increases substantially due to, e.g., a matrix being
created behind the scenes. One would expect this memory to be freed
after finishing, but that does not always happen. The additional
memory may be kept in RAM and reused later. If, for example, I create a
matrix just after project, then memory use may actually not increase
at all, because the OS can just reuse the memory it did not free. I
guess this is something one has to live with when using dynamic memory
allocation? I really know very little about how these things work, so
it would be great if someone who does could comment.
I wouldn't worry about this. It can be hard to measure memory via the OS
reporting tools because of the way the OS manages memory, as you
describe above. Just leave it to the OS. You could try KCachegrind to
measure and visualise where memory is being used in DOLFIN.
2) Both MeshGeometry.clear() and MeshTopology.clear() should probably
make use of the swap trick instead of std::vector::clear(). At least
then the memory *may* be freed when clearing the mesh.
We could bite the bullet and start using C++11. It has a shrink_to_fit
function:
http://en.cppreference.com/w/cpp/container/vector/shrink_to_fit
This would be useful when we clear a container and when we know that no
more entries will be added.
3) Memory use for creating a parallel UnitCubeMesh(40, 40, 40) goes
like this:
16 MB in creating the serial mesh
56 MB in creating LocalMeshData class
41 MB in creating cell partition in
MeshPartitioning::build_distributed_mesh
101 MB in creating mesh from LocalMeshData and cell partitioning in
MeshPartitioning::build
No memory is freed after the mesh has been created, even though it
looks to me like LocalMeshData goes out of scope?
LocalMeshData should be destroyed. Take a look with KCachegrind. There
is also gperftools:
https://code.google.com/p/gperftools/
for memory profiling.
Mikael
PS Not really relevant to this test, but my NS solver is 20% faster
than OpenFOAM for this channel-flow test case :-)
Nice. On the same mesh? Does your solver have an extra order of accuracy
over OpenFOAM?
Garth
On Jan 15, 2014, at 10:48 AM, Garth N. Wells wrote:
On 2014-01-15 07:13, Anders Logg wrote:
On Tue, Jan 14, 2014 at 08:08:50PM +0000, Garth N. Wells wrote:
On 2014-01-14 19:28, Anders Logg wrote:
On Tue, Jan 14, 2014 at 05:19:55PM +0000, Garth N. Wells wrote:
On 2014-01-14 16:24, Anders Logg wrote:
On Mon, Jan 13, 2014 at 07:16:01PM +0000, Garth N. Wells wrote:
On 2014-01-13 18:42, Anders Logg wrote:
On Mon, Jan 13, 2014 at 12:45:11PM +0000, Garth N. Wells
wrote:
I've just pushed some changes to master, which for
from dolfin import *
mesh = UnitCubeMesh(128, 128, 128)
mesh.init(2)
mesh.init(2, 3)
give a factor 2 reduction in memory usage and a factor 2
speedup.
Change is at
https://bitbucket.org/fenics-project/dolfin/commits/8265008
[1].
The improvement is primarily due to d--d connectivity not being
computed. I'll submit a pull request to throw an error when d--d
connectivity is requested. The only remaining place where d--d is
(implicitly) computed is in the code for finding constrained mesh
entities (e.g., for periodic bcs). The code in question is
for (MeshEntityIterator e(facet, dim); ...)
when dim is the same as the topological dimension of the facet. As
in other places, it would be consistent (and the original intention
of the programmer) if this just iterated over the facet itself
rather than all facets attached to it via a vertex.
I don't see why an error message is needed. Could we not just add the
possibility to specify what d--d means? It might be useful for other
algorithms to be able to compute that data.
I think it should be removed because it is (i) ad hoc, (ii) not
used/required in the library and (iii) a memory monster.
Moreover, we have dimension-independent algorithms that work when
d--d connectivity is a connection from an entity to itself (for
which we don't need computation and memory eating). We shouldn't
have an unnecessary, memory-monster d--0--d data structure being
created opaquely for no purpose, which is what
for (MeshEntityIterator e(entity, dim); ...) { ... }
does at present when (dimension of entity) = dim. The excessive
memory usage is an issue for big problems.
If a user wants d--0--d it can be built explicitly, which makes both
the purpose and the intention clear.
How can it be built explicitly without calling mesh.init(d, d)?
Build d--0, then 0--d:
for (MeshEntityIterator e0(mesh, d); ...)
  for (VertexIterator v(*e0); ...)
    for (MeshEntityIterator e2(*v, d); ...)
Yes, that's even more explicit, but since we already have a function
that computes exactly that, I don't see the urge to remove it.
Note that d--0--d is no longer used in DOLFIN (except by accident in
finding periodic bcs because of inconsistent behind-the-scenes
behaviour, and it should be changed because it's expensive and not
required for periodic bcs).
Not true - see line 67 of Extrapolation.cpp.
That code is not covered by any test or demo.
Yes it is, by a unit test under test/unit/adaptivity/ and all the
auto-adaptive demos, including the documented demo
auto-adaptive-poisson.
OK. I don't know why it didn't trigger a failure before when I added
an error when d0==d1 and ran the tests.
Having easy access to neighboring cells (however one chooses to define
what it means to be a cell neighbor of a cell) can be useful to many
algorithms that perform local searches or solve local problems.
Sure, but again the user should decide how a connection is
defined.
Yes, I agree.
Again, I don't see the urge to remove this function just because it
consumes a lot of memory. The important point is to avoid using it
for standard algorithms in DOLFIN, like boundary conditions.
That's not a good argument. If poor/slow/memory-hogging code is left
in the library, it eventually gets used in the library despite any
warnings. It's happened time and time again.
I have never understood this urge to remove all functionality that
can potentially be misused. We are adults, so it should be enough to
write a warning in the documentation.
This has proven not to be enough in the past. Moreover, it is good to
make it easy for users to write good code by not providing functions
that are easily misused, and it's good to make it easy to write code
that works well beyond small serial cases.
Garth
The proper way to fix functions that should not use this particular
function (for example, periodic bcs) would be to create an issue for
the misuse.
The periodic bc code does not misuse the function. The flaw is that
the mesh library magically and unnecessarily creates a memory
monster behind the scenes.
I agree with that.
--
Anders
_______________________________________________
fenics mailing list
[email protected]
http://fenicsproject.org/mailman/listinfo/fenics
Links:
------
[1] https://bitbucket.org/fenics-project/dolfin/commits/8265008
[2] https://github.com/mikaem/fenicstools/blob/master/fem/getMemoryUsage.cpp
[3] http://stackoverflow.com/questions/6526983/free-memory-from-a-long-vector