Hello,

Thanks for the fast answer to my problem. I played around with the different
Compressed Sparsity Patterns. As a test case I used the same snippet of code
as in my previous email and put an infinite loop at the end (see below) so
that I can check the memory consumption by the "top" command in the shell. 

1 {
2     BlockCompressedSetSparsityPattern csp (2, 2);
3     ...
4     DoFTools::make_sparsity_pattern (stokes_dof_handler, coupling,
5                                      csp);
6     ...
7 }
8 do {} while (true);

If I use BlockCompressedSparsityPattern I get 0.7% memory consumption (of a
total of 32 GB). If I use BlockCompressedSetSparsityPattern I get 6.8%.
Shouldn't the memory consumption be the same in both cases once the
destructor of the sparsity pattern has been called, at line 7? If it is not,
then I do not really understand what the sparsity pattern is doing and which
object has allocated the memory.

Could you help me?

Thanks in advance
Florian   




-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf
Of Martin Kronbichler
Sent: Thursday, 16 October 2008 14:10
To: [email protected]
Subject: Re: [deal.ii] Parallelization

Dear Florian,

> {
>      BlockCompressedSetSparsityPattern csp (2,2);
>  ...
>     DoFTools::make_sparsity_pattern (stokes_dof_handler, coupling,
> csp);
>  ...
>    }
> 
>  
> 
> I found out that the memory increases during the call
> DoFTools::make_sparsity_pattern (stokes_dof_handler, coupling, csp);
> 
> But when I check the memory consumption of the dof_handler it seems to
> remain the same. So which object needs so much memory (because the
> memory of csp should be released at the end of the function)?

> Is there a possibility to circumvent this huge memory consumption, so
> that I have got a chance to run my program for larger problems?

I think your problem is the way the sparsity pattern is constructed by
the CompressedSetSparsityPattern class. You should be able to handle many
more dofs if you use BlockCompressedSparsityPattern instead, although that
class is much slower. T. Heister has recently implemented a class
CompressedSimpleSparsityPattern which could be worth a try: it seems to
use not much more memory than CompressedSparsityPattern, but should be
significantly faster (I'm not sure whether all interfaces to it are
working yet)...

Then there is another point (at least in the current version of
step-32): as the Trilinos interfaces are implemented now, each processor
has to own the full sparsity pattern in order to create a distributed
matrix. The way step-32 does this right now, namely generating the whole
sparsity pattern on all processors at the same time, makes things even
worse, since on usual clusters up to 8 processors share the memory, which
decreases the available memory per process by a factor of 4 or 8.
Eventually this should be improved, but I don't know if anyone has
looked at it or if there is a solution on the distributed_grids branch
that could easily be used here...

Best,
Martin



