On Mon, 1 Sep 2008, Tim Kroeger wrote:

> Since I didn't get any reply yet, I am not sure whether you got the mail 
> below.  On the other hand, perhaps you just didn't answer because you found 
> there was nothing to say.

Actually, neither was the case - I didn't answer because I had very
little time to check email over the last week, and a proper answer to
yours required a little time to look over things.

> Anyway, the thing remains a major problem for me, in particular because the 
> computation keeps running out of memory for the number of elements I really 
> want to use -- no matter how many CPUs I am using. (I am not completely sure 
> whether this is really due to the serial vector in EquationSystems::reinit(), 
> but I have no other idea what the reason could be.)

We instantiate a couple other serial vectors in a typical code (the
current_local_solution and its value at the previous timestep) - these
don't have the CPU time scalability issues that the
System::project_vector temporary does because only O(N/Nproc) of their
entries are regularly accessed, but they have the same memory
scalability problems.

Typically the biggest problem for memory scalability is the
SerialMesh - a coefficient or two per degree of freedom is still less
than the many pointers per element and per node that our unstructured
mesh class requires.  Unfortunately ParallelMesh probably won't work
for you yet - it's not well tested in general and it's definitely got
remaining bugs in certain adaptive coarsening cases.

> Unfortunately, I feel not able to restructure that projection method
> myself, since I am not familiar with that part of libMesh.  However,
> if you give me some advice (where to look for similar code etc.), I
> might try it.

For fixing the runtime scalability of project_vector, I'm actually
working on the easiest improvement to that right now: we'll keep
creating the serial vector, but localize to it with a properly built
send_list instead of doing a global localization.  This is the same
thing we do with the other serial vectors, and it should be
sufficient.  We want O(N/Nproc) allocation eventually, but O(N/Nproc)
communication is more pressing.
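
To make the difference concrete, here is a rough standalone sketch in
plain C++. It is not libMesh code: DistributedVector, localize_all and
localize_with_send_list are names made up for illustration, and the
"processors" are just chunks of one array. It only shows that the
serial vector is still allocated at full length N in both cases, while
the number of entries copied drops from N to the length of the
send_list.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a distributed vector: each "processor" owns
// one contiguous chunk of the global index range.
struct DistributedVector {
  std::vector<std::vector<double>> chunks;  // chunks[p] = entries owned by p

  // Total number of entries across all processors (N).
  std::size_t size() const {
    std::size_t n = 0;
    for (const auto& c : chunks) n += c.size();
    return n;
  }

  // Look up a global index by walking the chunks in order.
  double get(std::size_t i) const {
    for (const auto& c : chunks) {
      if (i < c.size()) return c[i];
      i -= c.size();
    }
    assert(false && "global index out of range");
    return 0.0;
  }
};

// Global localization: every entry is copied into the serial vector.
// Returns the number of entries transferred, which is N.
std::size_t localize_all(const DistributedVector& v,
                         std::vector<double>& serial) {
  serial.assign(v.size(), 0.0);
  for (std::size_t i = 0; i < serial.size(); ++i) serial[i] = v.get(i);
  return serial.size();
}

// send_list localization: the serial vector is still allocated with N
// entries, but only the indices named in send_list are transferred.
std::size_t localize_with_send_list(const DistributedVector& v,
                                    const std::vector<std::size_t>& send_list,
                                    std::vector<double>& serial) {
  serial.assign(v.size(), 0.0);  // O(N) allocation remains
  for (std::size_t i : send_list) serial[i] = v.get(i);
  return send_list.size();       // O(N/Nproc) if the list is built properly
}
```

Entries outside the send_list are simply left at zero here; code that
reads them would be reading values that were never communicated.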

I'll let you know when I commit that to SVN - because your code seems
to be hitting this bottleneck most strongly I'd appreciate it if you
would help with benchmarking/debugging.

Note to Ben: this is going to be a sufficiently complex change that I
don't think it should go into 0.6.3.  So depending on whether that
mesh I/O problem I found was a regression or just a corrupted xdr
file, I think we should either backport the fix or re-label the
0.6.3-rc1 as 0.6.3 final.

Tim: although I wouldn't recommend digging into System::project_vector
or into ParallelMesh, the one remaining obstacle to O(N/Nproc) memory
scalability in libMesh is those serial vectors.  We currently allocate
a global vector then only fill the parts of it that correspond to
local and ghost dofs - simply because we don't have the kind of
"SparseVector" data structure that would be necessary to do that
efficiently.  Ben tells me that PETSc has something reasonable
available (using for internal storage a single block for local
coefficients plus a sparse structure for ghost coefficients), but
we'd need a libMesh interface to that (while maintaining compatibility
with LASPACK, Trilinos, and our internal vector formats!)  If you're
looking to volunteer for something, this is the one place where our
scalability really needs improvement but where nobody's currently
working on it.
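
For what it's worth, the storage layout I have in mind looks roughly
like the sketch below. This is plain C++ under my own assumptions, not
the PETSc data structure and not an existing libMesh class: the name
GhostedVector and all of its members are hypothetical. It keeps one
contiguous block for the locally owned coefficients plus a hash map for
the ghost coefficients, so storage is proportional to local + ghost
entries rather than to the global size.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical sketch: a vector that stores only local and ghost entries
// but is still addressed by global index.
class GhostedVector {
public:
  GhostedVector(std::size_t first_local, std::size_t n_local,
                const std::vector<std::size_t>& ghost_indices)
    : first_(first_local), local_(n_local, 0.0) {
    for (std::size_t g : ghost_indices) {
      assert(!is_local(g));  // ghosts must lie outside the local range
      ghosts_[g] = 0.0;
    }
  }

  // True if global index i is owned by this processor.
  bool is_local(std::size_t i) const {
    return i >= first_ && i < first_ + local_.size();
  }

  // True if global index i is stored here at all (local or ghost).
  bool has(std::size_t i) const {
    return is_local(i) || ghosts_.count(i) != 0;
  }

  // Access by global index; throws std::out_of_range (from at()) for an
  // index that is neither local nor ghosted.
  double& operator()(std::size_t i) {
    if (is_local(i)) return local_[i - first_];
    return ghosts_.at(i);
  }

  // Number of coefficients actually stored.
  std::size_t n_stored() const { return local_.size() + ghosts_.size(); }

private:
  std::size_t first_;                               // first locally owned index
  std::vector<double> local_;                       // contiguous local block
  std::unordered_map<std::size_t, double> ghosts_;  // sparse ghost storage
};
```

A real implementation would also need the communication step that
refreshes the ghost values from their owning processors, which this
sketch leaves out entirely.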

> I hope this information is sufficient for you.  If not, please let
> me know what else you need.

I could use some more fine-grained perf logging.  You don't need to
create your own PerfLog object; using the global log with
START_LOG/STOP_LOG would be fine.  While your current test establishes
that the combination of distribute_dofs, create_dof_constraints, and
prolong_vectors is responsible for your poor performance, I'd like to
verify that System::project_vector is the problem and that the
localization in particular is what's not scaling.  Specifically, try
wrapping a log around lines 79 through 107 of system_projection.C and
make sure that the scalability failure is there.
---
Roy
