Nope - it doesn't appear to be at the processor boundary only... but I
would have to study it more.
I really think that with a more careful algorithm we could get a much
better estimate (if not the right answer).
Derek
On Thu, Nov 27, 2014 at 12:19 AM, Dmitry Karpeyev <dkarp...@gmail.com>
wrote:
>
> Presumably, this overestimation happens only at the "boundary" nodes i
> that are contained in elements living on other MPI ranks? Those foreign
> ranks will count couplings (edges) i-j that are shared by their elements
> with the elements on rank p that owns i. Since only edge counts are
> communicated back to p, there is no way to eliminate these duplicates.
> Could we build a full sparsity pattern for such nodes _only_? That way
> the memory issues can be controlled, yet the duplicates would be
> eliminated. You would, however, need to communicate the edges rather
> than just their counts.
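>
> Roughly something like this, maybe (just a sketch; the names here are
> made up, not the actual libMesh API). Keying a set on the (row, column)
> pair means an edge reported by several ranks is only counted once:
>
>   #include <cstdint>
>   #include <map>
>   #include <set>
>   #include <utility>
>   #include <vector>
>
>   typedef std::uint64_t dof_id;  // stand-in for the real dof id type
>
>   // Edges received for the boundary rows this rank owns.  A set keyed
>   // on the (row, column) pair collapses duplicates from different ranks.
>   std::set<std::pair<dof_id, dof_id> > boundary_edges;
>
>   void receive_edges(const std::vector<std::pair<dof_id, dof_id> > & incoming)
>   {
>     boundary_edges.insert(incoming.begin(), incoming.end());
>   }
>
>   // Exact per-row counts for the boundary rows, duplicates eliminated.
>   std::map<dof_id, unsigned int> boundary_row_counts()
>   {
>     std::map<dof_id, unsigned int> counts;
>     std::set<std::pair<dof_id, dof_id> >::const_iterator it;
>     for (it = boundary_edges.begin(); it != boundary_edges.end(); ++it)
>       ++counts[it->first];
>     return counts;
>   }
>
> Since the set only needs to exist for the boundary rows, the memory cost
> stays bounded.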
>
> Dmitry.
>
> On Wed, Nov 26, 2014, 22:34 Derek Gaston <fried...@gmail.com> wrote:
>
> Ben Spencer (copied on this email) pointed me to a problem he was having
> today with some of our sparsity pattern augmentation stuff. It was
> causing PETSc to error out, saying that the number of nonzeros on a
> processor was more than the number of entries in a row for that
> processor. The weird part is that this didn't happen if he just ran on
> one processor...
>
> Thinking that the problem was in our code (I believed we might have been
> double counting somewhere), I started tracing this problem tonight... and
> what I found is that libMesh is grossly overestimating the number of
> nonzeros per row when running in parallel. And since our code is set up
> to believe that libMesh is producing the "perfect" number of nonzeros per
> row, we blindly add to an already inflated number, which pushes us past
> the size of the row...
>
> Here's what's happening when running on 8 processors (for the one DoF
> that I'm tracing... which is #168):
>
> 1. DofMap::operator() is computing the correct number of nonzeros (in
> this case 60).
>
> I'm taking 60 as the correct number because that's the final number for
> this DoF when it's run in serial (I haven't actually dug in to see which
> DoF this is and manually computed the sparsity pattern... yet).
>
> Judging by this, it seems that all of the dofs connected to #168 must be
> local (again, not completely verified... but the fact that
> DofMap::operator() comes up with the same number as the final serial
> number is a good indicator).
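>
> (Conceptually, the serial count is just the size of a union over the
> elements containing the dof -- something like this sketch, which is not
> the actual libMesh code:)
>
>   #include <cstddef>
>   #include <cstdint>
>   #include <set>
>   #include <vector>
>
>   typedef std::uint64_t dof_id;  // stand-in for the real dof id type
>
>   // Row i's nonzeros are the union of all dofs living on any element
>   // that contains dof i; the set removes duplicates from shared elements.
>   unsigned int serial_row_nonzeros(
>     const std::vector<std::vector<dof_id> > & dofs_per_element)
>   {
>     std::set<dof_id> coupled;
>     for (std::size_t e = 0; e != dofs_per_element.size(); ++e)
>       coupled.insert(dofs_per_element[e].begin(),
>                      dofs_per_element[e].end());
>     return static_cast<unsigned int>(coupled.size());
>   }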
>
> 2. SparsityPattern::Build::parallel_sync() totally screws up.
>
> Putting print statements around line 3076 in dof_map.C, I can see that
> n_nz for #168 goes up to 117! Even worse... n_oz ALSO goes up to 117!
>
>
>
> Remember: the correct number for n_nz + n_oz should be _60_. With 117 +
> 117 = 234, we are basically going to tell PETSc to set aside about 4x as
> much memory for that row as is necessary.
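>
> To illustrate the mechanism (a conceptual sketch, not the actual
> parallel_sync() code): only counts come back from the other ranks, so
> there is nothing to deduplicate against:
>
>   #include <numeric>
>   #include <vector>
>
>   // Each remote rank reports only a count of couplings for the row, so
>   // an edge i-j shared by elements on two ranks shows up in both counts
>   // and gets added twice.  The column indices were never sent, so the
>   // duplicates cannot be subtracted back out.
>   unsigned int synced_count(unsigned int local_count,
>                             const std::vector<unsigned int> & remote_counts)
>   {
>     return std::accumulate(remote_counts.begin(), remote_counts.end(),
>                            local_count);
>   }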
>
> 3. n_nz and n_oz get chopped down around line 3076 in dof_map.C...
>
> n_nz gets chopped so that it's min(n_nz, n_dofs_on_proc). In my case,
> n_dofs_on_proc on that processor is 108, so n_nz gets set to that.
>
> n_oz gets chopped so that it's min(n_oz, dofs_not_on_proc). You can see
> how that could be _very_ bad!
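>
> In other words, something like this (a sketch with made-up names; the
> real loop is in dof_map.C):
>
>   #include <algorithm>
>   #include <cstddef>
>   #include <vector>
>
>   // Hypothetical signature; sketch of the clamping described above.
>   void clamp_counts(std::vector<unsigned int> & n_nz,
>                     std::vector<unsigned int> & n_oz,
>                     unsigned int n_dofs_on_proc,
>                     unsigned int n_dofs_not_on_proc)
>   {
>     for (std::size_t i = 0; i != n_nz.size(); ++i)
>       {
>         // A row can't couple to more on-processor dofs than exist
>         // here...
>         n_nz[i] = std::min(n_nz[i], n_dofs_on_proc);
>         // ...nor to more off-processor dofs than exist everywhere
>         // else -- which can still be a wild overestimate.
>         n_oz[i] = std::min(n_oz[i], n_dofs_not_on_proc);
>       }
>   }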
>
> 4. Now the (overestimated) n_nz and n_oz get passed to MOOSE for
> modification, and we start adding to n_nz/n_oz for dof couplings that
> libMesh definitely didn't know about... but since n_nz is already sitting
> at the maximum possible, we blow past the number of dofs on this proc and
> then PETSc errors (like it should).
>
>
>
> So... my question is this: is this really the best "estimate" we can do in
> this case?
>
> This is a tiny problem in 3D with only 3 variables. It will be MUCH
> worse if you have, say, 2000 variables... you could be telling PETSc to
> allocate ENORMOUS chunks of memory unnecessarily. I know that PETSc
> could throw a bunch of that memory away after the first fill... but we
> don't allow that in MOOSE, because we are often pre-allocating for
> future connections. But even if you were to let it do that, there could
> be a HUGE memory spike at the beginning until PETSc frees up the excess.
>
> It seems like this code is currently a worst-case estimate of what could
> happen. It _does_ look like it might be better if we built the full
> sparsity pattern... but that has its own memory problems.
>
>
>
> Also... it looks like there is a lot more parallel communication going on
> here than necessary. We're sending large vectors of information from proc
> to proc... even in the case where we're not building a full sparsity
> pattern. It seems like each processor could just send a minimal "hey, I
> have this many dofs connected to these rows you own"... i.e., one scalar
> per row instead of a bunch of entries.
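>
> A sketch of what I mean (hypothetical names, not existing libMesh code):
> each rank tallies one small integer per (rank, row) pair before sending
> anything:
>
>   #include <cstddef>
>   #include <cstdint>
>   #include <map>
>   #include <utility>
>   #include <vector>
>
>   typedef std::uint64_t dof_id;   // stand-in for the real dof id type
>   typedef unsigned int proc_id;   // stand-in for the real rank id type
>
>   // For each remote rank: "row you own" -> "how many of my dofs couple
>   // to it".  One integer per (rank, row) instead of full column lists.
>   std::map<proc_id, std::map<dof_id, unsigned int> >
>   tally_remote_couplings(
>     const std::vector<std::pair<proc_id, dof_id> > & couplings)
>   {
>     std::map<proc_id, std::map<dof_id, unsigned int> > tallies;
>     for (std::size_t c = 0; c != couplings.size(); ++c)
>       ++tallies[couplings[c].first][couplings[c].second];
>     return tallies;
>   }
>
> That wouldn't fix the duplicate counting by itself, but it would shrink
> the messages from full column lists down to per-row counts.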
>
> So... should I take a stab at redoing some of this code? I think it's
> possible to get a much better estimate, and to do so with much less
> parallel communication. I probably wouldn't mess with the code that does
> the full sparsity pattern... I would just remove the "non-full" sparsity
> pattern code and make a different function that gets called when you're
> not building a full sparsity pattern. That should probably be done either
> way (look at the huge "if" with duplicated code for each case in
> DofMap::operator()).
>
> Or do one of you guys see a quick fix that does something better?
>
> (Oh - BTW, I'm going to implement the same use of min() in MOOSE's
> sparsity pattern augmentation stuff to get us through for now - so this
> isn't necessarily time sensitive.)
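>
> i.e., something like this in our augmentation code (a sketch, with
> hypothetical names):
>
>   #include <algorithm>
>   #include <cstddef>
>   #include <vector>
>
>   // Hypothetical helper: add our extra couplings on top of libMesh's
>   // estimate, but clamp so we never tell PETSc a row has more
>   // on-processor nonzeros than the processor has dofs.
>   void augment_row(std::vector<unsigned int> & n_nz,
>                    std::size_t row,
>                    unsigned int extra_couplings,
>                    unsigned int n_dofs_on_proc)
>   {
>     n_nz[row] = std::min(n_nz[row] + extra_couplings, n_dofs_on_proc);
>   }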
>
> Derek