That graph is pretty awesome - thanks! I'm gonna have to digest that, but I 
think there could be some small room for improvement with a different data 
structure - if I interpret the steep ramp up to (global_index_map end) as map 
construction.

-Ben


On Oct 30, 2013, at 6:55 PM, "John Peterson" <jwpeter...@gmail.com> wrote:




On Wed, Oct 30, 2013 at 3:27 PM, John Peterson <jwpeter...@gmail.com> wrote:



On Wed, Oct 30, 2013 at 3:09 PM, Kirk, Benjamin (JSC-EG311) 
<benjamin.k...@nasa.gov> wrote:
Yeah, before I get too carried away I should probably just try running the 
existing code path twice:  Once as-is, and again actually commenting out the 
underlying Metis call, making the partitioner a big, expensive no-op.

Actually, John, if you have a chance, could you rerun one of the cases you have 
data for, but just comment out the call to Metis?  Hopefully the memory usage 
will drop, verifying Metis is the issue.

It should suffice to comment out the metis call, and add a

std::fill (part.begin(), part.end(), 0);

instead, provided it's this simple stand-alone case where the mesh is not used!
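
For reference, the no-op replacement described above could be sketched roughly 
as follows.  This is only an illustration: PartitionVector and 
assign_all_to_partition_zero are hypothetical names, not libMesh or Metis 
identifiers, and the sketch assumes "part" holds one integer partition id per 
graph vertex.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the partitioner's output buffer: one entry per
// graph vertex, holding the partition id assigned to that vertex.
using PartitionVector = std::vector<int>;

// No-op "partitioner": instead of calling Metis, put every vertex in
// partition 0.  The graph and weights are still built by the caller, so
// only the cost of the Metis call itself is removed from the measurement.
void assign_all_to_partition_zero(PartitionVector & part)
{
  std::fill(part.begin(), part.end(), 0);
}
```

With this in place of the Metis call, any remaining memory growth would have to 
come from the setup code (index map, graph, weights) rather than from Metis.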

Yep, I can certainly do that, but I think this is already verified just by 
looking at the difference in memory usage between the Centroid/Linear/SFC 
Partitioners and Metis that I posted in one of the prior emails this week.

Here's a link to a plot of total memory usage (across 2 procs) for the 200^3 
case, annotated at different points in the simulation:

https://drive.google.com/file/d/0B9BK7pg8se_iWmloaHNhOTJSNUE/edit?usp=sharing

The plot didn't quite include all the annotations I was expecting, but I do 
have some more precise numbers:

1. before/after building global_index_map: 6653660 - 5615440 K = 0.99 GB 
total, half a gig/core

2. begin/end call to Metis: 7628896 - 7460828 K = 0.16 GB, we actually have 
slightly _more_ memory free when Metis finishes (plus/minus sampling error) so 
I don't think there are any major leaks in Metis

3. The ramp between the "global_index_map end" and "graph alloc" is the time 
when the graph is filled up and when the entries in vwgt, which was allocated 
earlier, are finally being touched.  Could be the OS is finally assigning vwgt 
actual memory during this time?  I would have thought we would recover more 
memory when the graph is deallocated, which happens just before the call to 
PartGraphRecursive (you can see a slight dip there)...
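
The conversions in items 1 and 2 can be reproduced with a small helper.  This 
is a sketch under two assumptions: both pairs of samples are in K, and 
1 GB = 1024 * 1024 K.  The name delta_gb is hypothetical, and the helper 
returns only the magnitude of the change, not its sign.

```cpp
#include <cassert>
#include <cmath>

// Magnitude of the change between two memory samples, converted from K
// to GB using 1 GB = 1024 * 1024 K (an assumed convention).
double delta_gb(long long first_sample_k, long long second_sample_k)
{
  const double diff_k = static_cast<double>(first_sample_k - second_sample_k);
  return std::fabs(diff_k) / (1024.0 * 1024.0);
}
```

Under those assumptions, delta_gb(6653660, 5615440) is about 0.99 and 
delta_gb(7628896, 7460828) is about 0.16, matching the figures quoted above.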

I'll have to try and instrument it a bit more carefully tomorrow.

--
John