>> We already serialize and fall back on Metis when we have more elements
>> than processors, and according to our code comments N_e<N_p is simply
>> an unsupported configuration for Parmetis... but I'm seeing crashes
>> (segfaults and/or double-frees) in quite a few other cases.  Running
>> adaptivity_ex1 with 1 through 20 elements on 1 through 15 processors
>> gives failures whenever num_proc: num_elem is in
>> 
>> 3:3
>> 4:4
>> 5:5,6,7,8
>> 6:6,7,8
>> 7:7,8,9
>> 8:8,9,10,11
>> 9:9,10,11,12
> ...
> 
> Sadly, we seem to be getting the very same failure pattern with the
> current ParMETIS.


A telling comment from G.K. circa 2004:

"How large is the graph? ParMetis seems to have problems with small graphs"

http://www-users.cs.umn.edu/~karypis/.discus/messages/16/36.html

While that didn't end up being the cause for the particular problem in that
thread, I also came across some known issues reported by the Zoltan
developers:

http://www.tddft.org/trac/octopus/browser/trunk/external_libs/zoltan/Known_Problems?rev=7107

which suggest that both METIS and ParMETIS are fragile when any partition
winds up with 0 objects.
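
To make the "0 objects" condition concrete, here is a minimal sketch of a
check for it. The helper name has_empty_partition and the plain vector of
partition ids are assumptions for illustration only; this is not libMesh,
METIS, or ParMETIS API.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of any library.
// part[i] is the partition id assigned to object i; ids are expected to
// lie in [0, n_parts). Returns true if any partition received no objects.
bool has_empty_partition(const std::vector<int>& part, int n_parts)
{
  // One zero-initialized counter per partition.
  std::vector<std::size_t> count(
      static_cast<std::size_t>(n_parts > 0 ? n_parts : 0));

  for (int p : part)
    if (p >= 0 && p < n_parts)
      ++count[static_cast<std::size_t>(p)];

  for (std::size_t c : count)
    if (c == 0)
      return true;

  return false;
}
```

A check like this could only flag a bad result after the fact; it would not
help if the partitioner crashes before returning.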

A more comprehensive workaround, though still a hack, may be to fall back
to METIS (or even some other partitioner) whenever NE/NP < 10, or whatever
threshold is reasonable for keeping us clear of this 0-objects-per-partition
problem.
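
A rough sketch of that fallback rule, just to pin down the arithmetic. The
names PartitionerChoice and choose_partitioner are made up for this example,
and the default threshold of 10 is only the guess from above, not a tested
value.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical names for illustration; not libMesh API.
enum class PartitionerChoice { SerialFallback, Parallel };

// Pick the serial fallback when the average number of elements per
// processor drops below min_elem_per_proc (assumed default: 10).
PartitionerChoice choose_partitioner(std::size_t n_elem,
                                     std::size_t n_proc,
                                     std::size_t min_elem_per_proc = 10)
{
  assert(n_proc > 0);

  // Same test as n_elem / n_proc < min_elem_per_proc, written without
  // division so integer truncation cannot change the answer.
  if (n_elem < min_elem_per_proc * n_proc)
    return PartitionerChoice::SerialFallback;

  return PartitionerChoice::Parallel;
}
```

With the default threshold, every failing case in the list above (for
example 8 elements on 5 processors) would take the serial path, at the cost
of also serializing many small cases that currently work.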

-Ben


_______________________________________________
Libmesh-devel mailing list
Libmesh-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/libmesh-devel
