>> We already serialize and fall back on Metis when we have more elements
>> than processors, and according to our code comments N_e<N_p is simply
>> an unsupported configuration for Parmetis... but I'm seeing crashes
>> (segfaults and/or double-frees) in quite a few other cases. Running
>> adaptivity_ex1 with 1 through 20 elements on 1 through 15 processors
>> gives failures whenever num_proc: num_elem is in
>>
>> 3:3
>> 4:4
>> 5:5,6,7,8
>> 6:6,7,8
>> 7:7,8,9
>> 8:8,9,10,11
>> 9:9,10,11,12
> ...
>
> Sadly, we seem to be getting the very same failure pattern with the
> current ParMETIS.

A telling comment from G.K. circa 2004:

"How large is the graph? ParMetis seems to have problems with small graphs"
http://www-users.cs.umn.edu/~karypis/.discus/messages/16/36.html

While that didn't end up being the cause of the particular problem in
that thread, I also came across some known issues reported by the Zoltan
developers:

http://www.tddft.org/trac/octopus/browser/trunk/external_libs/zoltan/Known_Problems?rev=7107

which suggest that Metis and ParMETIS are both fragile when any partition
winds up with 0 objects.

A more comprehensive workaround, though still a hackish one, may be to
fall back to Metis (or even something else) whenever NE/NP < 10, or some
other reasonable threshold above which we would expect to avoid this
0-objects-per-partition problem.

-Ben

_______________________________________________
Libmesh-devel mailing list
Libmesh-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/libmesh-devel
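
For concreteness, the NE/NP fallback suggested in the message could look roughly like the sketch below. This is only an illustration, not libMesh code: the names PartitionerChoice, min_elem_per_proc, and choose_partitioner are hypothetical, the threshold of 10 is just the figure floated above, and treating a single processor as "serial" is an added assumption.

```cpp
#include <cstddef>

// Hypothetical names for illustration; none of these are libMesh API.
enum class PartitionerChoice { Serial, Parallel };

// Assumed threshold, taken from the "NE/NP < 10" suggestion; tune as needed.
constexpr std::size_t min_elem_per_proc = 10;

// Pick the serial partitioner whenever there are too few elements per
// processor for every partition to be reasonably sure of getting some.
// The test n_elem < threshold * n_proc is equivalent to NE/NP < threshold
// but avoids integer-division truncation.
constexpr PartitionerChoice choose_partitioner(std::size_t n_elem,
                                               std::size_t n_proc)
{
  return (n_proc <= 1 || n_elem < min_elem_per_proc * n_proc)
           ? PartitionerChoice::Serial
           : PartitionerChoice::Parallel;
}

// Two of the failing num_proc:num_elem pairs reported above (5:8 and 9:12)
// would be routed to the serial partitioner under this rule.
static_assert(choose_partitioner(8, 5) == PartitionerChoice::Serial,
              "5 procs, 8 elems should fall back");
static_assert(choose_partitioner(12, 9) == PartitionerChoice::Serial,
              "9 procs, 12 elems should fall back");
```

With this threshold every failing case in the list above falls on the serial side, though whether 10 is the right cutoff is untested here.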