On 11/01/2012 03:33 PM, Roy Stogner wrote:
>
> On Thu, 1 Nov 2012, David Knezevic wrote:
>
>> I just checked in a change, I believe this is fixed now --- at least
>> reduced_basis_ex6 works with --enable-parmesh for me now in r6280.
>>
>> But I'd appreciate if you could have a look at the diff.
>
> That was fast work!
>
> I'm not sure it's correct, though. What happens when a function
> evaluation point falls on a ghost element? Both the owning processor
> and its neighbor will be able to evaluate it, they'll both think that
> they're the root_id, the sum(root_id) will give you something
> completely different, and then the broadcast will send out an
> uninitialized Number. So instead of a segfault or an assertion
> failure you get an incorrect output, arguably a *much worse* problem!
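To make the failure mode concrete, here is a minimal serial sketch that emulates the reduction without MPI. It is not libMesh code: the `Rank` struct and the `root_by_sum` / `root_by_max` helpers are hypothetical names made up for illustration, and it assumes that a processor which cannot evaluate the point contributes 0 to the reduction.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Hypothetical stand-in for one MPI rank: its id, and whether it can
// evaluate the point (true for the owner and for any rank that holds
// the containing element as a ghost).
struct Rank
{
  unsigned int id;
  bool can_evaluate;
};

// Assumption: each rank contributes its own id if it can evaluate the
// point, and 0 otherwise.
std::vector<unsigned int> contributions(const std::vector<Rank> & ranks)
{
  std::vector<unsigned int> out;
  for (const Rank & r : ranks)
    out.push_back(r.can_evaluate ? r.id : 0u);
  return out;
}

// Emulates a sum reduction over all ranks. The result is a valid
// root_id only when exactly one rank contributes a nonzero id.
unsigned int root_by_sum(const std::vector<Rank> & ranks)
{
  const std::vector<unsigned int> c = contributions(ranks);
  return std::accumulate(c.begin(), c.end(), 0u);
}

// Emulates a max reduction over all ranks. The result is always the id
// of one of the ranks that can evaluate the point.
unsigned int root_by_max(const std::vector<Rank> & ranks)
{
  const std::vector<unsigned int> c = contributions(ranks);
  return c.empty() ? 0u : *std::max_element(c.begin(), c.end());
}
```

With four ranks where ranks 1 and 2 can both evaluate the point (owner plus ghost), the sum picks rank 3, which has nothing to broadcast, while the max picks rank 2, which does.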
Yes, good point!

>
>> In RBEIMConstruction::evaluate_mesh_function I use the mesh function
>> on a ghosted vector now, and broadcast the result. Incidentally, is
>> there a better way to get the root_id for the broadcast? (I used
>> Parallel::sum with libMesh::processor_id())
>
> Well, using max() instead of sum() ought to correct the failure case I
> described above. It won't be more efficient, though.
>
> I don't think there's a more efficient way to get the root_id
> for a single broadcast, but if you're worried about the efficiency
> then the thing to do is batch your broadcasts. When you're sending
> small enough messages that MPI is latency limited (and one Number per
> message is definitely in this regime) you can pack a lot more data
> into the send "for free". So you could have your
> evaluate_mesh_function test for ghost elements and return zero there,
> stuff all the local evaluate_mesh_function results into a single
> vector, and Parallel::sum the vector. Or you could loop over sending
> processors, and have each one broadcast a sparse-format version of its
> non-zero chunk of the vector in turn. I'm not sure which of those two
> options would be faster; it probably depends on how many processors
> you're on and how big your vector is.

This is all very good stuff. But I'm not too worried about this
efficiency bottleneck for now, so I'll just change sum() to max(). I'll
keep these optimizations in mind for later though, thanks!

David

_______________________________________________
Libmesh-devel mailing list
Libmesh-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/libmesh-devel
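For later reference, the first batching option from Roy's message (return zero on non-owned points, collect all results in one vector, sum the vector across processors) can be sketched serially as below. This is an illustration under stated assumptions rather than libMesh code: `local_results` and `elementwise_sum` are hypothetical helpers, the `owner` array stands in for whatever test decides that a point is owned locally and not merely ghosted, and every rank is assumed to hold a vector of the same length.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical: the vector one rank would fill in before the reduction.
// owner[i] is the rank that owns evaluation point i and value[i] is the
// function value there. Points this rank does not own (including points
// it only sees on ghost elements) are left at zero.
std::vector<double> local_results(unsigned int rank,
                                  const std::vector<unsigned int> & owner,
                                  const std::vector<double> & value)
{
  std::vector<double> out(owner.size(), 0.0);
  for (std::size_t i = 0; i < owner.size(); ++i)
    if (owner[i] == rank)
      out[i] = value[i];
  return out;
}

// Emulates a single elementwise sum reduction over the per-rank
// vectors. Assumes all vectors have the same length.
std::vector<double>
elementwise_sum(const std::vector<std::vector<double>> & per_rank)
{
  if (per_rank.empty())
    return {};

  std::vector<double> total(per_rank.front().size(), 0.0);
  for (const std::vector<double> & local : per_rank)
    for (std::size_t i = 0; i < total.size(); ++i)
      total[i] += local[i];
  return total;
}
```

Because exactly one rank writes a nonzero entry for each point, the single reduction leaves every rank with the full set of values, replacing one small broadcast per point with one message per reduction.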