On Fri, 20 Jul 2012, Kirk, Benjamin (JSC-EG311) wrote:

> On 7/20/12 3:17 PM, "Roy Stogner" <royst...@ices.utexas.edu> wrote:
>
>>> Why not print a trace to the screen when encountering an error and
>>> running serially, but falling back to the current trace files
>>> behavior when running in parallel?
>>
>> I can't *believe* I didn't think of that.
>
> Don't be so hard on yourself - based on my memory both you and John should
> still be asymptoting back to your pre-offspring sensibilities.

What's weird is that I don't feel as tired this time as I did when my
eldest was a month and a half old and I was afraid of falling asleep
on the commute to work... but if I actually count up all the stupid
things I've done in just the past couple weeks, it's significantly
worse than before; I'm forced to conclude that my intelligence has
fallen so low that it's not even competent at self-estimation anymore.

> Agree that's a great solution.

And much easier and safer than trying to make OStreamProxy MPI-aware.
I don't think there'd be any *real* obstacles making that solution
thread-safe and properly memory-managed, but there would certainly be
places we could slip up.
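
A minimal sketch of the serial-versus-parallel fallback agreed on above, under stated assumptions: sketch_write_trace, sketch_trace_filename, and the "traceout_<rank>.txt" naming are hypothetical illustrations, not libMesh's actual functions or file names, and the processor count and rank are passed in as plain integers instead of being queried from MPI.

```cpp
#include <cassert>
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical per-processor trace file name (illustrative naming only).
inline std::string sketch_trace_filename(unsigned int rank)
{
  std::ostringstream name;
  name << "traceout_" << rank << ".txt";
  return name.str();
}

// Write a stack trace to the screen in a serial run, or to a
// per-processor file in a parallel run.  Returns true if the trace
// went to the screen.
inline bool sketch_write_trace(const std::string & trace,
                               unsigned int n_processors,
                               unsigned int rank,
                               std::ostream & screen = std::cerr)
{
  assert(rank < n_processors);

  if (n_processors == 1)
    {
      // Serial: only one process is writing, so output cannot interleave.
      screen << trace << std::endl;
      return true;
    }

  // Parallel: keep each processor's trace in its own file so that
  // output from different ranks does not get mixed together.
  std::ofstream out(sketch_trace_filename(rank).c_str());
  out << trace << std::endl;
  return false;
}
```

Nothing here needs the output stream itself to know about MPI; the only decision is made once, at the point where the trace is written.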

On the subject of sleep-deprivation-stupidity and tricky memory
management: if anyone wants to code review the Parallel:: tricks I
added a couple weeks ago, I would appreciate it.

Parallel::Request::add_post_wait_work() is for allowing arbitrary
callback functors to be attached to a Request object so that the
wait() can do things like cleaning up temporary buffers after an
asynchronous send.  We had a nasty race condition there before.
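
The post-wait callback idea can be sketched roughly as follows. This is an illustration only: SketchRequest and sketch_send are hypothetical stand-ins, not the real Parallel::Request, no MPI call is made, and std::function (C++11) is assumed in place of whatever functor type the library actually uses.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a nonblocking communication handle.
class SketchRequest
{
public:
  SketchRequest() = default;
  SketchRequest(SketchRequest &&) = default;
  // Move-only, so queued cleanup work can never be run by two copies.
  SketchRequest(const SketchRequest &) = delete;

  // Queue work to run once the operation is known to have completed.
  void add_post_wait_work(std::function<void()> work)
  {
    assert(work); // an empty callback would be a caller bug
    post_wait_work_.push_back(std::move(work));
  }

  // A real implementation would first block on the underlying
  // operation (e.g. MPI_Wait); this sketch only runs the callbacks.
  void wait()
  {
    // Swap the list out first so each callback runs exactly once,
    // even if wait() is called again later.
    std::vector<std::function<void()>> pending;
    pending.swap(post_wait_work_);
    for (auto & work : pending)
      work();
  }

private:
  std::vector<std::function<void()>> post_wait_work_;
};

// Simulated asynchronous send: the temporary buffer must stay alive
// until the send completes, so its deletion is deferred to wait().
inline SketchRequest sketch_send(const std::vector<int> & data,
                                 int & live_buffers)
{
  std::vector<int> * buffer = new std::vector<int>(data);
  ++live_buffers;

  SketchRequest req;
  req.add_post_wait_work([buffer, &live_buffers]()
                         { delete buffer; --live_buffers; });
  return req;
}
```

Tying the cleanup to wait() means the buffer cannot be freed while the send might still be reading from it, which is the kind of race described above.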

For the other new code (which is also now on a critical path:
MeshInput of a serial format in a parallel run) search parallel.h and
mesh_communication.C for "packed_range", and see the specializations
in packed_node.C and packed_elem.C and the utility class in
mesh_inserter_iterator.h.  Basically this let us shave mesh broadcasts
down to 50 lines with sexy code like:

   for (unsigned int l=0; l != n_levels; ++l)
     Parallel::broadcast_packed_range(&mesh,
                                      mesh.level_elements_begin(l),
                                      mesh.level_elements_end(l),
                                      &mesh,
                                      mesh_inserter_iterator<Elem>(mesh));

which handles communicating everything from boundary conditions to
neighbor topology to element DoF indices, using generic code that
ought to be easily extensible to other variable-size data types too.
I left PackedNode and PackedElem around for backwards compatibility,
but they could be eliminated if/when the Idaho folks want to use
packed_range based code instead.
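
For anyone unfamiliar with the general technique, here is a toy sketch of packing a range of variable-size objects into one flat buffer and unpacking them through an output iterator. It is not the libMesh implementation: SketchItem, sketch_pack_range, and sketch_unpack_range are hypothetical names, the size-header layout is an assumption made for illustration, and the actual communication step between pack and unpack is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical variable-size object: just a list of ints.
typedef std::vector<int> SketchItem;

// Flatten a range of items into one buffer; each item is stored as
// a size header followed by its data.
template <typename InputIt>
std::vector<int> sketch_pack_range(InputIt begin, InputIt end)
{
  std::vector<int> buffer;
  for (; begin != end; ++begin)
    {
      buffer.push_back(static_cast<int>(begin->size()));
      buffer.insert(buffer.end(), begin->begin(), begin->end());
    }
  return buffer;
}

// Rebuild items from a flat buffer and hand each one to an output
// iterator, which decides where the item ends up.
template <typename OutputIt>
void sketch_unpack_range(const std::vector<int> & buffer, OutputIt out)
{
  std::size_t pos = 0;
  while (pos < buffer.size())
    {
      const std::size_t n = static_cast<std::size_t>(buffer[pos++]);
      assert(pos + n <= buffer.size()); // header must not overrun the buffer

      const std::vector<int>::const_iterator first =
        buffer.begin() + static_cast<std::ptrdiff_t>(pos);
      const std::vector<int>::const_iterator last =
        first + static_cast<std::ptrdiff_t>(n);

      *out++ = SketchItem(first, last);
      pos += n;
    }
}
```

Passing std::back_inserter(some_vector) as the output iterator plays a role loosely analogous to mesh_inserter_iterator in the snippet above: the unpacking code stays generic and the iterator handles insertion into the destination container.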

In hindsight I should have posted such a significant patch to
libmesh-devel before committing, despite it passing my tests.  It was
originally only going to be affecting distributed ParallelMesh code
paths, and after realizing that it could greatly simplify SerialMesh
communication too I forgot to reconsider the question of whether or
not I should get additional eyes on it.
---
Roy
