On Fri, 22 Oct 2010, John Peterson wrote:
On Fri, Oct 22, 2010 at 2:46 AM, Tim Kroeger
<[email protected]> wrote:
On Thu, 21 Oct 2010, Roy Stogner wrote:
It looks like there's a deadlock in here in parallel? You're doing
blocking sends followed by blocking receives? Probably need to switch
those (at least the sends, but optimally both) to non-blocking.
Good point.  It worked for me, but apparently I've just been lucky.
I didn't find any place in the library where non-blocking communication
is used so far, so could you please double-check that I have now done
it correctly?
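
For reference, here is my understanding of the hazard and of the fix,
as a minimal raw-MPI sketch (hypothetical neighbor and buffer names;
not the libMesh Parallel wrappers):

#include <mpi.h>
#include <vector>

void exchange_with(int neighbor,
                   std::vector<double> & sendbuf,
                   std::vector<double> & recvbuf)
{
  // Deadlock-prone pattern: if the messages are too large for MPI's
  // internal buffering, both ranks block in MPI_Send and neither
  // ever reaches its MPI_Recv:
  //
  //   MPI_Send(sendbuf.data(), ..., neighbor, ...);
  //   MPI_Recv(recvbuf.data(), ..., neighbor, ...);

  // Safe pattern: start the send without blocking, complete the
  // receive, then wait for the send to finish.
  MPI_Request request;
  MPI_Isend(sendbuf.data(), (int)sendbuf.size(), MPI_DOUBLE,
            neighbor, /* tag = */ 0, MPI_COMM_WORLD, &request);
  MPI_Recv(recvbuf.data(), (int)recvbuf.size(), MPI_DOUBLE,
           neighbor, /* tag = */ 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
  MPI_Wait(&request, MPI_STATUS_IGNORE);
}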
It's difficult to search for them since we've named them both "send"
in our C++ MPI interface, but there is an example of a non-blocking
send in mesh_communication.C, line 333.
I see; I was searching for "nonblocking_send", which is also defined
in the library but obviously not used.  I hadn't noticed that there is
also a non-blocking variant that is simply named "send".
(BTW: Sorry for the delay; a lot of different work popped up in the
meantime...)
I don't think that non-blocking receives would gain much performance here.
One paradigm is to post all the non-blocking receives before doing all
of the non-blocking sends.  I guess the idea is that all the processors
are then "ready" to receive their message(s), no matter in what order
the non-blocking sends actually complete.
Good idea, but I see that this isn't done at the point you mentioned
either, is it?  I guess it's not easy to do when you don't know in
advance how much data you're going to receive.
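
For the case where the message sizes are known in advance, my
understanding of the paradigm would look something like the following
raw-MPI sketch (hypothetical names; not the libMesh interface):

#include <mpi.h>
#include <vector>

void exchange_all(const std::vector<int> & neighbors,
                  std::vector<std::vector<double> > & sendbufs,
                  std::vector<std::vector<double> > & recvbufs)
{
  const std::size_t n = neighbors.size();
  std::vector<MPI_Request> requests(2 * n);

  // Post every receive before any send, so that each incoming
  // message finds a matching receive already waiting, regardless of
  // the order in which the sends complete.
  for (std::size_t i = 0; i != n; ++i)
    MPI_Irecv(recvbufs[i].data(), (int)recvbufs[i].size(), MPI_DOUBLE,
              neighbors[i], /* tag = */ 0, MPI_COMM_WORLD, &requests[i]);

  // Then start all the sends.
  for (std::size_t i = 0; i != n; ++i)
    MPI_Isend(sendbufs[i].data(), (int)sendbufs[i].size(), MPI_DOUBLE,
              neighbors[i], /* tag = */ 0, MPI_COMM_WORLD,
              &requests[n + i]);

  // Complete everything before the buffers can go out of scope.
  MPI_Waitall((int)requests.size(), requests.data(),
              MPI_STATUSES_IGNORE);
}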
Also, in the code that you mentioned, something else happens which I
think is dangerous: you have a

std::vector<Parallel::Request> node_send_requests;

(and a number of similar vectors), and then, each time you want to
send something, you do

node_send_requests.push_back(Parallel::Request());
Parallel::send(...,node_send_requests.back(),...);
While this looks perfectly all right, I noticed that there is a subtle
problem with it (I tried to do the same thing in my code, found that it
didn't work, and traced the failure back to this): push_back() may have
to re-allocate memory and copy all elements.  In that case, for every
previous request, the copy constructor is called and then the
destructor.  But the destructor calls MPI_Request_free(), which leaves
each copy holding an invalid handle.
The easiest way I can think of to avoid that problem is to fill the
vector with empty requests *before* you actually send anything, and
then leave the vector length fixed.  More sophisticated (and cleaner)
solutions are conceivable, though.
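
To make the hazard and the fixed-length fix concrete, here is a
self-contained sketch using a stand-in wrapper class (not the actual
Parallel::Request implementation):

#include <mpi.h>
#include <vector>

// Stand-in for a request wrapper whose destructor frees the
// underlying MPI handle.  It defines no copy constructor of its own,
// so the compiler-generated one just copies the raw handle.
struct Request
{
  Request() : _request(MPI_REQUEST_NULL) {}
  ~Request()
  {
    // This is what bites during push_back(): on reallocation the old
    // elements are copied and then destroyed, and this destructor
    // frees the handle that the freshly made copies still refer to.
    if (_request != MPI_REQUEST_NULL)
      MPI_Request_free(&_request);
  }
  MPI_Request _request;
};

void send_all(std::vector<std::vector<double> > & buffers,
              const std::vector<int> & destinations)
{
  // The fix: size the vector once, up front, so that no reallocation
  // (and hence no copy/destroy cycle) can ever occur.
  std::vector<Request> requests(buffers.size());

  for (std::size_t i = 0; i != buffers.size(); ++i)
    MPI_Isend(buffers[i].data(), (int)buffers[i].size(), MPI_DOUBLE,
              destinations[i], /* tag = */ 0, MPI_COMM_WORLD,
              &requests[i]._request);

  // ... post the receives here, then wait on all requests before the
  // send buffers go out of scope ...
}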
Let me know if I missed some reason why this problem cannot occur.
In practice, it
probably doesn't make much difference, but it's always nice to avoid
potential deadlocks by using non-blocking versions of the
communication routines.
Yes, but non-blocking sends should be enough to avoid deadlocks.
I'll leave my code unchanged for now (non-blocking sends, blocking
receives) until this turns out to be a bottleneck.  However, if
somebody else wants to improve it, I wouldn't object, of course (once
it has been checked in).
Best Regards,
Tim
--
Dr. Tim Kroeger
CeVis -- Center of Complex Systems and Visualization
University of Bremen [email protected]
Universitaetsallee 29 [email protected]
D-28359 Bremen Phone +49-421-218-7710
Germany Fax +49-421-218-4236