On Sep 3, 2010, at 12:16 AM, Ralph Castain wrote:

> Backing off the polling rate requires more application-specific logic like 
> that offered below, so it is a little difficult for us to implement at the 
> MPI library level. Not saying we eventually won't - just not sure anyone 
> quite knows how to do so in a generalized form.

FWIW, we've *talked* about this kind of stuff among the developers -- it's at 
least somewhat similar to the "back off to blocking communications instead of 
polling communications" issue.  That work in particular has been discussed for 
a long time but never implemented.
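
For illustration only, here is a minimal sketch of what application-level polling backoff can look like. Nothing in it is Open MPI code: `poll_with_backoff`, `ready_fn`, and `countdown_ready` are hypothetical names, and the delay values are arbitrary. In a real MPI program the callback would typically wrap a non-blocking completion check such as MPI_Test; a plain counter stands in for it here so the sketch builds without MPI.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* for nanosleep() */
#endif
#include <stdbool.h>
#include <time.h>

/* Hypothetical callback type: returns true once the awaited event has
 * happened. In an MPI application this might call MPI_Test on a request. */
typedef bool (*ready_fn)(void *ctx);

/* Sleep for `us` microseconds. */
static void sleep_us(long us)
{
    struct timespec ts;
    ts.tv_sec  = us / 1000000L;
    ts.tv_nsec = (us % 1000000L) * 1000L;
    nanosleep(&ts, NULL);
}

/* Poll `ready` with exponentially growing sleeps between polls.
 * The delay starts at first_us, doubles after each miss, and is capped
 * at max_us. Returns the number of polls made when `ready` succeeds, or
 * -1 once at least budget_us microseconds of sleep have been requested
 * without success. */
int poll_with_backoff(ready_fn ready, void *ctx,
                      long first_us, long max_us, long budget_us)
{
    long delay = first_us;
    long slept = 0;
    int polls = 0;

    for (;;) {
        polls++;
        if (ready(ctx))
            return polls;
        if (slept >= budget_us)
            return -1;
        sleep_us(delay);
        slept += delay;
        delay *= 2;
        if (delay > max_us)
            delay = max_us;
    }
}

/* Stand-in for a real completion check: reports "ready" on the poll
 * that brings the counter pointed to by ctx down to zero. */
bool countdown_ready(void *ctx)
{
    int *remaining = ctx;
    if (*remaining > 0)
        (*remaining)--;
    return *remaining == 0;
}
```

The hard part for a library is choosing first_us, max_us, and the budget: sensible values depend on how latency-sensitive the application is, which is why this kind of logic tends to end up in application code.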

Are your jobs hanging because of deadlock (i.e., an application error), or 
because of an infrastructure error?  If they're hanging because of deadlock, 
there are some PMPI-based tools that might be able to help.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/