On Sep 3, 2010, at 12:16 AM, Ralph Castain wrote:

> Backing off the polling rate requires more application-specific logic like
> that offered below, so it is a little difficult for us to implement at the
> MPI library level. Not saying we eventually won't - just not sure anyone
> quite knows how to do so in a generalized form.

FWIW, we've *talked* about this kind of stuff among the developers -- it's at least somewhat similar to the "backoff to blocking communications instead of polling communications" issues. That work in particular has been discussed for a long time but never implemented.

Are your jobs hanging because of deadlock (i.e., application error), or infrastructure error? If they're hanging because of deadlock, there are some PMPI-based tools that might be able to help.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
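
For readers following the thread, here is a rough sketch of what application-level polling backoff could look like. It is not Open MPI code and not anything the library provides: the function name `poll_with_backoff` and all of its parameters are made up for illustration. The idea is only that the application supplies its own "is it done yet?" check (in an MPI program that check might wrap a non-blocking test such as MPI_Test) and sleeps a little longer after each miss, up to a cap.

```python
import time


def poll_with_backoff(is_done, initial_delay=0.001, max_delay=0.1,
                      factor=2.0, timeout=None,
                      sleep=time.sleep, clock=time.monotonic):
    """Call is_done() until it returns True, backing off between polls.

    Hypothetical helper, not part of any MPI library. Returns the number
    of polls made; raises TimeoutError if `timeout` seconds elapse first.
    """
    delay = initial_delay
    polls = 0
    start = clock()
    while True:
        polls += 1
        if is_done():
            return polls
        if timeout is not None and clock() - start >= timeout:
            raise TimeoutError("condition not met after %d polls" % polls)
        # Give up the CPU instead of spinning, then lengthen the next wait.
        sleep(delay)
        delay = min(delay * factor, max_delay)
```

The right values for the initial delay, growth factor, and cap depend on how latency-sensitive the job is, which is probably part of why this is easier to tune per application than to generalize inside the library.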