On Fri, 9 Mar 2012, Jeffrey Squyres wrote:

On Mar 9, 2012, at 1:32 PM, Nathan Hjelm wrote:

An mpool that is aware of the local processes' LRUs will solve the problem in 
most cases (all that I have seen).

I agree -- don't let words in my emails make you think otherwise.  I think this will fix 
"most" problems, but undoubtedly, some will still occur.

What's your timeline for having this ready -- should it go to 1.5.5, or 1.6?

More specifically: if it's imminent, and can go to v1.5, then the openib 
message is irrelevant and should not be used (and backed out of the trunk).  If 
it's going to take a little bit, I'm ok leaving the message in v1.5.5 for now.

I wrote the prototype yesterday (after finding that limiting the LRU doesn't 
work for uGNI: at 256 PEs we could only register ~1400 items instead of the 
3600 max we saw at 128). I should have a version ready for review next week 
and a final version by the end of the month.


BTW, can anyone tell me why each mpool defines mca_mpool_base_resources_t 
instead of defining mca_mpool_blah_resources_t? The current design makes it 
impossible to support more than one mpool in a btl. I can delete a bunch of 
code if I can make a btl fall back on the rdma mpool if leave_pinned is not set.

-Nathan
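
The naming question in the last paragraph above can be sketched roughly as
follows. This is only an illustration under stated assumptions, not Open MPI
code: the struct fields, the "lru" mpool name, and every example_* helper are
hypothetical. It shows that once each mpool has its own resources type
(following the mca_mpool_blah_resources_t pattern), a single btl source file
can see both definitions and choose between them based on leave_pinned.

```c
#include <assert.h>
#include <stddef.h>

/* All names and fields below are hypothetical illustrations,
 * not definitions taken from the Open MPI source tree. */

/* Resources for an "rdma" mpool, named per component. */
typedef struct mca_mpool_rdma_resources_t {
    size_t sizeof_reg;          /* size of one registration descriptor */
} mca_mpool_rdma_resources_t;

/* Resources for a hypothetical LRU-aware mpool. */
typedef struct mca_mpool_lru_resources_t {
    size_t sizeof_reg;          /* size of one registration descriptor */
    size_t max_registrations;   /* assumed cap on cached registrations */
} mca_mpool_lru_resources_t;

typedef enum { EXAMPLE_MPOOL_RDMA, EXAMPLE_MPOOL_LRU } example_mpool_kind_t;

/* Assumed policy: fall back on the rdma mpool unless leave_pinned is set. */
static example_mpool_kind_t example_select_mpool(int leave_pinned)
{
    return leave_pinned ? EXAMPLE_MPOOL_LRU : EXAMPLE_MPOOL_RDMA;
}

/* Both types are visible in one translation unit, which would not compile
 * if each mpool header defined the same struct name. */
static size_t example_resources_size(example_mpool_kind_t kind)
{
    assert(kind == EXAMPLE_MPOOL_RDMA || kind == EXAMPLE_MPOOL_LRU);
    return kind == EXAMPLE_MPOOL_LRU ? sizeof(mca_mpool_lru_resources_t)
                                     : sizeof(mca_mpool_rdma_resources_t);
}
```

With a shared struct name, including two mpool headers in the same btl source
file would produce a redefinition error, which is the limitation the
paragraph describes.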
