On Dec 13, 2010, at 4:22 PM, David Singleton wrote:

> I didn't see memory binding in there explicitly.

You're correct; sorry, I was just referring to some general slides that showed 
some of the ideas that we're working on for next-generation affinity stuff.  
But memory binding will be included as well.

>> What OS and libnuma version are you running?  It has been my experience that 
>> libnuma can lie on RHEL 5 and earlier.  My (possibly flawed) understanding 
>> is that this is because of lack of proper kernel support; such "proper" 
>> kernel support was only added fairly recently (2.6.30something).
> 
> That's interesting.  By "lie", do you mean processes are not really memory 
> bound?

I mean that even when using a strict memory binding policy, if you numa_alloc* 
on node X, you can get memory on node Y.
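
One way to check for this is to ask the kernel which node a page actually 
landed on after it has been touched.  The following is only a minimal sketch, 
not what Open MPI or libnuma does internally: page_node() and landed_on() are 
hypothetical helper names, the flag values are copied from 
<linux/mempolicy.h> so that it builds without libnuma, and it assumes a Linux 
kernel that exposes the get_mempolicy system call.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Flag values as defined in <linux/mempolicy.h>; repeated here so the
 * sketch builds without libnuma's <numaif.h>. */
#define SKETCH_MPOL_F_NODE (1UL << 0)
#define SKETCH_MPOL_F_ADDR (1UL << 1)

/* Hypothetical helper: return the NUMA node that holds the page at addr,
 * or -1 if the kernel cannot or will not tell us (no NUMA support,
 * syscall filtered, bad address, ...). */
static int page_node(void *addr)
{
    int node = -1;
    long rc;

    assert(addr != NULL);
    /* With MPOL_F_NODE | MPOL_F_ADDR the "mode" output argument receives
     * the ID of the node on which addr is allocated. */
    rc = syscall(SYS_get_mempolicy, &node, NULL, 0UL, addr,
                 SKETCH_MPOL_F_NODE | SKETCH_MPOL_F_ADDR);
    return rc == 0 ? node : -1;
}

/* Hypothetical helper: 1 if the page is on wanted_node, 0 if it ended up
 * on some other node, -1 if the placement could not be determined. */
static int landed_on(void *addr, int wanted_node)
{
    int node = page_node(addr);

    if (node < 0) {
        return -1;
    }
    return node == wanted_node ? 1 : 0;
}
```

The idea would be to numa_alloc* a buffer on node X, write to it so that the 
pages really exist, and then call landed_on(buf, X); a result of 0 is the 
"lie" described above.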

> We're running 2.6.27.55 (and numactl 0.9.8-11.el5) and I've done quite a bit 
> of testing that always looks correct.

That could well be.

On RHEL 5 (2.6.18 and numactl-0.9.8), the above "bad" behavior happens.  With 
RHEL 6 (2.6.32 and numactl-2.0.3), it seems to be correct.  Where exactly the 
issue was fixed, I'm not entirely sure.

>> That aside, it's somewhat disappointing that MPOL_PREFERRED is not working 
>> well and that you had to switch to MPOL_BIND.  :-(
> 
> I'm not sure it's disappointing - I think it's just to be expected.  For 
> sites that drop caches or run a whole node memhog or reboot nodes between 
> jobs, MPOL_PREFERRED will do the right thing.  For sites that are not so 
> careful or use suspend/resume scheduling, memory overcommits and some amount 
> of page reclaim or paging on job startup will happen occasionally.  Paying 
> the extra cost of making sure that page reclaim or paging results in ideal 
> locality is definitely a big win for a job overall.  (Paging suspended jobs 
> back in after they are resumed can undo some of their ideal placement but 
> that can be handled.)

Fair enough.

>> Should we add an MCA parameter to switch between BIND and PREFERRED, and 
>> perhaps default to BIND?
> 
> I'm not sure BIND should be the default for everyone - memory imbalanced 
> jobs might page badly in this case.  But, yes, we would like an MCA 
> parameter to choose and allow sites to select BIND as their default if they 
> wish.  An mpirun option like --bind-to-mem would need a preferred/affinity 
> alternative and I'm not sure of a nice notation/syntax for that.

How about:

  --mca maffinity_libnuma_policy bind|preferred

I can do that for the v1.5 series, if you'd like.  I can't really do it for 
v1.4 because that series is in "bug fix only" mode.  However, given that we're 
revamping all of our affinity support, I don't know what the future interface 
will look like -- so the name may change, or ...
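
To make the proposal a bit more concrete, the two values could map onto the 
kernel memory policy modes roughly as follows.  This is only an illustrative 
sketch, not existing Open MPI code: parse_policy() is a hypothetical name, and 
the mode values are copied from <linux/mempolicy.h> so that it builds without 
libnuma.

```c
#include <assert.h>
#include <string.h>

/* Mode values as defined in <linux/mempolicy.h>. */
enum { SKETCH_MPOL_PREFERRED = 1, SKETCH_MPOL_BIND = 2 };

/* Hypothetical parser for the proposed maffinity_libnuma_policy value.
 * Returns the matching memory policy mode, or -1 for an unknown string. */
static int parse_policy(const char *value)
{
    assert(value != NULL);
    if (strcmp(value, "bind") == 0) {
        return SKETCH_MPOL_BIND;
    }
    if (strcmp(value, "preferred") == 0) {
        return SKETCH_MPOL_PREFERRED;
    }
    return -1;
}
```

Rejecting unknown strings with -1, rather than silently falling back to one 
of the two modes, leaves the choice of default to the caller.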

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/

