On Dec 13, 2010, at 4:22 PM, David Singleton wrote:

> I didn't see memory binding in there explicitly.

You're correct; sorry, I was just referring to some general slides that showed some of the ideas that we're working on for next-generation affinity stuff.  But memory binding will be included as well.

>> What OS and libnuma version are you running?  It has been my experience that
>> libnuma can lie on RHEL 5 and earlier.  My (possibly flawed) understanding
>> is that this is because of lack of proper kernel support; such "proper"
>> kernel support was only added fairly recently (2.6.30something).
>
> That's interesting.  By "lie", do you mean processes are not really memory
> bound?

I mean that even when using a strict memory binding policy, if you numa_alloc* on node X, you can get memory on node Y.

> We're running 2.6.27.55 (and numactl 0.9.8-11.el5) and I've done quite a bit
> of testing that always looks correct.

That could well be.  On RHEL 5 (2.6.18 and numactl-0.9.8), the above "bad" behavior happens.  With RHEL 6 (2.6.32 and numactl-2.0.3), it seems to be correct.  Where exactly the issue was fixed, I'm not entirely sure.

>> That aside, it's somewhat disappointing that MPOL_PREFERRED is not working
>> well and that you had to switch to MPOL_BIND.  :-(
>
> I'm not sure it's disappointing - I think it's just to be expected.  For
> sites that drop caches or run a whole node memhog or reboot nodes between
> jobs, MPOL_PREFERRED will do the right thing.  For sites that are not so
> careful or use suspend/resume scheduling, memory overcommits and some amount
> of page reclaim or paging on job startup will happen occasionally.  Paying
> the extra cost of making sure that page reclaim or paging results in ideal
> locality is definitely a big win for a job overall.  (Paging suspended jobs
> back in after they are resumed can undo some of their ideal placement but
> that can be handled.)

Fair enough.

>> Should we add an MCA parameter to switch between BIND and PREFERRED, and
>> perhaps default to BIND?
>
> I'm not sure BIND should be the default for everyone - memory imbalanced
> jobs might page badly in this case.  But, yes, we would like an MCA to
> choose and allow sites to select BIND as their default if they wish.  An
> mpirun option like --bind-to-mem would need a preferred/affinity alternative
> and I'm not sure of a nice notation/syntax for that.

How about:

    --mca maffinity_libnuma_policy bind|preferred

I can do that for the v1.5 series, if you'd like.  I can't really do it for v1.4 because that series is in "bug fix only" mode.

However, given that we're revamping all of our affinity support, I don't know what the future interface will look like -- so the name may change, or ...

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
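
A side note on the placement question discussed above: one way to see whether strictly bound memory really ended up on the requested node is to read the per-node page counts that Linux reports in /proc/<pid>/numa_maps (fields of the form N<node>=<pages>; see numa(7)). The following is only a minimal Python sketch, not anything taken from Open MPI or libnuma. The function names and the SAMPLE text are made up for illustration, and the sample numbers are not measurements.

```python
import re
from collections import Counter

# A per-node page count field in a numa_maps line, e.g. "N0=500".
_NODE_PAGES = re.compile(r"^N(\d+)=(\d+)$")
# A strict-binding policy field, e.g. "bind:0", "bind:0-1,3" or "bind=static:0".
_BIND = re.compile(r"^bind(?:=\w+)?:([\d,\-]+)$")


def parse_nodelist(spec):
    """Expand a node list such as "0-1,3" into the set {0, 1, 3}."""
    nodes = set()
    for part in spec.split(","):
        lo, _, hi = part.partition("-")
        nodes.update(range(int(lo), int(hi or lo) + 1))
    return nodes


def misplaced_pages(numa_maps_text):
    """Return {node: pages} for pages that lie outside a mapping's bind set.

    Only mappings whose policy field is "bind" are checked; "default",
    "prefer" and "interleave" mappings are skipped, since they are allowed
    to place pages on other nodes.
    """
    stray = Counter()
    for line in numa_maps_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        bound = _BIND.match(fields[1])
        if not bound:
            continue
        allowed = parse_nodelist(bound.group(1))
        for field in fields[2:]:
            m = _NODE_PAGES.match(field)
            if m and int(m.group(1)) not in allowed:
                stray[int(m.group(1))] += int(m.group(2))
    return dict(stray)


# Hypothetical numa_maps content, invented for this example.
SAMPLE = """\
00400000 default file=/usr/bin/app mapped=2 N0=2 kernelpagesize_kB=4
7f3c00000000 bind:0 anon=512 dirty=512 N0=500 N1=12 kernelpagesize_kB=4
7f3c10000000 prefer:1 anon=8 dirty=8 N0=8 kernelpagesize_kB=4
"""

print(misplaced_pages(SAMPLE))  # {1: 12}
```

To inspect a live process, pass the contents of /proc/<pid>/numa_maps instead of SAMPLE; the file is only present on NUMA-enabled kernels. A non-empty result for a bind mapping would be consistent with the "node X requested, node Y received" behavior described above, though pages that were touched before the policy was set can also show up there.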