The doc is wrong: flags are used, but only for HWLOC_MEMBIND_BYNODESET. I fixed that in git very recently.
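For reference, a minimal sketch of that usage (mine, not taken from the hwloc docs or from the thread), assuming an hwloc version that provides HWLOC_MEMBIND_BYNODESET; the buffer size and setup are invented for illustration and error handling is abbreviated:

/* Query where the pages of a buffer are actually allocated,
 * treating the returned set as a nodeset. */
#include <hwloc.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    hwloc_topology_t topology;
    hwloc_topology_init(&topology);
    hwloc_topology_load(topology);

    size_t len = 16 * 1024 * 1024;            /* hypothetical buffer size */
    char *buf = hwloc_alloc(topology, len);   /* not bound to any particular node */
    memset(buf, 0, len);                      /* first touch backs the pages with physical memory */

    hwloc_bitmap_t nodeset = hwloc_bitmap_alloc();
    /* With HWLOC_MEMBIND_BYNODESET the returned set is a nodeset;
     * without the flag it would be a cpuset. */
    if (hwloc_get_area_memlocation(topology, buf, len, nodeset,
                                   HWLOC_MEMBIND_BYNODESET) == 0) {
        char *s;
        hwloc_bitmap_asprintf(&s, nodeset);
        printf("pages currently allocated on node(s) %s\n", s);
        free(s);
    }

    hwloc_bitmap_free(nodeset);
    hwloc_free(topology, buf, len);
    hwloc_topology_destroy(topology);
    return 0;
}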
Brice

On 13/11/2017 07:24, Biddiscombe, John A. wrote:
> In the documentation for get_area_memlocation it says
> "If HWLOC_MEMBIND_BYNODESET is specified, set is considered a nodeset.
> Otherwise it's a cpuset."
>
> but it also says "Flags are currently unused."
>
> So where should the BY_NODESET policy be used? Does it have to be used with
> the original alloc call?
>
> Thanks
>
> JB
>
> ________________________________________
> From: hwloc-users [hwloc-users-boun...@lists.open-mpi.org] on behalf of
> Biddiscombe, John A. [biddi...@cscs.ch]
> Sent: 13 November 2017 14:59
> To: Hardware locality user list
> Subject: Re: [hwloc-users] question about hwloc_set_area_membind_nodeset
>
> Brice
>
> Aha, thanks. I knew I'd seen a function for that, but couldn't remember what
> it was.
>
> Cheers
>
> JB
> ________________________________________
> From: hwloc-users [hwloc-users-boun...@lists.open-mpi.org] on behalf of
> Brice Goglin [brice.gog...@inria.fr]
> Sent: 13 November 2017 14:57
> To: Hardware locality user list
> Subject: Re: [hwloc-users] question about hwloc_set_area_membind_nodeset
>
> Use get_area_memlocation().
>
> membind() returns where the pages are *allowed* to go (anywhere).
> memlocation() returns where the pages are actually allocated.
>
> Brice
>
> On 13/11/2017 06:52, Biddiscombe, John A. wrote:
>> Thank you to you both.
>>
>> I modified the allocator to allocate one large block using hwloc_alloc and
>> then use one thread per NUMA domain to touch each page according to the
>> tiling pattern. Unfortunately, I hadn't appreciated that now
>> hwloc_get_area_membind_nodeset always returns the full machine NUMA mask,
>> and not the NUMA domain that the page was touched by (I guess it only
>> gives the expected answer when set_area_membind is used first).
>>
>> I had hoped to use a dynamic query of the pages (using the first one of a
>> given tile) to schedule each task that operates on a given tile to run on
>> the NUMA node that touched it.
>>
>> I can work around this by using a matrix offset calculation to get the
>> NUMA node, but if there's a way of querying the page directly, then please
>> let me know.
>>
>> Thanks
>>
>> JB
>> ________________________________________
>> From: hwloc-users [hwloc-users-boun...@lists.open-mpi.org] on behalf of
>> Samuel Thibault [samuel.thiba...@inria.fr]
>> Sent: 12 November 2017 10:48
>> To: Hardware locality user list
>> Subject: Re: [hwloc-users] question about hwloc_set_area_membind_nodeset
>>
>> Brice Goglin, on Sun 12 Nov 2017 05:19:37 +0100, wrote:
>>> That's likely what's happening. Each set_area() may be creating a new
>>> "virtual memory area". The kernel tries to merge them with neighbours if
>>> they go to the same NUMA node. Otherwise it creates a new VMA.
>> Mmmm, that sucks. Ideally we'd have a way to ask the kernel not to
>> strictly bind the memory, but just to allocate on a given memory
>> node, and just hope that the allocation will not go away (e.g. due to
>> swapping), which thus doesn't need a VMA to record the information. As
>> you describe below, first-touch achieves that but it's not necessarily
>> so convenient.
>>
>>> I can't find the exact limit but it's something like 64k so I guess
>>> you're exhausting that.
>> It's sysctl vm.max_map_count
>>
>>> Question 2: Is there a better way of achieving the result I'm looking for
>>> (such as a call to membind with a stride of some kind to say put N pages
>>> in a row on each domain in alternation)?
>>>
>>> Unfortunately, the interleave policy doesn't have a stride argument. It's
>>> one page on node 0, one page on node 1, etc.
>>>
>>> The only idea I have is to use the first-touch policy: make sure your
>>> buffer isn't in physical memory yet, and have a thread on node 0 read the
>>> "0" pages, and another thread on node 1 read the "1" pages.
>> Or "next-touch", if that was to ever get merged into mainline Linux :)
>>
>> Samuel
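To make the first-touch approach discussed above concrete, here is a rough sketch (not JB's actual allocator) that allocates one large block with hwloc_alloc, binds the current thread to each NUMA node in turn to touch that node's tiles, and then queries where the first page of each tile actually landed with hwloc_get_area_memlocation(). The tile size, tile count, and round-robin layout are invented for illustration:

#include <hwloc.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    hwloc_topology_t topo;
    hwloc_topology_init(&topo);
    hwloc_topology_load(topo);

    int nnodes = hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_NUMANODE);
    size_t page = (size_t) sysconf(_SC_PAGESIZE);
    size_t tile = 4 * page;                 /* hypothetical: 4 pages per tile */
    size_t ntiles = 64;
    size_t len = ntiles * tile;

    char *buf = hwloc_alloc(topo, len);     /* pages not yet bound to a node */

    /* First touch: the thread bound near node i touches every i-th tile,
     * so those pages get allocated on node i. */
    for (int i = 0; i < nnodes; i++) {
        hwloc_obj_t node = hwloc_get_obj_by_type(topo, HWLOC_OBJ_NUMANODE, i);
        hwloc_set_cpubind(topo, node->cpuset, HWLOC_CPUBIND_THREAD);
        for (size_t t = i; t < ntiles; t += (size_t) nnodes)
            memset(buf + t * tile, 0, tile);
    }

    /* Query where the first page of each tile actually lives. */
    hwloc_bitmap_t set = hwloc_bitmap_alloc();
    for (size_t t = 0; t < ntiles; t++) {
        if (hwloc_get_area_memlocation(topo, buf + t * tile, page, set,
                                       HWLOC_MEMBIND_BYNODESET) == 0)
            printf("tile %zu is on node %d\n", t, hwloc_bitmap_first(set));
    }

    hwloc_bitmap_free(set);
    hwloc_free(topo, buf, len);
    hwloc_topology_destroy(topo);
    return 0;
}

Since memlocation is reported per page rather than per VMA, this query avoids the vm.max_map_count exhaustion that repeated set_area_membind() calls can cause, at the cost of relying on first-touch rather than a strict binding.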