We are trying to improve the reliability of the libhugetlbfs malloc function by prefaulting the huge pages at allocation time. If we do this, we can guarantee (for apps which do not fork) that all of the pages are available, and we can fall back to normal pages if not enough huge pages are available.
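To make the prefault-and-fall-back idea concrete, here is a rough sketch in C. It is not the libhugetlbfs code. It assumes a kernel that supports the MAP_HUGETLB mmap flag and a 2 MiB huge page size, and the names HPAGE_SIZE, struct region, alloc_prefaulted and free_region are made up for illustration. It prefaults with mlock() and treats any failure as a reason to fall back to normal pages.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>

/* Assumption: 2 MiB huge pages. A real allocator would query the system. */
#define HPAGE_SIZE ((size_t)2 * 1024 * 1024)

/* Hypothetical descriptor for one allocation. */
struct region {
    void  *addr;  /* NULL if both attempts failed */
    size_t len;   /* mapped length, rounded up to HPAGE_SIZE */
    bool   huge;  /* true if backed by prefaulted huge pages */
};

/* Round len up to a multiple of HPAGE_SIZE (no overflow check here). */
static size_t round_up_hpage(size_t len)
{
    return (len + HPAGE_SIZE - 1) & ~(HPAGE_SIZE - 1);
}

static struct region alloc_prefaulted(size_t len)
{
    struct region r = { NULL, round_up_hpage(len), false };

    assert(len > 0);
    assert(r.len % HPAGE_SIZE == 0);

    /* First attempt: an anonymous huge page mapping. */
    void *p = mmap(NULL, r.len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (p != MAP_FAILED) {
        /*
         * mlock() faults in every page now, so a shortage shows up
         * here rather than as a fault on first touch. It can also
         * fail because of RLIMIT_MEMLOCK; this sketch treats any
         * failure the same way and falls back.
         */
        if (mlock(p, r.len) == 0) {
            /* Drop the lock; the pages stay instantiated. */
            munlock(p, r.len);
            r.addr = p;
            r.huge = true;
            return r;
        }
        munmap(p, r.len);
    }

    /* Fallback: normal pages, demand faulted as usual. */
    p = mmap(NULL, r.len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p != MAP_FAILED)
        r.addr = p;
    return r;
}

static void free_region(struct region r)
{
    if (r.addr != NULL)
        munmap(r.addr, r.len);
}
```

A caller can check r.huge to see which path was taken. Note that the pages end up wherever the current memory policy places them at the time of the mlock() call.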
The problem with prefaulting (via mlock) is that huge pages are preferentially taken from the NUMA node that called malloc, not from the nodes where the memory is first accessed. This leads to serious performance regressions, to the point where the application runs slower with huge pages than with normal pages. The solution, for these applications, is to use the NUMA API to set bind/interleave policies that make sense for the application in question.

My question for you is: What tuning scenarios would you require for an environment variable that controls this behavior? The two options we mulled over:

 1. Do nothing -- rely on the previously set NUMA policy (numactl).
    (This will generally exhaust all node-local huge pages first, then move on to other nodes.)
 2. Interleave on all nodes.

Would these be enough to cover the cases you see when running various workloads on NUMA systems? Which option should be the default?

My gut feel is that we should take option 1 above as the default. This will handle single-threaded apps that don't bounce around multiple NUMA nodes.

That still leaves one issue: applications that move among NUMA nodes unpredictably after starting up will always perform worse under this new algorithm than before, since demand faulting allows the memory to be instantiated node-local wherever the process happens to be running.

Ok, I am beginning to ramble. I'll leave it here for now. Hope this all makes sense, and if not, let me know.

--
Adam Litke - (agl at us.ibm.com)
IBM Linux Technology Center
_______________________________________________
Libhugetlbfs-devel mailing list
Libhugetlbfs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/libhugetlbfs-devel