On 10.02.2007 [17:41:04 -0600], Bill Buros wrote:
> >> eh? As a user, what does this do? It lets me skip the linking
> >> step?
> >
> >Sort of. So, if you're a user and you have a binary with a large BSS,
> >and maybe don't know if you want to try and relink the binary, but would
> >like to see what is possible, performance-wise, if you were to relink
> >the binary, this gives a rough approximation of that.
> >
> >Let's say for some 64-bit power binary the BSS is 300M, from F400000
> >(244M) to 22000000 (544M) in the address space. This code will do the
> >following:
> >
> >Leave the portion of the BSS from 244M-256M in small pages
> >Attempt to put the entire portion of the BSS from 256M to 512M in huge
> >pages
> >Leave the portion of the BSS from 512M-544M in small pages
>
> now that's clever. so that would allow a user to use a subset of the
> large pages in an executable without having to reserve ALL of the pages.
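
(To make the rounding above concrete, here is a rough sketch of the
arithmetic -- made-up ALIGN_UP/ALIGN_DOWN helpers and hard-coded numbers
from the example, not the actual library code:)

#include <stdio.h>

#define ALIGN_UP(x, a)   (((x) + (a) - 1) & ~((a) - 1))
#define ALIGN_DOWN(x, a) ((x) & ~((a) - 1))

int main(void)
{
        unsigned long bss_start = 0xF400000UL;  /* 244M */
        unsigned long bss_end   = 0x22000000UL; /* 544M */
        unsigned long gran      = 0x10000000UL; /* 256M region granularity */

        /* round the BSS start up and the BSS end down to region boundaries */
        unsigned long huge_start = ALIGN_UP(bss_start, gran);   /* -> 256M */
        unsigned long huge_end   = ALIGN_DOWN(bss_end, gran);   /* -> 512M */

        if (huge_end > huge_start)
                printf("huge pages: %luM-%luM, small-page tails: %luM-%luM and %luM-%luM\n",
                       huge_start >> 20, huge_end >> 20,
                       bss_start >> 20, huge_start >> 20,
                       huge_end >> 20, bss_end >> 20);
        else
                printf("BSS never covers a full region; leave it all in small pages\n");
        return 0;
}

With the 244M-544M BSS from the example, that picks 256M-512M for huge
pages and leaves the 244M-256M and 512M-544M tails in small pages.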
Right, it would of course depend on how the system is configured (# of
hugepages available) and how the binary was compiled originally (where the
BSS is in memory, how big it is, etc). There is the issue that in the
current code we will fault in all 256M of the region, since we don't know,
a la minimal_copy, what is uninitialized and what is not. This is of course
very raw code and only a first draft, and there may be something we can do
about this.

> All without recompiling.

Ideally :)

> >Now, I say *attempt* because as we know there are any number of reasons
> >for this to fail (insufficient huge pages, ...).
> >
> >In thinking about this, we may want to extend this functionality so that
> >a user can say -- don't truncate the top of the possibly huge page area,
> >go ahead and put as much in huge pages as possible. The reason this
> >would matter is if, in the previous example, rather than a 300M BSS we
> >had a 257M BSS. Since the BSS doesn't start *exactly* at a 256M region
> >boundary, we need to round up to find where we can start mapping in huge
> >pages, so that leaves us with 245M of BSS that can go in huge pages. But
> >rather than waste 11M of virtual addresses (the remaining part of the
> >next 256M region, if we were to successfully map in the huge pages), we
> >will see that the remaining BSS space is not large enough to cover the
> >full region and fail to do anything.
> >
> >But if the user is requesting that we try and use hugepages for a
> >non-relinked binary, maybe it should be best effort. Or, at a sheer
> >minimum, use up one region regardless of cost, but only use full regions
> >after that?
>
> yeah.. different variations could be possible then. In a "try me" mode,
> it'll be good to have a set of messages coming from libhuge to provide
> guidance on what's happening...

Yep, exactly. And that's part of why I think having the "best effort"
heuristic be the default may make sense. That way, we won't require that
the user have 256M worth of huge pages available on their system, but it
will still work with fewer (presuming the BSS is small enough).

The try-me mode could have the following meanings:

0 - don't do anything beyond what the linker script says, which for
    non-relinked binaries is the default (no PRELOAD) behavior and for
    relinked binaries should also be the default behavior (ELFMAP=y)
1 - best effort, so put as much in huge pages as possible, sacrificing
    hugetlbfs_vaddr_granularity() chunks of the address space at a time
2 - try, but only use up hugetlbfs_vaddr_granularity() chunks of the
    address space if you can fill them. This could be useful if, for
    instance, an application runs out of address space with 1.

Thanks,
Nish

--
Nishanth Aravamudan <[EMAIL PROTECTED]>
IBM Linux Technology Center
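
(And a rough sketch of how the three modes could pick the range -- invented
names again, with "gran" standing in for hugetlbfs_vaddr_granularity(); the
real plumbing would have to live in the library and read the mode from the
environment:)

#include <stdio.h>

#define ALIGN_UP(x, a)   (((x) + (a) - 1) & ~((a) - 1))
#define ALIGN_DOWN(x, a) ((x) & ~((a) - 1))

/*
 * Pick the chunk of address space to back with huge pages, depending on
 * the try-me mode.  Returns non-zero if [*hstart, *hend) should be
 * remapped.
 */
static int pick_huge_range(unsigned long bss_start, unsigned long bss_end,
                           unsigned long gran, int mode,
                           unsigned long *hstart, unsigned long *hend)
{
        if (mode == 0)
                return 0;       /* leave the mapping as the linker script set it up */

        *hstart = ALIGN_UP(bss_start, gran);    /* first region boundary inside the BSS */

        if (mode == 1)
                /* best effort: also take the partial region at the top,
                 * sacrificing the rest of that region's address space */
                *hend = ALIGN_UP(bss_end, gran);
        else
                /* mode 2: only regions the BSS fills completely */
                *hend = ALIGN_DOWN(bss_end, gran);

        return *hend > *hstart;
}

int main(void)
{
        unsigned long bss_start = 0xF400000UL;                  /* 244M */
        unsigned long bss_end   = bss_start + (257UL << 20);    /* 257M BSS, ends at 501M */
        unsigned long gran      = 0x10000000UL;                 /* 256M */
        unsigned long s, e;
        int mode;

        for (mode = 0; mode <= 2; mode++) {
                if (pick_huge_range(bss_start, bss_end, gran, mode, &s, &e))
                        printf("mode %d: huge pages over %luM-%luM\n",
                               mode, s >> 20, e >> 20);
                else
                        printf("mode %d: nothing remapped\n", mode);
        }
        return 0;
}

For the 257M BSS discussed above, mode 2 does nothing since no full region
fits, while mode 1 takes the whole 256M-512M region.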
