On Fri, Aug 29, 2008 at 09:51:35AM -0500, Adam Litke wrote:
> On Fri, 2008-08-29 at 15:42 +1000, David Gibson wrote:
> > On Wed, Aug 27, 2008 at 06:34:41PM +0000, Adam Litke wrote:
[snip]
> > Hrm.  Something about the structure of all this bothers me, but I'm
> > going to have to think some more on how I think it should be done.  It
> > seems to me like this draft has too much of a dichotomy between the
> > default / non-default pagesize.
> >
> > I'd envisage instead, something where the available mountpoints and
> > pagesizes can be queried.  The functions for explicitly allocating
> > hugepages (unlinked_fd() and so forth) would have new versions which
> > take an explicit pagesize / mountpoint (not sure which).  Obviously
> > the ones that just use a default pagesize would be kept too, for
> > compatibility but they'd just be a wrapper around the more general
> > version.  Possibly a function to change the default pagesize (from
> > amongst the available ones) at runtime too.  Like I say, need to
> > think about this some more.
>
> All of the features you suggest can be easily added by an already
> planned follow-on patch series.  For example, the default size could be
> changed through specification of an environment variable.  Adding the
> explicit page size selection functions (for unlinked_fd, et al) is also
> trivial.

Ok, I'm glad to hear that.

> The complexity you refer to as "a dichotomy between the default /
> non-default pagesize" has been specifically designed.  To be compatible
> with older kernels (those with only one page size) the default page size
> must be handled in a compatibility mode.  For example, the counters need
> to be read from /proc/meminfo because they may not be available in
> sysfs.  Also, to preserve compatibility with any applications that
> aren't accustomed to specifying a page size, we must ensure that the
> default page size in libhugetlbfs is also the kernel default size.
> Unfortunately choosing the default size isn't as simple as querying
> meminfo for the system, default page size.  If the user hasn't mounted a
> filesystem with a size matching the meminfo size, we must choose a
> default arbitrarily.

Oh, certainly, I have no problem with the selection of a default size
and mountpoint in this manner.  Clearly we need to do this for
compatibility.  It just seems like there are places that aren't locked
in by compatibility where the fundamental option seems to be
default/non-default rather than pagesize directly, which would seem to
make more sense.

> Some of this "default size" stuff will get worked out of the code more
> as I add functions for requesting specific sizes ie.
> hugetlbfs_unlinked_fd().  The current plan is to have a separate

Ok, that sounds good.

> library/executable (akin to hugectl) that will be used to query page
> sizes, mount points, and pool counter values.  Since we have adopted the

Ok, but presumably there will also be a library interface so that
programs can query this for themselves?

> limitation that only one mountpoint can exist per page size, I feel it
> is much more user-friendly to use the page size as a handle rather than
> the mount point.  The page size is more important to the user than a
> filesystem mount point that we are actually trying to abstract away from
> them.

Yes, that's true in general.
But are there cases where multiple mountpoints of the same page size
aren't interchangeable?  The mountpoints will have separate quota
allocation, won't they?  And they could have different permissions.

> Your questions reflect the fact that I have neglected to detail my plan
> to complete the multiple page size support.  Hopefully my responses
> above have helped in that regard.  I feel as if we are basically on the
> same page design-wise.  I have been thinking about this interface since
> June and I am confident that it is fundamentally designed to solve the
> intricacies of multiple page sizes _and_ backwards compatibility as
> logically and as simply as is possible.  I have no doubt that
> improvements are possible to my implementation, but I don't think a
> redesign is necessary.

Ok.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson

_______________________________________________
Libhugetlbfs-devel mailing list
Libhugetlbfs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/libhugetlbfs-devel
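
For readers following the thread, the compatibility logic discussed
above (read the default size from /proc/meminfo, use sysfs where it
exists, and fall back to an arbitrary choice when no mounted filesystem
matches the kernel default) might be sketched roughly as below.  This is
only an illustration in Python, not libhugetlbfs code (the library is
C).  Every function name here is hypothetical, and "pick the smallest
mounted size" is an assumed stand-in for the "arbitrary" choice, which
the thread does not specify.

```python
import os
import re

# Per-size directories (e.g. hugepages-2048kB) exist only on newer kernels.
SYSFS_HUGEPAGES = "/sys/kernel/mm/hugepages"


def default_size_from_meminfo(meminfo_text):
    """Return the kernel default huge page size in bytes, or None if absent."""
    match = re.search(r"^Hugepagesize:\s+(\d+)\s*kB", meminfo_text, re.MULTILINE)
    return int(match.group(1)) * 1024 if match else None


def sizes_from_sysfs_names(dir_names):
    """Map sysfs entries such as 'hugepages-2048kB' to sorted sizes in bytes."""
    sizes = []
    for name in dir_names:
        match = re.fullmatch(r"hugepages-(\d+)kB", name)
        if match:
            sizes.append(int(match.group(1)) * 1024)
    return sorted(sizes)


def available_sizes(meminfo_text, sysfs_names=None):
    """List page sizes; with no sysfs entries, fall back to meminfo alone."""
    if sysfs_names:
        return sizes_from_sysfs_names(sysfs_names)
    default = default_size_from_meminfo(meminfo_text)
    return [default] if default is not None else []


def choose_default(meminfo_text, mounted_sizes):
    """Prefer the kernel default when a filesystem of that size is mounted.

    Otherwise choose arbitrarily; the smallest mounted size is an assumption
    made for this sketch only.
    """
    kernel_default = default_size_from_meminfo(meminfo_text)
    if kernel_default in mounted_sizes:
        return kernel_default
    return min(mounted_sizes) if mounted_sizes else None


def read_system():
    """Gather the real inputs on a Linux host (hypothetical helper)."""
    with open("/proc/meminfo") as f:
        meminfo = f.read()
    names = os.listdir(SYSFS_HUGEPAGES) if os.path.isdir(SYSFS_HUGEPAGES) else None
    return meminfo, names
```

Keeping the parsing separate from the file access means the same code
path serves both the old-kernel case (meminfo only) and the
multiple-size case, which is roughly the "wrapper around the more
general version" shape suggested earlier in the thread.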