On Thu, Nov 06, 2008 at 12:06:23PM +1100, David Gibson wrote: > On Thu, Nov 06, 2008 at 12:51:54AM +0000, Mel Gorman wrote: > > On Thu, Nov 06, 2008 at 10:15:54AM +1100, David Gibson wrote: > > > On Wed, Nov 05, 2008 at 09:43:40AM -0600, Adam Litke wrote: > > > > On Wed, 2008-11-05 at 14:02 +0000, Mel Gorman wrote: > > > > > @@ -127,6 +140,34 @@ void *get_huge_pages(size_t len, ghp_t flags) > > > > > return NULL; > > > > > } > > > > > > > > > > +offset: > > > > > > > > This cache-coloring code (below) is nice. Could we move it into it's > > > > own function though? I like the idea of caching the sysconf value, but > > > > perhaps you could do that as a static variable within the new > > > > function. > > > > > > Yeah, I don't think this belongs in get_huge_pages(). > > > get_huge_pages() is supposed to be the "raw" low-level interface to > > > just grab some hugepages. Offsetting within the page should be up to > > > the caller, if it cares. > > > > > > > Lets take an average programmer who is tasked with converting the > > customer allocator used in OpenMPI[1] to use large pages. He's very > > unlikely to take the cache alignment into account and instead be > > faced by weird slowdowns after porting the code. I don't think > > they'll dig through documentation, oprofile or anything else looking > > for the answer. They'll just assume hugepages are stupid and move > > on. Even if they know about cache line coloring, they'll assume the > > allocator is already doing it. > > > > A caller that is really looking for X number of hugepages is going to align > > the length resulting in no wastage and no colouring. If for some reason > > they are not aligning their length but need the start hugepage-aligned, > > they would use GHP_ALIGN. > > No, I still don't buy it - making assumptions about what the program's > intending based on details of the parameters is just dangerous > territory, even if you're right most of the time. i
Thing is, functions like malloc already do things already. In the kernel it's vaguely similar. kmalloc() for example colours on behalf of the caller because it's the right thing to do where possible. > The basic problem > here is that you're making get_huge_pages() do two different things: > first act as a low-level "grab some huge pages" function, and second > act as a general handy "grab some memory that's in hugepages". This > extension is appropriate for the second, but not the first. > I view it more as the latter - i.e. it's a function for grabbing memory that is in hugepages. Certainly when I was doing a straight-forward converting of STREAM to get_huge_pages(), I got caught by this as the impact of not colouring was so severe. Look at this http://www.csn.ul.ie/~mel/comparison-align-Add.png 1-get_huge_pages() calls get_huge_pages() once and places the 3 arrays used by STREAM in it. 3-get_huge_page-coloured calls get_huge_pages() once for each array - which is what a programmer is most likely to do and what I started with. Note how they perform similarly because this has my patch applied and uses cache coloring. 3-get_huge_page-aligned uses GHP_ALIGN and behaves as get_huge_pages() in 2.1-pre4 currently does. It's performance *blows* and it took me a few hours to pin the problem on lack of colouring because I'm used of allocators handling that sort of issue on my behalf. Instead I chased down blind alleys. > Having the second is a reasonable thing - although to a certain extent > just using malloc() with HUGEPAGE_MORECORE enabled accomplishes this - Ordinarily yes but there is at least one case where a custom allocator is involved and using HUGEPAGE_MORECORE does nothing. get_huge_pages() was meant to help those guys port. > but I think it should be separate from the first. I just don't see > any clear line between this sort of extension and a full blown > hugepage-backed malloc()-style allocator. Again, not a bad idea in > principle, but not the purpose of get_huge_pages(). > I've no interest in writing a hugepage-backed malloc()-style allocator. There are plenty of people who write allocators already. Cache colouring is simple and if the caller is passing in stupid lengths, we might as well help them out instead of throwing them curve balls with cache issues. > > My guess is that many callers will be relatively naive in terms of hugepage, > > will not align their length and will not care about the alignment of the > > start of the buffer. In that case defaulting to colouring is the path of > > least surprise for an allocator API. > > Sure, so provide an API for this, but leave get_huge_pages() as is. It feels like function bloat when the flags parameter is there to get_huge_pages() for purposes just like this. I'm actually going to be surprised if needs GHP_ALIGN. A caller that knows it needs hugepage-aligned memory is also going to know they should pass in a hugepage-aligned length to avoid wastage. What would you suggest as an interface and why it is better than get_huge_pages() doing something sensible by default and providing GHP_ALIGN to avoid the ugliness that is malloc() vs posix_memalign()? > As you say this is "least surprise" *for a function that looks like a > more-or-less general allocator*. It's emphatically *not* least > surprise (nor "minimum magic") for something that looks like it's a > "raw" interface to the kernel facility. I *wrote* the bloody thing and still got caught by the default aligning to the same boundary. Cache colouring is easy to implement but also very easy to forget why it is important. I'd prefer it do something that performs well if it can instead of deliberatly sucking. If I forget about cache colouring, it also means I didn't care what the alignment of the pointer I got back was as long as it points to valid memory. > That was the original intent > of get_huge_pages(), and I think it's still what the name suggests it > is. > I regret that name now. > Let's not ram these two rather different interface styles into a > single function entry-point. If we do start writing a general > hugepage backed allocator, ideally it will use the same code paths > that malloc() would go through when automatic > malloc()-hugepage-backing is enabled. > We just want to provide an interface suitable for people that writing allocators or are porting parts of their application to use hugepages where it is sensible. The former should be already coding to not waste any memory and the latter should be helped as much as possible by defaulting to behaviour whose performance does not suck. -- Mel Gorman Part-time Phd Student Linux Technology Center University of Limerick IBM Dublin Software Lab ------------------------------------------------------------------------- This SF.Net email is sponsored by the Moblin Your Move Developer's challenge Build the coolest Linux based applications with Moblin SDK & win great prizes Grand prize is a trip for two to an Open Source event anywhere in the world http://moblin-contest.org/redirect.php?banner_id=100&url=/ _______________________________________________ Libhugetlbfs-devel mailing list Libhugetlbfs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/libhugetlbfs-devel