On Fri, Nov 16, 2012 at 09:38:55AM +0100, telenn barz wrote:
> On Fri, Nov 16, 2012 at 1:13 AM, David Gibson
> <da...@gibson.dropbear.id.au> wrote:
>
> > On Thu, Nov 15, 2012 at 04:31:44PM -0500, Eric B Munson wrote:
> > > On 2012-11-15 14:44, telenn barz wrote:
> > > > On Thu, Nov 15, 2012 at 3:34 PM, Eric B Munson <emun...@mgebm.net>
> > > > wrote:
> > > >
> > > >> On 2012-11-15 03:42, telenn barz wrote:
> > > >>
> > > >>> Making some tests with libhugetlbfs on kernel 2.6.34, backing
> > > >>> automatically the heap, we realized that the fallback mechanism
> > > >>> is all-or-nothing : if there's not enough hugepages, the whole
> > > >>> asked memory is backed by base pages.
> > > >>>
> > > >>> We expected libhugetlbfs to mix hugepages / basepages in such a
> > > >>> case.
> > > >>> Would there be any issue with this requirement ?
> > > >>>
> > > >>> Thanks,
> > > >>> Telenn
> > > >>
> > > >> Telenn,
> > > >>
> > > >> This may be possible on x86/x86_64 but it isn't possible on POWER
> > > >> due to memory segment page size restrictions. To avoid having arch
> > > >> dependent morecore hooks we have not attempted this.
> > > >
> > > > Page size restrictions on POWER ? Could you please tell more about
> > > > this - in a few words ?
> > >
> > > Sure, PPC view memory in segments, for memory below 4GB these segments
> > > are 256MB, and I don't recall the segment size over 4GB.
> >
> > 1TB. From a hardware PoV, the OS can use 256MB or 1TB segments as it
> > chooses (as long as they're naturally aligned). But Linux (at
> > present) always uses 256MB segments below 4GB and 1TB segments above.
> >
>
> This segment notion, wouldn't it be for hash page table based
> powerpc only ?

That's correct, at least as far as the hardware goes.

> What about Freescale e500 family cores (bookE) ? (I don't see any segment
> notion for them - maybe I missed something)

No, the BookE and other embedded chips don't have any segments in
hardware.  However, I think we still have the segment-based address
space restrictions in the kernel, because making them conditional on
the low-level CPU type would actually be quite messy.  That may have
changed more recently; I haven't looked at the code for some time.

> > > Inside a
> > > segment the page size must be uniform. So if morecore wanted to handle
> > > your version of fallback it would also have to know where it can place
> > > huge pages still.
> >
> > In addition I don't think the glibc morecore callback lets us control
> > the fallback that closely. IIRC, we just fail the morecore, and glibc
> > falls back on an mmap(). So even on x86 to do partial fallback we'd
> > need a different (and much more complex) implementation of hugepage
> > malloc() backing than just intercepting the morecore hook.
> >
>
> Are you talking about a full re-implementation of malloc within
> libhugetlbfs ?

Quite possibly.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

_______________________________________________
Libhugetlbfs-devel mailing list
Libhugetlbfs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/libhugetlbfs-devel
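
For anyone following the thread, the segment constraint discussed above
can be sketched roughly as follows. This is only an illustration, not
libhugetlbfs or kernel code: every name in it (region_of,
regions_touched, can_map, in_use) is hypothetical, and it assumes the
layout described in this thread (256MB segments below 4GB, 1TB segments
above). It also assumes the first 1TB region is clipped to start at 4GB,
which is a simplification made for the sketch.

```python
# Sketch only: models the layout described in this thread (256MB segments
# below 4GB, 1TB segments above). Every name here is hypothetical.

MB = 1 << 20
GB = 1 << 30
TB = 1 << 40

LOW_LIMIT = 4 * GB     # boundary between the two segment sizes
LOW_SEG = 256 * MB     # segment size below LOW_LIMIT
HIGH_SEG = 1 * TB      # segment size at or above LOW_LIMIT


def region_of(addr):
    """Return (start, end) of the uniform-page-size region holding addr."""
    if addr < LOW_LIMIT:
        start = addr - (addr % LOW_SEG)
        return start, start + LOW_SEG
    start = addr - (addr % HIGH_SEG)
    # Assumption: the first 1TB region is clipped to begin at 4GB so that
    # it does not overlap the 256MB segments below it.
    return max(start, LOW_LIMIT), start + HIGH_SEG


def regions_touched(addr, length):
    """List the (start, end) regions overlapped by [addr, addr + length)."""
    regions = []
    cur, end = addr, addr + length
    while cur < end:
        start, stop = region_of(cur)
        regions.append((start, stop))
        cur = stop  # jump to the first byte of the next region
    return regions


def can_map(addr, length, page_size, in_use):
    """Check that a mapping would not mix page sizes inside any region.

    in_use maps a region's start address to the page size already used
    there; regions that are absent are treated as empty.
    """
    return all(in_use.get(start, page_size) == page_size
               for start, _ in regions_touched(addr, length))
```

With in_use = {0: 4096}, meaning the first 256MB segment already holds
4K pages, can_map(0, 16 * MB, 16 * MB, in_use) is False while
can_map(256 * MB, 16 * MB, 16 * MB, in_use) is True (16MB is used here
only as an example huge page size). A morecore that wanted to mix page
sizes would need this kind of bookkeeping on top of whatever glibc
allows.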