On Fri, Nov 16, 2012 at 09:38:55AM +0100, telenn barz wrote:
> On Fri, Nov 16, 2012 at 1:13 AM, David Gibson
> <da...@gibson.dropbear.id.au> wrote:
> 
> > On Thu, Nov 15, 2012 at 04:31:44PM -0500, Eric B Munson wrote:
> > > On 2012-11-15 14:44, telenn barz wrote:
> > > > > On Thu, Nov 15, 2012 at 3:34 PM, Eric B Munson <emun...@mgebm.net>
> > > > > wrote:
> > > >
> > > >> On 2012-11-15 03:42, telenn barz wrote:
> > > >>
> > > > >>> Running some tests with libhugetlbfs on kernel 2.6.34, with the
> > > > >>> heap backed automatically, we realized that the fallback
> > > > >>> mechanism is all-or-nothing: if there are not enough hugepages,
> > > > >>> the whole requested memory is backed by base pages.
> > > > >>>
> > > > >>> We expected libhugetlbfs to mix hugepages / basepages in such a
> > > > >>> case.
> > > > >>> Would there be any issue with this requirement ?
> > > >
> > > >>> Thanks,
> > > >>> Telenn
> > > >>
> > > >> Telenn,
> > > >>
> > > > >> This may be possible on x86/x86_64 but it isn't possible on POWER
> > > >> due to memory segment page size restrictions.  To avoid having arch
> > > >> dependent morecore hooks we have not attempted this.
> > > >
> > > >  Page size restrictions on POWER ? Could you please tell more about
> > > > this - in a few words ?
> > >
> > >
> > > > Sure, PPC views memory in segments.  For memory below 4GB these
> > > > segments are 256MB, and I don't recall the segment size over 4GB.
> >
> > 1TB.  From a hardware PoV, the OS can use 256MB or 1TB segments as it
> > chooses (as long as they're naturally aligned).  But Linux (at
> > present) always uses 256MB segments below 4GB and 1TB segments above.
> >
> 
> This segment notion, wouldn't it be for hash page table based
> powerpc only ?

That's correct, at least as far as the hardware goes.
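
To make the segment layout quoted above concrete, here is a minimal
sketch in C.  It assumes only what was said above: 256MB segments below
4GB and 1TB segments at and above 4GB.  The helper names are
hypothetical and are not taken from the kernel or from libhugetlbfs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout, as described in this thread:
 * 256MB segments below 4GB, 1TB segments at and above 4GB. */
#define SEG_256M  (UINT64_C(1) << 28)
#define SEG_1T    (UINT64_C(1) << 40)
#define LOW_LIMIT (UINT64_C(1) << 32)

/* Hypothetical helper: size of the segment that contains addr. */
static uint64_t segment_size(uint64_t addr)
{
    return addr < LOW_LIMIT ? SEG_256M : SEG_1T;
}

/* Hypothetical helper: true if a and b fall in the same segment. */
static bool same_segment(uint64_t a, uint64_t b)
{
    /* Addresses on opposite sides of 4GB never share a segment. */
    if ((a < LOW_LIMIT) != (b < LOW_LIMIT))
        return false;
    uint64_t size = segment_size(a);
    return a / size == b / size;
}

/* Hypothetical helper: true if [start, start + len) stays inside one
 * segment, i.e. the whole range could use a single page size. */
static bool range_in_one_segment(uint64_t start, uint64_t len)
{
    assert(len > 0);
    return same_segment(start, start + len - 1);
}
```

A mapping for which range_in_one_segment() returns false would span
segments that may each be tied to a different page size.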

> What about Freescale e500 family cores (bookE) ? (I don't see any segment
> notion for them - maybe I missed something)

No, the BookE and other embedded chips don't have any segments in
hardware.  However, I think we still have the segment-based address
space restrictions in the kernel, because making them conditional on
the low-level CPU type would actually be quite messy.  That may have
changed more recently; I haven't looked at the code for some time.

> > > > Inside a segment the page size must be uniform.  So if morecore
> > > > wanted to handle your version of fallback it would also have to
> > > > know where it can still place huge pages.
> >
> > In addition I don't think the glibc morecore callback lets us control
> > the fallback that closely.  IIRC, we just fail the morecore, and glibc
> > falls back on an mmap().  So even on x86 to do partial fallback we'd
> > need a different (and much more complex) implementation of hugepage
> > malloc() backing than just intercepting the morecore hook.
> >
> Are you talking about a full re-implementation of malloc within
> libhugetlbfs ?

Quite possibly.
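
As a rough illustration of what per-chunk (rather than all-or-nothing)
fallback could look like, here is a minimal sketch in C.  It is not how
libhugetlbfs works: it uses anonymous mmap() with MAP_HUGETLB instead of
hugetlbfs files, the 2MB chunk size is an assumption (a common x86_64
huge page size), alloc_chunk() and free_chunk() are hypothetical names,
and it ignores the POWER segment restrictions discussed above.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Assumed chunk size: one 2MB huge page (common on x86_64). */
#define CHUNK_SIZE (2UL * 1024 * 1024)

struct chunk {
    void *addr;  /* NULL if both attempts failed */
    int   huge;  /* 1 if backed by a huge page, 0 if by base pages */
};

/* Try a huge page first; fall back to base pages for this chunk only. */
static struct chunk alloc_chunk(void)
{
    struct chunk c = { NULL, 0 };
    void *p = MAP_FAILED;

#ifdef MAP_HUGETLB
    p = mmap(NULL, CHUNK_SIZE, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
#endif
    if (p != MAP_FAILED) {
        c.addr = p;
        c.huge = 1;
        return c;
    }

    /* No huge page available (or wrong size): use base pages instead. */
    p = mmap(NULL, CHUNK_SIZE, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p != MAP_FAILED)
        c.addr = p;
    return c;
}

/* Release a chunk obtained from alloc_chunk(). */
static void free_chunk(struct chunk c)
{
    if (c.addr != NULL) {
        int rc = munmap(c.addr, CHUNK_SIZE);
        assert(rc == 0);
        (void)rc;
    }
}
```

A real malloc() replacement would still need all the bookkeeping on top
of this (free lists, splitting, coalescing), which is where the
complexity mentioned above comes from.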

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson

_______________________________________________
Libhugetlbfs-devel mailing list
Libhugetlbfs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/libhugetlbfs-devel
