On 05.03.2007 [14:58:45 +1100], David Gibson wrote:
> On Mon, Feb 26, 2007 at 11:13:24AM -0800, Nishanth Aravamudan wrote:
> > We've found that on Power some 32-bit binaries, when relinked via
> > libhugetlbfs, run out of address space. Add some heuristics to warn the
> > user if the memory size of the segments (plus the size of the largest
> > segment) exceeds 80% of addressable memory.
> 
> Sorry, didn't really look at this closely before.
> 
> At what point exactly are we running out of address space?  As we
> actually remap the segments?  Or later, when the program uses malloc()
> or other mappings?  Either way I assume it's the extra buffer space we
> need between hugepage and normal page segments that's causing us to
> run out of address space where we didn't before.

Right, that's the correct assumption: the padding needed on certain
archs to satisfy hugepage allocation comes in *addition* to significant
hugepage consumption.

> I'm just a little uneasy about heuristic thresholds like this, so I'd
> like to understand the situation better to see if we can have a more
> precise warning condition.

Yes, this is purely a heuristic, so if it's wrong, it's wrong. Really,
the idea is -- run a 32-bit app. If it fails, then re-run with DEBUG on
and see if maybe the library notes that the segments are rather large
for a 32-bit binary and the address space may be exhausted.
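
To make the heuristic concrete, here is a rough sketch of the check as
described in the patch summary: sum the memsz of the segments, add the
largest one again, and compare against a threshold. The function name
and the array-based interface are made up for illustration and are not
what elflink.c actually does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the patch: returns 1 if the summed
 * memsz of all segments, plus the memsz of the largest segment, exceeds
 * the given threshold, and 0 otherwise.
 */
static int segments_exceed_threshold(const uint64_t *memsz, size_t nsegs,
                                     uint64_t threshold)
{
    uint64_t total = 0, largest = 0;
    size_t i;

    assert(memsz != NULL || nsegs == 0);

    for (i = 0; i < nsegs; i++) {
        total += memsz[i];
        if (memsz[i] > largest)
            largest = memsz[i];
    }

    /* Counting the largest segment twice follows the patch description. */
    return total + largest > threshold;
}
```

With segments of 1G and 2G and the 0xCCCCCCCC threshold, the total
considered is 5G, so a warning would be printed; two segments of 256M
and 512M stay well below it.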

> > diff --git a/elflink.c b/elflink.c
> > index a53c649..bed8ef6 100644
> > --- a/elflink.c
> > +++ b/elflink.c
> > @@ -55,6 +55,9 @@
> >  #define ELF_ST_TYPE(x)  ELF64_ST_TYPE(x)
> >  #endif
> >  
> > +/* 80% of 32-bit addressable memory */
> > +#define MEMSZ_THRESHOLD_32 0xCCCCCCCCUL
> 
> Not necessarily.  That's correct for 32-bit processes running on a
> 64-bit powerpc kernel, but not always on other archs.  In particular
> if the kernel is 32-bit, there will generally only be 2-3G of
> addressable memory, because the kernel inhabits every process's
> address space.

Good point. Well, I guess we could say that if the total memsz of a
program's segments is anywhere close to 2.5G, and the program dies
unexpectedly when remapped, then this might at least be a cause. Trying
the same program as a 64-bit binary would help narrow it down. That's
all we're trying to achieve.

> This will be a little tricky to fix, since on a number of archs
> (including i386) the amount of addressable memory can depend on the
> kernel config.  We'll need to find a way to determine the address
> space limit at run time, I don't know how off the top of my head.

Another good point. I don't know if there is any way that the kernel
communicates that information to userspace (on i386, wrt the
CONFIG_VMSPLIT options).
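
One possible approximation, purely a guess and not something settled
in this thread: on Linux the main stack usually sits near the top of
the user address space, so the end address of the "[stack]" entry in
/proc/self/maps could serve as an estimate of the limit. The sketch
below only parses a single maps line; the helper name is hypothetical,
and it assumes the kernel labels the stack mapping and that any gap
above the stack is small enough to ignore.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: if `line` is a /proc/<pid>/maps entry labelled
 * "[stack]", store the end address of that mapping in *end and return 1.
 * Return 0 for any other line or on a parse failure.
 */
static int stack_end_from_maps_line(const char *line, unsigned long long *end)
{
    unsigned long long start, stop;

    assert(line != NULL && end != NULL);

    if (strstr(line, "[stack]") == NULL)
        return 0;
    /* Each maps line begins with "start-end" in hexadecimal. */
    if (sscanf(line, "%llx-%llx", &start, &stop) != 2)
        return 0;

    *end = stop;
    return 1;
}
```

A caller would read /proc/self/maps line by line with fgets() and keep
the value from the first line for which this returns 1. The result is
only an estimate of the limit, so it should be treated as such.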

> > +
> >  #ifdef __syscall_return
> >  #ifdef __i386__
> >  /* The normal i386 syscall macros don't work with -fPIC :( */
> > @@ -231,6 +234,63 @@ static void assemble_path(char *dst, const char *fmt, ...)
> >     }
> >  }
> >  
> > +#if defined(__i386__) || defined(__powerpc32__)
> 
> We can use __LP64__ to check for a non-64-bit system here instead of
> the individual architecture #defines, we already do that elsewhere in
> elflink.c

I'll fix this up.

> Although come to that, since we'll need to determine the threshold at
> run-time anyway, we might as well always run this check, even on
> 64-bit systems.  It would just be very unlikely to trigger there,
> because the threshold would be so high.

Yep, and I guess what we really could do, to make it clear, is to modify
the threshold for 32-bit vs. 64-bit (if we remove the absolute value).
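
As a sketch of what dropping the absolute value might look like, the
threshold can be computed as 80% of whatever address-space limit is in
effect, with __LP64__ only choosing a fallback limit. The macro and
function names here are invented for illustration, and the 64-bit
fallback is just a placeholder, not a real address-space size.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed fallback limits for when nothing better is known at run time. */
#ifdef __LP64__
#define DEFAULT_ADDR_LIMIT UINT64_MAX          /* placeholder for 64-bit */
#else
#define DEFAULT_ADDR_LIMIT (UINT64_C(1) << 32) /* full 32-bit space */
#endif

/*
 * Hypothetical helper: 80% of the given address-space limit. Dividing
 * before multiplying avoids overflow for very large limits.
 */
static uint64_t memsz_threshold(uint64_t addr_limit)
{
    assert(addr_limit > 0);
    return addr_limit / 5 * 4;
}
```

For a 4G limit this gives 0xCCCCCCCC, the same value as the constant in
the patch; for a 3G limit it gives a correspondingly lower threshold.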

Thanks for the comments, David. I'll respin this on top of my current
tree.

Thanks,
Nish

-- 
Nishanth Aravamudan <[EMAIL PROTECTED]>
IBM Linux Technology Center

_______________________________________________
Libhugetlbfs-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/libhugetlbfs-devel