On Thu, 2007-06-14 at 09:12 -0500, Dave Marquardt wrote:
> "Roland" == Roland Mainz <roland.mainz at nrubsig.org> writes:
> 
> Roland> Dave Marquardt wrote:
> >> 
> >> "dsc" == David Comay <David.Comay at Sun.COM> writes:
> >> 
> dsc> Here are my comments for round "three":
> >> 
> dsc> usr/src/cmd/ksh/Makefile.com
> >> 
> dsc> Lines 101-109 - As I indicated in an earlier review, I don't
> dsc> believe this is necessary.  Both Nevada and the Solaris 10
> dsc> patch gate do large pages automatically (or so-called out of
> dsc> the box) and so including these options is unnecessary.
> dsc> However, I've cc'ed Bart Smaalders who is an expert in this
> dsc> area who can suggest whether or not it makes sense to include
> dsc> this.
> >> 
> >> Just to be clear, this issue is about using -xpagesize_heap=64K and
> >> -xpagesize_stack=64K on SPARC.
> 
> Roland> Right...
> 
Just to be even clearer, is this for all SPARC machines or just
UltraSPARC I and II machines?

With the TLB architecture on US-III+, US-IV, and US-IV+ we tend not to
use 64K pages often, because of the restriction that essentially only two
page sizes within a process work well together.  US-III may not work well
even with two page sizes, and thus we default to 8k in cpu_fiximp() in
sun4u/cpu/us3_cheetah.c.

I think the Niagara CPUs can handle all page sizes equally well.
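
(If you want to see for yourself which page sizes a particular machine
supports, getpagesizes(3C) will tell you from userland.  Here's a minimal
sketch -- just an illustration, not part of the ksh93 changes:)

    #include <sys/mman.h>
    #include <stdio.h>

    int
    main(void)
    {
            size_t sizes[16];
            int i, n;

            /* ask libc which MMU page sizes this machine supports */
            n = getpagesizes(sizes, 16);
            for (i = 0; i < n; i++)
                    printf("%zu\n", sizes[i]);
            return (0);
    }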

Take a look at the following routines/variables to learn more about what
page sizes are used by default:

map_pgsz
max_uheap_lpsize
default_uheap_lpsize
max_ustack_lpsize
default_ustack_lpsize
max_privmap_lpsize
max_uidata_lpsize
max_utext_lpsize
max_shm_lpsize

The OS really should be picking the right (or at least decent) size for
the given application on the given hardware.  
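
To give a rough feel for what those defaults do, here is a deliberately
simplified sketch in C.  This is NOT the real map_pgsz()/vm_dep.c code --
the helper name and the stand-in "tunables" are made up for illustration --
but the shape is the same: stay on 8K when the machine is too small for
large pages to pay off, otherwise use the configured default large page
size once the segment is big enough for it.

    /*
     * Illustrative sketch only -- NOT the real map_pgsz()/vm_dep.c code.
     * The helper name and the stand-in "tunables" below are made up; the
     * real ones are the variables listed above.
     */
    #include <stdio.h>
    #include <stddef.h>

    #define PAGESIZE_8K             (8UL * 1024)

    /* stand-ins for default_uheap_lpsize / privm_lpg_min_physmem */
    static size_t default_heap_lpsize = 64UL * 1024;        /* e.g. 64K */
    static size_t lpg_min_physmem = 131072;                 /* 1GB in 8K pages */

    static size_t
    pick_heap_pgsz(size_t heaplen, size_t physmem_8k_pages)
    {
            /* sanity control: too little memory installed -> stay on 8K */
            if (physmem_8k_pages < lpg_min_physmem)
                    return (PAGESIZE_8K);

            /* don't use a page size larger than the segment itself */
            if (heaplen < default_heap_lpsize)
                    return (PAGESIZE_8K);

            return (default_heap_lpsize);
    }

    int
    main(void)
    {
            /* a 1MB heap on a 4GB machine (4GB == 524288 8K pages) */
            printf("%zu\n", pick_heap_pgsz(1024UL * 1024, 524288));
            return (0);
    }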

One caveat seems to be mentioned in vm_dep.c above map_pgszheap:

    /*
     * Sanity control. Don't use large pages regardless of user
     * settings if there's less than priv or shm_lpg_min_physmem memory
     * installed. The units for this variable is 8K pages.
     */
    pgcnt_t shm_lpg_min_physmem = 131072;                   /* 1GB */
    pgcnt_t privm_lpg_min_physmem = 131072;                 /* 1GB */
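
(For scale: those thresholds are counted in 8K pages, so 131072 pages x 8K
= 1GB of installed memory, which matches the /* 1GB */ comments.)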

I personally don't know the code that well, but I do know the TLB specs,
which are a large determining factor in which page sizes should be used.
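
One more aside before I sign off, since the thread is really about the
-xpagesize_* compiler options: as far as I know those options just arrange
for the equivalent of memcntl(2)/MC_HAT_ADVISE requests at program startup
(the same mechanism ppgsz(1) uses), so the same thing can be asked for
explicitly at runtime.  A minimal sketch -- assuming a Solaris box where
64K is actually a supported page size, with error handling kept to a
perror():

    #include <sys/types.h>
    #include <sys/mman.h>
    #include <stdio.h>

    int
    main(void)
    {
            struct memcntl_mha mha;

            /* ask for 64K pages for the heap (bss + brk area) ... */
            mha.mha_cmd = MHA_MAPSIZE_BSSBRK;
            mha.mha_flags = 0;
            mha.mha_pagesize = 64 * 1024;
            if (memcntl(NULL, 0, MC_HAT_ADVISE, (caddr_t)&mha, 0, 0) == -1)
                    perror("memcntl(heap)");

            /* ... and for the main thread's stack */
            mha.mha_cmd = MHA_MAPSIZE_STACK;
            if (memcntl(NULL, 0, MC_HAT_ADVISE, (caddr_t)&mha, 0, 0) == -1)
                    perror("memcntl(stack)");

            return (0);
    }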

Hope this helps,

Mike


> >> Let me just paste the lines from the file here:
> >> 
> >> 101 # Use 64k pages on SPARC (32bit+64bit ; based on benchmarking on an Ultra5
> >> 102 # and a Blade1000 this is the optimum for small/medium-sized datasets (512k
> >> 103 # pages are not available on Niagara CPUs and 4M pages are far too large)).
> >> 104 # (Note that the stack should always be mapped with 64k pages (or better),
> >> 105 # heap is optional. Both heap and stack should use the same stacksize since
> >> 106 # some MMU types cannot handle more than one largepage size efficiently)
> >> 107 sparc_CFLAGS   += -_cc=-xpagesize_stack=64K -_cc=-xpagesize_heap=64K
> >> 108 sparcv9_CFLAGS += -_cc=-xpagesize_stack=64K -_cc=-xpagesize_heap=64K
> >> 
> >> This isn't too bad as far as it goes, but there are some potential
> >> issues.
> >> 
> >> First, as David mentioned, with S10U1 and Nevada since about 2 years
> >> ago, Solaris has used large pages "out of the box" (LPOOB), i.e. the
> >> kernel picks large pages for application heap, stack and other
> >> anonymous memory based on the system's TLB architecture.  I was one of
> >> the developers of this code and integrated it into Solaris.  This
> >> initial code was tuned for UltraSPARC-III/III+/IV/IV+ for sun4u and T1
> >> (Niagara 1) for sun4v.  The code has tunables that allow you to tune
> >> it differently, but I've not heard of anyone who has done so.
> 
> Roland> Well, this part is horribly underdocumented and has large holes
> Roland> filled with hungry komodo dragons (if you "poke" around in the values
> Roland> without knowing what the side-effects are...) ...
> Roland> ... is there no script/tool which can be used to get/set such
> Roland> tunables?
> 
> I agree.  The tunables are there for kernel developers and support
> folks to tweak should the need arise, and as such, they're
> undocumented.
> 
> >> More recently, more work was done for LPOOB, but I'm not as familiar
> >> with that code.
> 
> Roland> Ok.. but even for B61 64k pages are not used by default (or $ ksh93 -c
> Roland> 'pmap -s -x $$ ; true' # shows wrong values (note the "true",
> Roland> otherwise ksh93 will |exec()| the last command and pmap shows the
> Roland> page map for itself)).
> 
> So it appears there hasn't been tuning done to help this particular
> case.
> 
> Roland> [snip]
> >> Another issue is that you may still have some number of stray 8K pages
> >> due to alignment issues, e.g. the heap doesn't start on a 64K
> >> boundary.  It's possible you take care of that with a mapfile, but I
> >> haven't reviewed all the Makefiles to see if that's the case.
> 
> Roland> Right now we don't use mapfiles for that since we only map stack&&heap
> Roland> with 64k pages (AFAIK you're thinking about things like text/data
> Roland> segments, right ?). We still have "stray" 8k pages coming from
> Roland> allocations before the "-xpagesize_*=64k"-option has an effect - but
> Roland> AFAIK we can ignore this since this memory is not used in the "hot
> Roland> codepaths".
> 
> I'm specifically thinking of the BSS & heap boundary.  Heap starts at
> the end of BSS, and last I knew we weren't mapping BSS on large pages.
> 
> >> Finally, you've locked yourself into 64K pages, and it's possible the
> >> kernel will continue to be improved in the selection of large pages.
> >> These improvements may not work if you've already selected 64K pages,
> >> or perhaps the selection of 64K pages will interfere with good
> >> performance.
> 
> Roland> Erm, based on the results we saw, 64k pages are the optimum... and
> Roland> as the comment in the Makefile says: 8k pages (the default) are not
> Roland> the optimum, 512k pages are too large and not available everywhere,
> Roland> and 4M and 256M pages are like hunting ducks with an M1 Abrams/TUSK.
> Roland> The scenario where you may be right is if the heap usage grows to a
> Roland> size where 4M pages become more useful (note: we ship a 64bit
> Roland> version of ksh93 for this case... remember, perl and ksh93 are used
> Roland> for postprocessing and "glue" for bioinformatics applications where
> Roland> the datasets quickly grow beyond 4GB (and yes, the AST memory
> Roland> allocator can handle that properly)); then the choice of 64k pages
> Roland> may be sub-optimal, but I assume the kernel isn't that... uhm...
> Roland> "dumb" and overrides the "hint" given by the "-xpagesize_*=64k"
> Roland> option.
> 
> Well, if you use -xpagesize_*=64K, we tend to be conservative and
> think you mean it!  So, no, the kernel won't override your setting of
> 64K if your stack or heap grow large.
> 
> Roland> But in any case the comment for mapping the stack with 64k
> Roland> remains as we did explicit optimisations in this area.
> 
> >> I'd suggest you might look at tuning the LPOOB mechanism to handle
> >> UltraSPARC-II better,
> 
> Roland> And UltraSPARC-I, too - remember some distributions like MarTux support
> Roland> these CPUs (and I wish OpenSolaris would keep support for this because
> Roland> there are huge stockpiles of UltraSPARC-1-based machines at many
> Roland> universities which could be "donated" to students&co.).
> 
> Right, it should be pretty easy to treat these the same way.
> 
> >> since that's one of platforms you care about.
> >> If you tuned LPOOB on UltraSPARC-II to force the use of 64K pages as
> >> soon as the heap and stack were 64K in size or larger, much like is
> >> done on sun4v, all programs that have stacks or heaps of that size or
> >> larger would benefit, and possibly the whole system would benefit, as
> >> it would have fewer TLB misses.
> 
> Roland> Right... but that is a general issue with a far larger scope
> Roland> than this project... right now we only discuss ksh93 and the
> Roland> use of 64k largepages which aims at small/midsized datasets.
> 
> Well, as someone who has worked on performance projects in the past at
> Sun, I'm also interested in the overall performance of systems.  I
> suppose it might be difficult to find a workload where this will hurt.
> 
> I'm not convinced the design and code for adding better LPOOB tuning
> for UltraSPARC I & II is all that large, particularly compared to all
> the work you and others have put into ksh93, but certainly, it's
> outside the scope of the ksh93 project, I agree.
> 
> >> I don't think Sun is interested in
> >> investing in this tuning for US-II, as we don't sell many (any?) US-II
> >> systems these days.  But it would be an interesting community project
> >> if anyone is interested.
> 
> Roland> What about applying the Niagara1 defaults for UltraSPARC-1/2
> Roland> CPUs, too ?
> 
> That was exactly my thought for the first round of tuning for
> UltraSPARC 1&2.
> 
> I've opened an RFE, CR 6569725:
> 
> 6569725 Add better large page out of box support for UltraSPARC I and II
> 
> As I said, I doubt Sun management will want to invest in this area due
> to low return on investment for Sun, but certainly someone else in the
> OpenSolaris community could take it on, or perhaps some Sun employee
> in his spare time.

