On Thu, 2007-06-14 at 09:12 -0500, Dave Marquardt wrote:
> "Roland" == Roland Mainz <roland.mainz at nrubsig.org> writes:
>
> Roland> Dave Marquardt wrote:
> >>
> >> "dsc" == David Comay <David.Comay at Sun.COM> writes:
> >>
> dsc> Here are my comments for round "three":
> >>
> dsc> usr/src/cmd/ksh/Makefile.com
> >>
> dsc> Lines 101-109 - As I indicated in an earlier review, I don't
> dsc> believe this is necessary. Both Nevada and the Solaris 10
> dsc> patch gate do large pages automatically (or so-called out of
> dsc> the box) and so including these options is unnecessary.
> dsc> However, I've cc'ed Bart Smaalders who is an expert in this
> dsc> area who can suggest whether or not it makes sense to include
> dsc> this.
> >>
> >> Just to be clear, this issue is about using -xpagesize_heap=64K and
> >> -xpagesize_stack=64K on SPARC.
>
> Roland> Right...
>

Just to be even clearer, is this for all SPARC machines or just
UltraSparc I and II machines? The TLB architecture on US-III+, US-IV,
US-IV+ tends not to use 64K page sizes often due to the restriction of
having essentially 2 pagesizes within a process that work well together.
US-III may not work well with 2 page sizes though and thus we default to
8k in sun4u/cpu/us3_cheetah.c cpu_fiximp.
The Niagara cpus I think can handle all page sizes equally well.

Take a look at the following routines/variables to learn more about
what page sizes are used by default:

map_pgsz
max_uheap_lpsize
default_uheap_lpsize
max_ustack_lpsize
default_ustack_lpsize
max_privmap_lpsize
max_uidata_lpsize
max_utext_lpsize
max_shm_lpsize

The OS really should be picking the right (or at least decent) size for
the given application on the given hardware. One caveat seems to be
mentioned in vm_dep.c above map_pgszheap:

/*
 * Sanity control. Don't use large pages regardless of user
 * settings if there's less than priv or shm_lpg_min_physmem memory installed.
 * The units for this variable is 8K pages.
 */
pgcnt_t shm_lpg_min_physmem = 131072; /* 1GB */
pgcnt_t privm_lpg_min_physmem = 131072; /* 1GB */

I personally don't know the code that well, but do know the TLB specs
which is a large determining factor in what page sizes should be used.

Hope this helps,
Mike

> >> Let me just paste the lines from file
> >> here:
> >>
> >> 101 # Use 64k pages on SPARC (32bit+64bit ; based on benchmarking on an Ultra5
> >> 102 # and a Blade1000 this is the optimum for small/medium-sized datasets (512k
> >> 103 # pages are not available on Niagara CPUs and 4M pages are far too large)).
> >> 104 # (Note that the stack should always be mapped with 64k pages (or better),
> >> 105 # heap is optional. Both heap and stack should use the same stacksize since
> >> 106 # some MMU types cannot handle more than one largepage size efficiently)
> >> 107 sparc_CFLAGS += -_cc=-xpagesize_stack=64K -_cc=-xpagesize_heap=64K
> >> 108 sparcv9_CFLAGS += -_cc=-xpagesize_stack=64K -_cc=-xpagesize_heap=64K
> >>
> >> This isn't too bad as far as it goes, but there are some potential
> >> issues.
> >>
> >> First, as David mentioned, with S10U1 and Nevada since about 2 years
> >> ago, Solaris has used large pages "out of the box" (LPOOB), i.e.
> >> the kernel picks large pages for application heap, stack and other
> >> anonymous memory based on the system's TLB architecture. I was one of
> >> the developers of this code and integrated it into Solaris. This
> >> initial code was tuned for UltraSPARC-III/III+/IV/IV+ for sun4u and T1
> >> (Niagara 1) for sun4v. The code has tunables that allow you to tune
> >> it differently, but I've not heard of anyone who has done so.
>
> Roland> Well, this part is horribly underdocumented and has large holes
> Roland> filled with hungry komodo dragons (if you "poke" around in the values
> Roland> without knowing what the side-effects are...) ...
> Roland> ... is there no script/tool which can be used to get/set such
> Roland> tunables?
>
> I agree. The tunables are there for kernel developers and support
> folks to tweak should the need arise, and as such, they're
> undocumented.
>
> >> More recently, more work was done for LPOOB, but I'm not as familiar
> >> with that code.
>
> Roland> Ok.. but even for B61 64k pages are not used by default (or $ ksh93 -c
> Roland> 'pmap -s -x $$ ; true' # shows wrong values (note the "true", otherwise
> Roland> ksh93 will |exec()| the last command and pmap shows the page map for
> Roland> itself)).
>
> So it appears there hasn't been tuning done to help this particular
> case.
>
> Roland> [snip]
> >> Another issue is that you may still have some number of stray 8K pages
> >> due to alignment issues, e.g. the heap doesn't start on a 64K
> >> boundary. It's possible you take care of that with a mapfile, but I
> >> haven't reviewed all the Makefiles to see if that's the case.
>
> Roland> Right now we don't use mapfiles for that since we only map stack&&heap
> Roland> with 64k pages (AFAIK you're thinking about things like text/data
> Roland> segments, right ?).
> Roland> We still have "stray" 8k pages coming from
> Roland> allocations before the "-xpagesize_*=64k"-option has an effect - but
> Roland> AFAIK we can ignore this since this memory is not used in the "hot
> Roland> codepaths".
>
> I'm specifically thinking of the BSS & heap boundary. Heap starts at
> the end of BSS, and last I knew we weren't mapping BSS on large pages.
>
> >> Finally, you've locked yourself into 64K pages, and it's possible the
> >> kernel will continue to be improved in the selection of large pages.
> >> These improvements may not work if you've already selected 64K pages,
> >> or perhaps the selection of 64K pages will interfere with good
> >> performance.
>
> Roland> Erm, based on the result we saw the 64k pages are the optimum... and as
> Roland> the comment in the Makefile says: 8k pages (the default) are not the
> Roland> optimum, 512k pages are too large and not available everywhere and 4M
> Roland> and 256M pages are like hunting ducks with an M1 Abrams/TUSK.
> Roland> The scenario where you may be right is that if the heap usage grows to a
> Roland> size where 4M pages may become more useful (note: we ship a 64bit
> Roland> version of ksh93 for this case... remember perl and ksh93 are used for
> Roland> postprocessing and "glue" for bioinformatics applications where the
> Roland> datasets quickly grow beyond 4GB (and yes, the AST memory allocator can
> Roland> handle that properly)) then the choice for 64k pages may be sub-optimal
> Roland> but I assume the kernel isn't that... uhm... "dumb" and overrides the
> Roland> "hint" given by "-xpagesize_*=64k"-option.
>
> Well, if you use -xpagesize_*=64K, we tend to be conservative and
> think you mean it! So, no, the kernel won't override your setting of
> 64K if your stack or heap grow large.
>
> Roland> But in any case the comment for mapping the stack with 64k
> Roland> remains as we did explicit optimisations in this area.
>
> >> I'd suggest you might look at tuning the LPOOB mechanism to handle
> >> UltraSPARC-II better,
>
> Roland> And UltraSPARC-I, too - remember some distributions like MarTux support
> Roland> these CPUs (and I wish OpenSolaris would keep support for this because
> Roland> there are huge stockpiles of UltraSPARC-1-based machines at many
> Roland> universities which could be "donated" to students&co.).
>
> Right, it should be pretty easy to treat these the same way.
>
> >> since that's one of the platforms you care about.
> >> If you tuned LPOOB on UltraSPARC-II to force the use of 64K pages as
> >> soon as the heap and stack were 64K in size or larger, much like is
> >> done on sun4v, all programs that have stacks or heaps of that size or
> >> larger would benefit, and possibly the whole system would benefit, as
> >> it would have fewer TLB misses.
>
> Roland> Right... but that is a general issue with a far larger scope
> Roland> than this project... right now we only discuss ksh93 and the
> Roland> use of 64k largepages which aims at small/midsized datasets.
>
> Well, as someone who has worked on performance projects in the past at
> Sun, I'm also interested in the overall performance of systems. I
> suppose it might be difficult to find a workload where this will hurt.
>
> I'm not convinced the design and code for adding better LPOOB tuning
> for UltraSPARC I & II is all that large, particularly compared to all
> the work you and others have put into ksh93, but certainly, it's
> outside the scope of the ksh93 project, I agree.
>
> >> I don't think Sun is interested in
> >> investing in this tuning for US-II, as we don't sell many (any?) US-II
> >> systems these days. But it would be an interesting community project
> >> if anyone is interested.
>
> Roland> What about applying the Niagara1 defaults for UltraSPARC-1/2
> Roland> CPUs, too ?
>
> That was exactly my thought for the first round of tuning for
> UltraSPARC 1&2.
>
> I've opened an RFE, CR 6569725:
>
> 6569725 Add better large page out of box support for UltraSPARC I and II
>
> As I said, I doubt Sun management will want to invest in this area due
> to low return on investment for Sun, but certainly someone else in the
> OpenSolaris community could take it on, or perhaps some Sun employee
> in his spare time.
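
For anyone following the alignment point in the thread (stray 8K pages
when the heap does not start on a 64K boundary), here is a minimal
sketch of the arithmetic in Python. It is not taken from the kernel or
from ksh93: the function names and the example addresses are made up
for illustration, and it assumes a simplified model in which everything
between the heap start and the next 64K boundary stays on 8K pages. The
second helper only checks the unit conversion in the vm_dep.c comment
quoted above (131072 pages of 8K each).

```python
BASE_PAGE = 8 * 1024     # 8K base page size, as discussed in the thread
LARGE_PAGE = 64 * 1024   # 64K page size requested via -xpagesize_heap=64K


def stray_base_pages(heap_start, base_page=BASE_PAGE, large_page=LARGE_PAGE):
    """Count base pages between heap_start and the next large-page boundary.

    Hypothetical helper (not a Solaris API): assumes heap_start is aligned
    to the base page size and that nothing below the first large-page
    boundary can be mapped with a large page.
    """
    if heap_start % base_page != 0:
        raise ValueError("heap_start must be aligned to the base page size")
    gap = -heap_start % large_page  # bytes up to the next large-page boundary
    return gap // base_page


def pages_to_bytes(page_count, page_size=BASE_PAGE):
    """Convert a page count into bytes (hypothetical helper)."""
    return page_count * page_size


if __name__ == "__main__":
    # Made-up address: BSS ends 0x6000 bytes past a 64K boundary.
    print(stray_base_pages(0x10006000))          # 5
    # Made-up address: heap already starts on a 64K boundary.
    print(stray_base_pages(0x10010000))          # 0
    # 131072 pages of 8K each is exactly 1 GB.
    print(pages_to_bytes(131072) == 1024 ** 3)   # True
```

In this model the worst case is seven stray 8K pages (a heap that starts
one 8K page past a 64K boundary), which is consistent with treating them
as negligible unless they sit on a hot path.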