"dsc" == David Comay <David.Comay at Sun.COM> writes: dsc> Here are my comments for round "three":
dsc> usr/src/cmd/ksh/Makefile.com

dsc> Lines 101-109 - As I indicated in an earlier review, I don't
dsc> believe this is necessary. Both Nevada and the Solaris 10
dsc> patch gate do large pages automatically (or so-called out of
dsc> the box) and so including these options is unnecessary.
dsc> However, I've cc'ed Bart Smaalders who is an expert in this
dsc> area who can suggest whether or not it makes sense to include
dsc> this.

Just to be clear, this issue is about using -xpagesize_heap=64K and
-xpagesize_stack=64K on SPARC. Let me just paste the lines from the
file here:

    101 # Use 64k pages on SPARC (32bit+64bit ; based on benchmarking on an Ultra5
    102 # and a Blade1000 this is the optimum for small/medium-sized datasets (512k
    103 # pages are not available on Niagara CPUs and 4M pages are far too large)).
    104 # (Note that the stack should always be mapped with 64k pages (or better),
    105 # heap is optional. Both heap and stack should use the same stacksize since
    106 # some MMU types cannot handle more than one largepage size efficiently)
    107 sparc_CFLAGS += -_cc=-xpagesize_stack=64K -_cc=-xpagesize_heap=64K
    108 sparcv9_CFLAGS += -_cc=-xpagesize_stack=64K -_cc=-xpagesize_heap=64K

This isn't too bad as far as it goes, but there are some potential
issues.

First, as David mentioned, in S10U1 and in Nevada for about 2 years
now, Solaris has used large pages "out of the box" (LPOOB), i.e. the
kernel picks large pages for application heap, stack and other
anonymous memory based on the system's TLB architecture. I was one of
the developers of this code and integrated it into Solaris. This
initial code was tuned for UltraSPARC-III/III+/IV/IV+ for sun4u and T1
(Niagara 1) for sun4v. The code has tunables that allow you to tune it
differently, but I've not heard of anyone who has done so. More
recently, more work was done for LPOOB, but I'm not as familiar with
that code.

Since we tuned the code for the US-III family for sun4u, we typically
used 8K and 4M pages to take advantage of the 512-entry, single page
size TLB on US-III and not remap the segments very often. For IV+, we
could also use 32M or 256M, but I believe their use was fairly
limited. This works on UltraSPARC-II, but can't really take advantage
of the data TLB there, which, if I remember correctly, can handle
different page sizes at the same time, but is only 64 entries.
UltraSPARC-T1 has a similar TLB.

For sun4v, we tuned LPOOB to move to the next larger page size as soon
as possible, in order to use as few TLB entries as possible and try to
avoid TLB misses and TLB thrashing. T1 understands translation storage
buffers (TSBs) in either hardware or the hypervisor (I don't remember
which) too, so TLB misses are a little cheaper, since they don't have
to trap into Solaris. But Solaris also has to handle TSB misses.

Another issue is that you may still have some number of stray 8K pages
due to alignment issues, e.g. the heap doesn't start on a 64K
boundary. It's possible you take care of that with a mapfile, but I
haven't reviewed all the Makefiles to see if that's the case.

Finally, you've locked yourself into 64K pages, and it's possible the
kernel will continue to be improved in the selection of large pages.
These improvements may not work if you've already selected 64K pages,
or perhaps the selection of 64K pages will interfere with good
performance.

I'd suggest you might look at tuning the LPOOB mechanism to handle
UltraSPARC-II better, since that's one of the platforms you care
about. If you tuned LPOOB on UltraSPARC-II to force the use of 64K
pages as soon as the heap and stack were 64K in size or larger, much
like is done on sun4v, all programs that have stacks or heaps of that
size or larger would benefit, and possibly the whole system would
benefit, as it would have fewer TLB misses.

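To put rough numbers on the TLB and alignment points above, here is a
small back-of-the-envelope sketch. It is only arithmetic, not anything
taken from the kernel: the function names are made up for illustration,
the entry counts are the ones quoted in this mail (the 64-entry figure
for UltraSPARC-II is from memory), and the heap start address is a
hypothetical value.

```python
KB = 1024
MB = 1024 * KB


def tlb_reach(entries, page_size):
    """Bytes of address space a TLB can map at once if every entry
    holds a page of the same size."""
    return entries * page_size


def stray_small_pages(start, large=64 * KB, small=8 * KB):
    """Number of small pages needed between `start` and the next
    large-page boundary. Assumes `start` is aligned to `small`."""
    if start % small != 0:
        raise ValueError("start must be aligned to the small page size")
    gap = (-start) % large  # bytes up to the next large-page boundary
    return gap // small


if __name__ == "__main__":
    # Entry counts as quoted in this mail, not checked against CPU manuals.
    print(tlb_reach(64, 8 * KB) // KB, "KB")    # 64 entries of 8K pages
    print(tlb_reach(64, 64 * KB) // MB, "MB")   # 64 entries of 64K pages
    print(tlb_reach(512, 8 * KB) // MB, "MB")   # 512 entries of 8K pages

    # Hypothetical heap start that sits 8K past a 64K boundary.
    print(stray_small_pages(0x12000), "stray 8K pages")
```

With 64 entries, going from 8K to 64K pages takes the reach from 512K
to 4M, which is why forcing 64K pages helps on a small TLB. The last
call shows how a heap that starts just past a 64K boundary still needs
a run of 8K pages before the first 64K page can be used.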
I don't think Sun is interested in investing in this tuning for US-II,
as we don't sell many (any?) US-II systems these days. But it would be
an interesting community project if anyone is interested.

-- 
Dave Marquardt
Sun Microsystems, Inc.
Austin, TX
+1 512 401-1077 (SUN internal: x64077)