This is a known issue on Solaris/x86. Over time, non-relocatable kernel
memory is allocated at physical addresses scattered throughout memory,
which fragments the physical address (PA) space and prevents the
allocation of physically contiguous large pages. The only workaround is
to start your large page application soon after a reboot, before memory
becomes fragmented.
In contrast, the SPARC VM implementation includes a "kernel cage", which
confines kernel memory allocations to a smaller, physically contiguous
region of memory that can grow if needed. Non-kernel allocations outside
the cage can be relocated as needed to create contiguous large pages.
The primary purpose of the kernel cage was to enable memory dynamic
reconfiguration. Since this is not supported on x86 systems, the extra
complexity of the cage was not carried forward to the amd64 VM
implementation, with the unfortunate side effect of reducing large page
availability.

One component of the VM2 project will modify page freelist management to
preserve contiguous large pages when possible, but I don't have any
details on schedule.

- Steve Sistare

Andrew Gallatin wrote On 09/06/07 09:37,:
> Are there any tuning parameters which will increase the chances that
> applications will be able to use large pages on amd64 anywhere near as
> well as on sparc? Or is the x86/amd64 large page implementation just
> not as mature?
>
> I have noticed that using a fairly recent open solaris (actually
> NexentaOS), on an amd64 with 512MB, pmap reports that my application
> gets very few large pages, often just one. In debugging this, I
> located a sparc and an x86 which had lots (8GB) of memory, and were
> running the same version of the OS. Unfortunately, this version is
> S10 11/06. At any rate, the sparc seems to always be able to allocate
> a majority of its RAM as large pages, where the amd64 can allocate
> only a very small percentage (which seems to decrease the longer the
> machine has been up).
>
> I have written the appended trivial application which does nothing but
> allocate memory, fault it in, and fork off a pmap to see what page
> size(s) it ended up with.
>
> On the sparc I can get 6.5GB worth of large pages
> % head -1 /etc/release
> Solaris 10 11/06 s10s_u3wos_10 SPARC
> % uname -srpiv
> SunOS 5.10 Generic_118833-33 sparc SUNW,Sun-Fire-T200
> % prtconf | head -2
> System Configuration: Sun Microsystems sun4v
> Memory size: 8184 Megabytes
> % vmstat
> kthr memory page disk faults cpu
> r b w swap free re mf pi po fr de sr s1 s2 s3 s4 in sy cs us sy id
> 0 0 0 7095352 7590048 1 23 3 0 0 0 1 0 0 1 0 375 291 209 0 0 100
> % ppgsz -F -o heap=4M ./lgpg 6500 | grep heap
> 0000000100102000    3064K   8K rwx--    [ heap ]
> 0000000100400000 6656000K   4M rwx--    [ heap ]
>
> On the amd64, I can get nowhere near that number of 2MB pages, even
> though the machine is freshly booted.
>
> % head -1 /etc/release
> Solaris 10 11/06 s10x_u3wos_10 X86
> % uname -srpiv
> SunOS 5.10 Generic_125101-08 i386 i86pc
> % prtconf | head -2
> System Configuration: Sun Microsystems i86pc
> Memory size: 8191 Megabytes
> % vmstat
> kthr memory page disk faults cpu
> r b w swap free re mf pi po fr de sr cd f0 s0 -- in sy cs us sy id
> 0 0 0 6064208 6967940 59 7751 55 0 0 0 381 7 0 0 0 830 652 440 0 3 96
> % ppgsz -F -o heap=2M ./lgpg 6500 | grep heap
> 0000000000412000    1976K   4K rw---    [ heap ]
> 0000000000600000  813056K   2M rw---    [ heap ]
> 0000000032000000 5840896K   4K rw---    [ heap ]
> 0000000196800000    2048K   2M rw---    [ heap ]
>
>
> Thanks for any light you all can shed on this,
>
> Drew
>
> #include <stdlib.h>
> #include <stdio.h>
> #include <string.h>
> #include <unistd.h>
>
> void
> usage(char *name)
> {
>         fprintf(stderr, "usage: %s alloc_size_megs", name);
>         exit(1);
> }
>
>
> int
> main(int argc, char **argv)
> {
>         char buf[64];
>         size_t alloc_size;
>         char *ptr;
>         int pid;
>
>
>         pid = (int)getpid();
>         if (argc < 2)
>                 usage(argv[0]);
>
>         alloc_size = atoi(argv[1]);
>         alloc_size *= 1024UL * 1024UL;
>         ptr = malloc(alloc_size);
>         if (ptr == NULL) {
>                 perror("malloc");
>                 exit(1);
>         }
>
>         memset(ptr, 1, alloc_size);
>         sprintf(buf, "pmap -s %d\n", pid);
>         system(buf);
>         return(0);
> }
>
> _______________________________________________
> perf-discuss mailing list
> perf-discuss@opensolaris.org