On 2012-07-09 00:47, Sebastian Kuzminsky wrote:
> Hi folks, I'm playing around with huge pages on x86-64, and I'm running
> into behavior I'm not understanding.
>
> I'm on Linux 3.2 (Ubuntu Precise), with libhugetlbfs 2.11.
>
> Here's a representative part of my /proc/cpuinfo; in particular note
> that the CPU claims to support both 2 MB and 1 GB pages:
>
>     processor       : 0
>     vendor_id       : GenuineIntel
>     cpu family      : 6
>     model           : 44
>     model name      : Intel(R) Xeon(R) CPU X5680 @ 3.33GHz
>     stepping        : 2
>     microcode       : 0x14
>     cpu MHz         : 1600.000
>     cache size      : 12288 KB
>     physical id     : 0
>     siblings        : 12
>     core id         : 0
>     cpu cores       : 6
>     apicid          : 0
>     initial apicid  : 0
>     fpu             : yes
>     fpu_exception   : yes
>     cpuid level     : 11
>     wp              : yes
>     flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
>                       mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2
>                       ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc
>                       arch_perfmon pebs bts rep_good nopl xtopology
>                       nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor
>                       ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca
>                       sse4_1 sse4_2 popcnt aes lahf_lm ida arat epb dts
>                       tpr_shadow vnmi flexpriority ept vpid
>     bogomips        : 6667.00
>     clflush size    : 64
>     cache_alignment : 64
>     address sizes   : 40 bits physical, 48 bits virtual
>     power management:
>
> I've separately tested this machine with 2 MB pages, and that works very
> well.  But 1 GB pages don't seem to work at all.
>
> I booted the kernel with "hugepagesz=1g hugepages=10
> default_hugepagesz=1g", and got this in my dmesg:
>
>     [ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-3.2.0-26-generic
>         root=UUID=8e6ac55e-4c43-4f55-ae05-1264d442e8d1 ro quiet splash
>         vt.handoff=7 hugepagesz=1g hugepages=10 default_hugepagesz=1g
>     …
>     [ 3.072854] HugeTLB registered 1 GB page size, pre-allocated 10 pages
>
> My /proc/meminfo looks reasonable:
>
>     AnonHugePages:         0 kB
>     HugePages_Total:      10
>     HugePages_Free:       10
>     HugePages_Rsvd:        0
>     HugePages_Surp:        0
>     Hugepagesize:    1048576 kB
>
> So far it all looks right, right?
>
> But when I try to actually *use* 1 GB pages, I get nearly as many TLB
> misses as if I were using 4 KB pages.
>
> I wrote a test program that mallocs a big array of uint32s and then
> reads it in a random order, and I use "perf stat -e dTLB-load-misses"
> to track TLB misses.  (I'd be happy to provide the source code for this
> program, but it's really dead simple.)  The test allocates an aligned
> 2 GB chunk of memory and reads 10 billion random locations.
>
> perf reports (within a few percent) the same number of dTLB load misses
> with 1 GB pages as I get with 4 KB pages.  What gives?  Any ideas?
>
> I noticed this errata document from Intel:
>
> http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/xeon-5600-specification-update.pdf
>
> Specifically notice BD64, which says:
>
>     BD64.  A Page Fault May Not be Generated When the PS bit is set to "1"
>     in a PML4E or PDPTE
>
>     Problem: On processors supporting Intel® 64 architecture, the PS bit
>     (Page Size, bit 7) is reserved in PML4Es and PDPTEs.  If the
>     translation of the linear address of a memory access encounters a
>     PML4E or a PDPTE with PS set to 1, a page fault should occur.  Due to
>     this erratum, PS of such an entry is ignored and no page fault will
>     occur due to its being set.
>
>     Implication: Software may not operate properly if it relies on the
>     processor to deliver page faults when reserved bits are set in
>     paging-structure entries.
>
>     Workaround: Software should not set bit 7 in any PML4E or PDPTE that
>     has Present Bit (Bit 0) set to "1".
>
> Does anyone have experience with this particular CPU?
>
> Are 1 GB pages working and I'm just doing it wrong?
>
> --
> Sebastian Kuzminsky
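
Since the test source wasn't posted, the sketch below is only a guess at
what it looks like, for anyone who wants to reproduce the numbers.  It maps
the 2 GB region with mmap(MAP_HUGETLB) instead of malloc, so it depends on
the default_hugepagesz=1g boot option (a 3.2 kernel has no per-mapping huge
page size flag), and the file name and constants are made up:

    /*
     * tlbtest.c -- hypothetical reconstruction of the kind of test described
     * above.  Maps a 2 GB region backed by the kernel's *default* huge page
     * size (so boot with default_hugepagesz=1g to get 1 GB pages), touches
     * every element, then reads random uint32_t locations.
     *
     * Build:  gcc -O2 -o tlbtest tlbtest.c
     * Run:    perf stat -e dTLB-load-misses ./tlbtest
     */
    #define _GNU_SOURCE
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/mman.h>

    #ifndef MAP_HUGETLB
    #define MAP_HUGETLB 0x40000     /* x86 value; missing from some older headers */
    #endif

    #define REGION_SIZE (2UL * 1024 * 1024 * 1024)      /* 2 GB */
    #define NUM_READS   (10UL * 1000 * 1000 * 1000)     /* 10 billion reads */

    int main(void)
    {
        size_t nelem = REGION_SIZE / sizeof(uint32_t);
        uint64_t i, sum = 0;
        uint32_t *array;

        /* Anonymous mapping backed by the default huge page size. */
        array = mmap(NULL, REGION_SIZE, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
        if (array == MAP_FAILED) {
            perror("mmap(MAP_HUGETLB)");
            return 1;
        }

        /* Touch every element so all the pages are actually populated. */
        for (i = 0; i < nelem; i++)
            array[i] = (uint32_t)i;

        /* Random reads; the running sum keeps the loads from being optimized out. */
        srandom(42);
        for (i = 0; i < NUM_READS; i++)
            sum += array[(uint64_t)random() % nelem];

        printf("sum = %llu\n", (unsigned long long)sum);
        munmap(array, REGION_SIZE);
        return 0;
    }

Running it under "perf stat -e dTLB-load-misses ./tlbtest", and again with
MAP_HUGETLB removed from the mmap() flags, should show a large difference in
miss counts if the 1 GB pages are actually being used.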
I can't help much with checking the actual hardware support, but we are
currently seeing trouble with malloc backed by huge pages: at some point,
glibc started ignoring the __morecore hook, and I have not yet tracked down
how or why.

For testing 1 GB pages, I recommend you try one of the functions in
libhugetlbfs that give direct access to huge pages (see get_huge_pages()).
You can check whether it is using a 1 GB page by reading /proc/meminfo while
the program is running; alternatively, there is a helper in our test suite
that reads and parses /proc/self/smaps to determine whether a given address
is mapped to a huge page.  A rough sketch of that approach follows below.

Eric
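
The sketch below is not the test-suite helper, just a guess at the same
idea: allocate one default-sized huge page with get_huge_pages() and then
scan /proc/self/smaps for the KernelPageSize: field (present since 2.6.29)
of the mapping that contains the returned address.  The file name
ghp-check.c is made up; build with "gcc -O2 -o ghp-check ghp-check.c
-lhugetlbfs".

    #include <stdio.h>
    #include <string.h>
    #include <hugetlbfs.h>

    /* Print the KernelPageSize line for the VMA containing addr. */
    static void show_page_size(void *addr)
    {
        FILE *f = fopen("/proc/self/smaps", "r");
        char line[256];
        unsigned long start, end;
        int in_vma = 0;

        if (!f) {
            perror("fopen /proc/self/smaps");
            return;
        }
        while (fgets(line, sizeof(line), f)) {
            if (sscanf(line, "%lx-%lx ", &start, &end) == 2)
                in_vma = ((unsigned long)addr >= start &&
                          (unsigned long)addr < end);
            else if (in_vma && strncmp(line, "KernelPageSize:", 15) == 0) {
                printf("mapping at %p -> %s", addr, line);
                break;
            }
        }
        fclose(f);
    }

    int main(void)
    {
        long hpage = gethugepagesize();   /* default huge page size */
        size_t len;
        void *p;

        if (hpage <= 0) {
            fprintf(stderr, "no default huge page size configured\n");
            return 1;
        }
        len = (size_t)hpage;              /* must be a multiple of the huge page size */
        p = get_huge_pages(len, GHP_DEFAULT);
        if (!p) {
            fprintf(stderr, "get_huge_pages(%zu) failed\n", len);
            return 1;
        }
        memset(p, 0, len);                /* fault the page in */
        show_page_size(p);
        free_huge_pages(p);
        return 0;
    }

On a machine where the 1 GB pages are really in use, this should print
"KernelPageSize: 1048576 kB", matching the Hugepagesize line in
/proc/meminfo above.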