On Monday 08 October 2007 12:02:22 mickey wrote:
> On Mon, Oct 08, 2007 at 11:53:50AM +0200, Tonnerre LOMBARD wrote:
> > Salut,
> >
> > On Mon, Oct 08, 2007 at 09:44:48AM +0000, mickey wrote:
> > > > > PAE is slow and has hairy paws. I am glad that we have real amd64
> > > > > machines now so we don't need it anymore.
> > >
> > > besides that what do you think amd64 runs? (:
> > > it uses the same pae as i386. and it is not any faster.
> > > learn what are you talking about...
> >
> > No, it uses 48-bit addresses and some flag bits, but it can use a 64-bit
> > selector rather than two 32-bit ones, improving the performance
> > significantly.
>
> format and amount of data is the same.
> it does not matter how many bits are used or not.
> it's about how much larger page tables are and how much longer
> it takes for the tlb reload.
> or what you think loading 36bit physaddr is slower than loading 48bits?
> segments have nothing to do w/ page tables and tlb performance.
> they will be as much slowdown on pae or non-pae page tables.
> get a clue. you are talking about non-related improvements
> and might as well compare this to sparc64 tlb performance...
>
> cu

In legacy mode, i386 supports 4KB and 4MB page sizes and uses 2-level page tables.
In legacy mode, i386 PAE supports 4KB and 2MB page sizes and uses 3-level page tables.
In long mode, amd64 supports 4KB, 2MB and 1GB page sizes and uses 4-level page tables.

i386 PAE and amd64 use the same paging mode.

With the larger page tables it looks as if the page walk slows down, but the
MMU internally does some optimizations that allow shortcuts in the walk
without modifying the pages used for the page tables.

The real speedup comes from support for large pages (4MB/2MB) and the
giga-pages (1GB) newly introduced in Barcelona, since they reduce TLB
reloads and TLB pressure.

Oh, and some off-topic hints that also result in speedups: fine-grained
locking is faster than the biglock, and a better scheduler keeps processes
from jumping between CPU cores or, even more important, between NUMA nodes.

Christoph
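
To make the three layouts listed above concrete, here is a minimal Python sketch. It splits a virtual address into per-level table indices for 4KB pages and estimates how much memory a TLB can cover per page size. The names `MODES`, `split_address` and `tlb_reach` are made up for this illustration, and the 64-entry TLB is a hypothetical size, not a figure for any particular CPU.

```python
# Index widths per page-table level (top level first) for 4 KB pages.
# i386: 10+10+12 bits, PAE: 2+9+9+12 bits, amd64: 9+9+9+9+12 bits.
MODES = {
    "i386":     {"index_bits": (10, 10),     "offset_bits": 12},
    "i386 PAE": {"index_bits": (2, 9, 9),    "offset_bits": 12},
    "amd64":    {"index_bits": (9, 9, 9, 9), "offset_bits": 12},
}


def split_address(vaddr, mode):
    """Return (table indices, top level first) and the page offset."""
    layout = MODES[mode]
    shift = layout["offset_bits"]
    offset = vaddr & ((1 << shift) - 1)
    indices = []
    # Walk from the lowest level upwards, peeling off one index at a time.
    for bits in reversed(layout["index_bits"]):
        indices.append((vaddr >> shift) & ((1 << bits) - 1))
        shift += bits
    return tuple(reversed(indices)), offset


def tlb_reach(entries, page_size):
    """Bytes of address space covered by `entries` TLB entries."""
    return entries * page_size


KB, MB, GB = 1 << 10, 1 << 20, 1 << 30

if __name__ == "__main__":
    for mode in MODES:
        print(mode, len(MODES[mode]["index_bits"]), "levels",
              split_address(0xC0101234, mode))
    # Hypothetical 64-entry TLB: reach grows with the page size.
    for size in (4 * KB, 2 * MB, 1 * GB):
        print(size, "byte pages ->", tlb_reach(64, size), "bytes covered")
```

The number of indices returned equals the number of memory accesses a full walk needs for a 4KB page; a 2MB or 1GB mapping ends the walk one or two levels earlier, and each TLB entry then covers far more memory, which is where the reduced TLB pressure comes from.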