On Monday 08 October 2007 12:02:22 mickey wrote:
> On Mon, Oct 08, 2007 at 11:53:50AM +0200, Tonnerre LOMBARD wrote:
> > Salut,
> >
> > On Mon, Oct 08, 2007 at 09:44:48AM +0000, mickey wrote:
> > > > > PAE is slow and has hairy paws. I am glad that we have real amd64
> > > > > machines now so we don't need it anymore.
> > >
> > > besides that what do you think amd64 runs? (:
> > > it uses the same pae as i386. and it is not any faster.
> > > learn what are you talking about...
> >
> > No, it uses 48-bit addresses and some flag bits, but it can use a 64-bit
> > selector rather than two 32-bit ones, improving the performance
> > significantly.
>
> format and amount of data is the same.
> it does not matter how many bits are used or not.
> it's about how much larger page tables are and how much longer
> it takes for the tlb reload.
> or what you think loading 36bit physaddr is slower than loading 48bits?
> segments have nothing to do w/ page tables and tlb performance.
> they will be as much slowdown on pae or non-pae page tables.
> get a clue. you are talking about non-related improvements
> and might as well compare this to sparc64 tlb performance...
>
> cu

In legacy mode, plain i386 supports 4KB and 4MB page sizes and uses
2-level page tables.
In legacy mode, i386 PAE supports 4KB and 2MB page sizes and uses
3-level page tables.

In long mode, amd64 supports 4KB, 2MB and 1GB page sizes and uses
4-level page tables.
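
To make the level counts above concrete, here is a rough sketch of how a
virtual address splits into per-level table indices with 4KB pages. The index
widths are the architectural ones (10+10 bits for i386, 2+9+9 for PAE,
9+9+9+9 for amd64 with 48-bit addresses); the names PAGING_MODES and
split_vaddr are made up for this illustration and do not come from any
kernel.

```python
# Index widths in bits per page-table level, top level first, for 4KB pages.
# The low 12 bits of the address are the offset inside the page.
PAGING_MODES = {
    "i386": [10, 10],        # 2 levels, 32-bit virtual address
    "i386 PAE": [2, 9, 9],   # 3 levels, 32-bit virtual address
    "amd64": [9, 9, 9, 9],   # 4 levels, 48-bit virtual address
}
PAGE_OFFSET_BITS = 12


def split_vaddr(vaddr, mode):
    """Return ([index per level, top level first], page offset)."""
    widths = PAGING_MODES[mode]
    offset = vaddr & ((1 << PAGE_OFFSET_BITS) - 1)
    rest = vaddr >> PAGE_OFFSET_BITS
    indices = []
    # Peel the indices off from the lowest level upwards.
    for width in reversed(widths):
        indices.append(rest & ((1 << width) - 1))
        rest >>= width
    indices.reverse()
    return indices, offset


if __name__ == "__main__":
    for mode in PAGING_MODES:
        print(mode, split_vaddr(0xC0101234, mode))
```

Every index is one memory reference in an uncached page walk, so the length
of the returned list is the worst-case number of table lookups per TLB miss.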

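i386 PAE and amd64 use the same paging mode: long mode requires PAE to be
enabled and uses the same 64-bit page-table entries.
The deeper page tables look as if they would slow down the page walk, but
the MMU internally does some optimizations that let it skip parts of the
walk without any change to the pages used for the page tables.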

What gives a real speedup is support for large pages (4MB/2MB) and the
newly introduced giga-pages (1GB) in Barcelona, since they
reduce TLB misses and TLB pressure.
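
A back-of-the-envelope sketch of why larger pages take pressure off the TLB:
each entry maps one page, so the memory a TLB can cover is simply entries
times page size. The entry count in the example is an assumed round number,
not the figure for any particular CPU, and the names tlb_reach and
pages_needed are invented for this illustration.

```python
# Page sizes in bytes, as named in this thread.
PAGE_SIZES = {
    "4KB": 4 << 10,
    "2MB": 2 << 20,
    "4MB": 4 << 20,
    "1GB": 1 << 30,
}


def tlb_reach(entries, page_size):
    """Bytes of memory covered by a TLB with `entries` entries."""
    return entries * PAGE_SIZES[page_size]


def pages_needed(working_set_bytes, page_size):
    """Number of pages (and so TLB entries) needed to map a working set."""
    size = PAGE_SIZES[page_size]
    return -(-working_set_bytes // size)  # ceiling division


if __name__ == "__main__":
    entries = 64  # assumption: a hypothetical 64-entry TLB
    for name in PAGE_SIZES:
        print(name, "reach:", tlb_reach(entries, name),
              "pages for 1GB:", pages_needed(1 << 30, name))
```

Mapping a 1GB working set takes 262144 entries with 4KB pages, 512 with 2MB
pages and a single entry with a 1GB page, which is where the reduced TLB
pressure comes from.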

Oh, and some off-topic hints that also result in speedups:
fine-grained locking is faster than the big lock, and so is a better
scheduler that keeps processes from jumping between CPU cores or, even
worse, between NUMA nodes.


Christoph
