On Mon, 1 Jan 2001, George Woltman wrote:

> Now the bad news, when I use the prime95 memory layout where the 8 input
> values come from 8 different cache lines and the modified values are
> output to the same cache lines (an in-place FFT), the P4 code now takes 112
> clocks.
> 
> The cause is the new 64-byte cache line.  The L1 cache is write-back.
> Any changes to the L1 cache are written back to the L2 cache.  On the P3

Two things here. First, doesn't the P4 have 128-byte cache lines? Second,
do you mean write *back* or write *through*? To computer architects,
write-back usually means that a modified cache line does not automatically
update the cache level below it; the lower level only sees the new data
when the line is evicted.
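
To make the distinction concrete, here is a toy model (a sketch only, not
a description of the real P4): it assumes an L1 with unlimited capacity
that is flushed once at the end, and the function name `l2_writes` is
made up for illustration.

```python
def l2_writes(stores, policy):
    """Count writes that reach L2 for a sequence of stores.

    Each store is given as the index of the cache line it hits.
    Toy model: L1 never runs out of room and is flushed once at the end.
    """
    if policy == "write-through":
        # Every store is forwarded to L2 immediately.
        return len(stores)
    if policy == "write-back":
        # A store only marks the L1 line dirty; each dirty line is
        # written to L2 once, when it is finally evicted.
        return len(set(stores))
    raise ValueError(f"unknown policy: {policy}")


# Eight stores that all land in the same cache line.
stores = [0] * 8
print(l2_writes(stores, "write-through"))  # 8
print(l2_writes(stores, "write-back"))     # 1
```

Under this model, repeated stores to one line cost a single L2 write with
write-back but one L2 write per store with write-through.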

> P4 has a 64 byte L1 cache.  The update of the L2 cache is done with two
> writes of 32 bytes each taking 7 clocks.  Thus, the P4 macro takes 8*14=112
> clocks.

Are you sure of this timing? The bus to the L2 is supposed to be 256
bits wide, and is supposed to deliver data once per clock (or maybe every
other clock, as the Coppermine P3 does). That would mean a cache line
write-back takes at least four processor clocks. Or is that
super-speed delivery only supposed to be for reads, with writes being
buffered?
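
The arithmetic behind that estimate can be sketched as below. The bus
width and transfer rates are the "supposed to be" figures from above, not
measurements, and `writeback_clocks` is just an illustrative helper.

```python
def writeback_clocks(line_bytes, bus_bits=256, clocks_per_transfer=1):
    """Clocks needed to push one cache line across the L1-to-L2 bus."""
    bus_bytes = bus_bits // 8
    transfers = -(-line_bytes // bus_bytes)  # ceiling division
    return transfers * clocks_per_transfer


# Both candidate line sizes, at one transfer per clock and one per two clocks.
for line_bytes in (64, 128):
    for clocks_per_transfer in (1, 2):
        clocks = writeback_clocks(line_bytes,
                                  clocks_per_transfer=clocks_per_transfer)
        print(f"{line_bytes}-byte line, {clocks_per_transfer} clock(s) "
              f"per transfer: {clocks} clocks")

# The quoted figure: two 32-byte writes of 7 clocks each per line,
# times 8 lines per macro.
quoted_per_line = 2 * 7
print(f"quoted: {quoted_per_line} clocks per line, "
      f"{8 * quoted_per_line} clocks per macro")
```

Every combination here comes out between 2 and 8 clocks per line, against
the 14 clocks per line implied by the quoted timing.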

> After much thought (coming up with a scheme that does not cause thrashing
> of the 64 TLBs is not easy), I think I've worked out a new memory layout
> for prime95.  This is a mostly-in-place FFT where the outputs of the macros
> are written to consecutive memory locations.  I've now begun the process
> of recoding all of the prime95 building blocks, then I'll code up the
> new memory layout for the P4.  Obviously this will take a good deal of time.
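
A rough sketch of why consecutive outputs should help: count how many
distinct cache lines the 8 outputs of one macro dirty. The 16-byte value
size (one SSE2 register holding two doubles) and the 4096-byte stride are
assumptions for illustration only, not prime95's actual layout.

```python
def lines_touched(addresses, line_bytes=64):
    """Number of distinct cache lines covered by a set of byte addresses."""
    return len({addr // line_bytes for addr in addresses})


VALUE_BYTES = 16  # assumption: one SSE2 register (two doubles) per value
STRIDE = 4096     # assumption: hypothetical spacing of the in-place values

# In-place layout: each of the 8 values lives in a different cache line.
scattered = [i * STRIDE for i in range(8)]
# New layout: the 8 outputs go to consecutive memory locations.
consecutive = [i * VALUE_BYTES for i in range(8)]

print(lines_touched(scattered))    # 8
print(lines_touched(consecutive))  # 2
```

With 64-byte lines, the scattered case dirties 8 lines per macro while the
consecutive case dirties 2, so far fewer line write-backs are needed.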

Since the new L1 cache is tiny and the L2 is very large and almost as
fast, would your life become any easier if you coded for L2 cache latency
instead of L1? Most of the work gets done in a single pass through main
memory, then the "rows" of the transform get processed in a second pass
that can barely tolerate L2 latency. Or is the latency of SSE2
instructions so high that you run out of registers and rely on renaming
already?

jasonp

_________________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers
