"Eric W. Biederman" wrote:
> Ronald G Minnich <[EMAIL PROTECTED]> writes:
>
> > On Thu, 8 Mar 2001 [EMAIL PROTECTED] wrote:
> >
> > > Wouldn't memset(0, 0, size_of_memory) be faster since you skip reading
> > > memory?
> >
> > no, that's the key thing that Eric pointed out. You don't really skip
> > reading it, I don't think, since the cache hardware will read the cache
> > line before you write to it. Getting around that is probably tough, since
> > the cache hardware should do that.
>
> Actually the cache hardware normally does skip this, but
> occasionally its slow path gets triggered, so you can't count on it.
>
> However, if you hack the MTRRs so that the memory is mapped
> write-combining you can write at full speed. And back to back writes

From the Alpha Architecture Handbook v4:

Description: The WH64 instruction provides a hint that the current contents of
the aligned 64-byte block containing the addressed byte will *never be read
again but will be overwritten in the near future*. The processor may allocate
cache resources to hold the block without reading its previous contents from
memory; the contents of the block may be set to any value that does not
introduce a security hole, as described in Section 1.6.3.

The WH64 instruction does not generate exceptions; if it encounters data
address translation errors (access violation, translation not valid, and so
forth), it is treated as a NOP. If the address maps to non-memory-like (I/O)
space, WH64 is treated as a NOP.

Software Note: This instruction is a performance hint that should be used when
writing a large continuous region of memory.
The intended code sequence consists of one WH64 instruction followed by eight
quadword stores for each aligned 64-byte region to be written. Sometimes, the
UNPREDICTABLE data will exactly match some or all of the previous contents of
the addressed block of memory.

Perhaps Intel PIII's SSE has similar functionality?
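
For what it's worth, SSE does have non-temporal ("streaming") stores such as
MOVNTPS. As far as I understand them they are not the same thing as WH64:
rather than allocating the line in the cache without reading it, they are meant
to write around the cache, so the old contents should not need to be fetched
either. A minimal sketch of a clear loop built on them, assuming GCC or clang
with <xmmintrin.h>; the function name clear_nontemporal is made up for this
example, and it simply falls back to memset() when SSE is not available:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE__)
#include <xmmintrin.h>  /* _mm_setzero_ps, _mm_stream_ps, _mm_sfence */
#endif

/*
 * Zero `len` bytes starting at `dst` (hypothetical helper).
 * Assumes dst is 16-byte aligned and len is a multiple of 16.
 */
static void clear_nontemporal(void *dst, size_t len)
{
    assert(((uintptr_t)dst % 16) == 0);
    assert((len % 16) == 0);

#if defined(__SSE__)
    __m128 zero = _mm_setzero_ps();     /* all-zero bit pattern */
    float *p = (float *)dst;

    /* One MOVNTPS per 16 bytes (4 floats). */
    for (size_t i = 0; i < len / sizeof(float); i += 4)
        _mm_stream_ps(p + i, zero);

    /* Make the streaming stores globally visible before returning. */
    _mm_sfence();
#else
    /* No SSE on this target: plain cached writes. */
    memset(dst, 0, len);
#endif
}
```

Whether this actually beats a plain memset() on a given chipset is something I
would measure rather than assume.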
>
> are faster than reading with SDRAM (it's the way the protocol is
> implemented). So you should be able to get about 600MB/s with PC100
> SDRAM.
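
To put the 600MB/s figure in context: PC100 SDRAM is a 64-bit (8-byte) bus at
100 MHz, so the theoretical peak is 800MB/s and 600MB/s is 75% of that. A small
sketch of the arithmetic; the helper names and the 512MB memory size used in
the note below are just assumptions for illustration:

```c
#include <assert.h>

/* Peak bus bandwidth in MB/s: clock in MHz times bus width in bytes. */
static double peak_mb_per_s(double clock_mhz, unsigned bus_bytes)
{
    return clock_mhz * bus_bytes;
}

/* Seconds needed to write size_mb megabytes at rate_mb_s MB/s. */
static double clear_seconds(double size_mb, double rate_mb_s)
{
    assert(rate_mb_s > 0.0);
    return size_mb / rate_mb_s;
}
```

With these, peak_mb_per_s(100.0, 8) gives 800, and clear_seconds(512.0, 600.0)
comes out at roughly 0.85 seconds for a hypothetical 512MB machine.
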
>
> However, if we can detect a reset it is probably o.k. to skip the RAM
> initialization altogether (because we have done it already).
>
> The hard question is how do we tell things that only work most of
> the time from techniques that work all of the time.
>
> Eric