:...
I would also like to add a few more notes regarding write pipelines.
Write pipelines are not used any more, at least not long ones. The
reason is simply the cache coherency issue again. Until the data is
actually written into the L1 cache, it is not coherent: no other
processor can see it. Incoherency is no longer allowed :-). That kind
of creates a problem for the old write pipeline.
Thus, modern CPUs have begun removing their write pipelines in favor
of a synchronous write to the L1 cache. This typically occurs in the
"write-back" phase of the instruction pipeline. It does not cause
a processor to stall.
There are several advantages to this:
* First, since the write is going directly into the L1 cache,
it can partake of the cache coherency protocols.
* Second, since the L1 cache is large compared to the older
write-pipeline scheme, this effectively gives us a
(for all intents and purposes other than a memory copy)
infinitely-sized write pipeline.
* Third, we can throw away all the write pipeline address snooping
junk on the chip, making the chip smaller.
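
To make the contrast concrete, here is a rough toy model in Python. It
is only a sketch under my own assumptions: the class names, the FIFO
depth of 4, and the dictionary standing in for the cache are all made
up for illustration, and no real chip is built this way. It shows why
the old write pipeline needs address snooping on every load, and why a
pending store stays invisible to the coherency protocol until it
drains.

```python
from collections import deque


class WriteBufferCPU:
    """Toy CPU with an old-style write pipeline (a FIFO of pending stores)."""

    def __init__(self, depth=4):  # depth is an arbitrary illustrative value
        self.cache = {}           # stands in for the L1 cache
        self.pending = deque()    # the write pipeline
        self.depth = depth

    def drain_one(self):
        # Retire the oldest pending store into the cache.
        addr, value = self.pending.popleft()
        self.cache[addr] = value

    def store(self, addr, value):
        if len(self.pending) == self.depth:
            self.drain_one()      # pipeline full: oldest store retires first
        self.pending.append((addr, value))

    def load(self, addr):
        # Address snooping: the newest matching pending store must win,
        # otherwise the CPU would read its own stale data.
        for pending_addr, value in reversed(self.pending):
            if pending_addr == addr:
                return value
        return self.cache.get(addr, 0)

    def visible_to_coherency(self, addr):
        # The coherency protocol only ever sees what is in the cache.
        return self.cache.get(addr, 0)


class DirectL1CPU:
    """Toy CPU that writes straight into L1: no FIFO and no snooping logic."""

    def __init__(self):
        self.cache = {}

    def store(self, addr, value):
        self.cache[addr] = value

    def load(self, addr):
        return self.cache.get(addr, 0)

    def visible_to_coherency(self, addr):
        return self.cache.get(addr, 0)
```

With the FIFO model, a store followed by a load of the same address
returns the new value only because of the search in load(), while
visible_to_coherency() still reports the old contents. The direct
model needs no such search.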
The cache coherency protocol may cause side effects when a memory
write is issued. Typically the write is issued into the L1 cache
and requires what is known as a write-through operation into the L2
cache in order to guarantee cache coherency. This is necessary because
it is the L2 cache which is performing the bulk of the bus protocol
to implement the cache coherency, and it needs to know when you've
modified something.
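
The L1-to-L2 write-through described above can be sketched as a toy
Python model. The names L1, L2, write_through() and snoop() are
hypothetical, and cache lines are reduced to single values in a
dictionary, so treat this as an illustration of the idea rather than
of any actual hardware interface.

```python
class L2:
    """Toy L2 cache: the level that talks to the bus."""

    def __init__(self):
        self.lines = {}
        self.modified = set()   # addresses this processor has written

    def write_through(self, addr, value):
        # Called for every L1 write, so the L2 always knows what changed.
        self.lines[addr] = value
        self.modified.add(addr)

    def snoop(self, addr):
        # Answer a bus query: return our value only if we hold the line
        # modified, otherwise None (the bus would look elsewhere).
        if addr in self.modified:
            return self.lines[addr]
        return None


class L1:
    """Toy L1 cache that writes through to its L2 on every store."""

    def __init__(self, l2):
        self.l2 = l2
        self.lines = {}

    def write(self, addr, value):
        self.lines[addr] = value
        self.l2.write_through(addr, value)
```

If the write stayed in L1 only, snoop() would return None and the L2
would answer bus queries as though nothing had been modified.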
"write through" *used* to mean writing through to main memory, but with
the advent of modern cache coherency protocols, the L2 cache *IS* main
memory for all intents and purposes. Dirty data contained in the L2
cache inherently invalidates the associated memory address in main memory
simply by being present in the L2 cache. Another processor attempting
to access that location in main memory will instead obtain the data
directly from our processor's L2 cache.
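
Here is a minimal sketch of that behavior in Python, assuming a
made-up Bus class and one-value cache lines. Real protocols also
change the line's state when another processor reads it; that is left
out here to keep the example short.

```python
class Bus:
    """Toy shared bus connecting main memory and every L2 cache."""

    def __init__(self):
        self.memory = {}
        self.caches = []

    def read(self, requester, addr):
        # Snoop the other caches first: a dirty line overrides main memory.
        for cache in self.caches:
            if cache is not requester and addr in cache.dirty:
                return cache.lines[addr]
        return self.memory.get(addr, 0)

    def invalidate(self, requester, addr):
        # Drop every other cache's copy before the requester writes.
        for cache in self.caches:
            if cache is not requester:
                cache.lines.pop(addr, None)
                cache.dirty.discard(addr)


class L2Cache:
    """Toy per-processor L2 cache attached to the bus."""

    def __init__(self, bus):
        self.bus = bus
        self.lines = {}
        self.dirty = set()
        bus.caches.append(self)

    def write(self, addr, value):
        self.bus.invalidate(self, addr)
        self.lines[addr] = value
        self.dirty.add(addr)      # main memory is now stale for addr

    def read(self, addr):
        if addr not in self.lines:
            self.lines[addr] = self.bus.read(self, addr)
        return self.lines[addr]
```

After one cache writes an address, main memory still holds the old
value, yet a read from the other cache returns the new one because
the bus consults the dirty line first.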
One of the interesting benefits of this is that dirty data in the L2
cache can be flushed to main memory completely asynchronously without
screwing up cache coherency, serialization, or anything else.
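
A small Python sketch of why the flush timing does not matter. The
function names are hypothetical, and the two dictionaries stand in
for main memory and the set of dirty L2 lines. The point is only that
an observer sees the same values before, during, and after the
write-back.

```python
def coherent_read(memory, dirty, addr):
    """Value another processor observes: a dirty L2 copy wins over memory."""
    if addr in dirty:
        return dirty[addr]
    return memory.get(addr, 0)


def flush_one(memory, dirty):
    """Write back one dirty line; return False when nothing is dirty."""
    if not dirty:
        return False
    addr, value = dirty.popitem()
    memory[addr] = value
    return True


def observed_values_during_flush(memory, dirty, addrs):
    """Snapshot what an observer reads before and after each write-back."""
    snapshots = [[coherent_read(memory, dirty, a) for a in addrs]]
    while flush_one(memory, dirty):
        snapshots.append([coherent_read(memory, dirty, a) for a in addrs])
    return snapshots
```

Every snapshot comes out identical no matter when or in what order
the lines are written back, which is what lets the hardware schedule
the flush whenever the bus is free.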
Final note: memory mapped I/O spaces do not apply here. These are usually
marked uncacheable and have nothing to do with the cache. But DMA is
different. DMA works just like any other memory access in that it
can be made to run through the L2 cache coherency protocols. So you
can often safely initiate DMA without having to flush your caches. This
is not true of all processors. I'm not sure about the Pentiums.
-Matt
Matthew Dillon
<[EMAIL PROTECTED]>