http://everything2.com/index.pl?node_id=1382347
cache line ping-pong
One way of maintaining cache coherence in multiprocessing designs with
CPUs that have local caches is to ensure that single cache lines are
never held by more than one CPU at a time. With write-through caches,
this is easily implemented by having the CPUs invalidate cache lines
on snoop hits.
However, if multiple CPUs are working on the same set of data from
main memory, this can lead to the following scenario:
- CPU #1 reads a cache line from memory.
- CPU #2 reads the same line; CPU #1 snoops the access and
  invalidates its local copy.
- CPU #1 needs the data again and has to re-read the entire
cache line, invalidating the copy in CPU #2 in the process.
- CPU #2 now also re-reads the entire line, invalidating the copy
in CPU #1.
- Lather, rinse, repeat.
The result is a dramatic performance loss because the CPUs keep fetching
the same data over and over again from slow main memory.
Possible solutions include:
- Use a smarter cache coherence protocol, such as MESI.
- Mark the address space in question as cache-inhibited. Most CPUs
will then resort to single-word accesses, which should be faster than
reloading entire cache lines (usually 32 or 64 bytes).
- If the data set is small, make one copy in memory for each CPU.
If the data set is large and processed sequentially, have each CPU
work on a different part of it (one starting at the beginning, one at
the middle, etc.).
(via the linuxkernelnewbies mailing list, forwarded by peter teoh)