On Wed, 2011-04-13 at 01:41 +0100, Philip Pemberton wrote:
> > Instead run the TLB lookup
> > at the same time as the cache lookup (i.e. virtually indexed and
> > physically tagged caches). Yes, this requires messing a bit with the
> > CPU pipeline, but let's leave the 30-second-memcpy-of-an-8MB-buffer
> > kind of designs to the OpenRISC and ZPU people.
>
> So what you're talking about is slotting the mapper in between the CPU
> and cache, mapping the logical address into a physical one before the
> cache sees it?
No, both the TLB and the cache are indexed by the virtual address and
operate in parallel (which completely hides the TLB latency). The TLB
and cache hit checks are then likewise both performed in the next
pipeline stage.
            /--> TLB ---> physaddr --\
virtaddr --|                          |--> compare
            \--> cache --> phystag --/
see also:
http://www.cs.princeton.edu/courses/archive/fall04/cos471/lectures/19-VirtualMemory.pdf
> I wonder what effect this would have on the pipeline... a multicycle MMU
> probably wouldn't be possible without causing major problems with the
> CPU timing.
A simple TLB does not need to be multi-cycle. You can also manage it in
software, i.e. the hardware raises a single exception for all TLB misses
and the OS handler takes care of the refill. IMO there is no need for
complex hardware-managed page table walking in system memory, especially
since we have relatively large block RAMs available.
S.
_______________________________________________
http://lists.milkymist.org/listinfo.cgi/devel-milkymist.org
IRC: #milkymist@Freenode
Twitter: www.twitter.com/milkymistvj
Ideas? http://milkymist.uservoice.com