Hi Sebastien,

> OTOH, while your scheme would probably give good results for the
> instruction cache, the data written to the FIFO by the first process
> would miss the data cache when read by the second process because of the
> mismatched task IDs.

Indeed. For the scenario of piping data through a memory page shared
between producer and consumer, a physically indexed D-cache would
provide the best performance. The data between producer and consumer
could then travel directly through the cache regardless of where the
shared memory is mapped in each process. If I understand correctly, when
using a virtually-indexed physically-tagged cache (or task IDs for that
matter), each physical address would end up in the cache twice (one copy
for each process). Would the cache be designed to associate those cache
lines with each other? How is the consistency between both copies
maintained? I.e., is the consumer's cache line invalidated when the
producer writes data to its respective cache line?
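
To make the aliasing concrete, here is a minimal sketch (the cache
geometry and addresses are made up, not the actual Milkymist
parameters) of how a virtually indexed cache computes its set index,
and how two mappings of the same physical page can end up in
different sets:

  #include <stdio.h>
  #include <stdint.h>

  /* hypothetical geometry, for illustration only */
  #define CACHE_SIZE    (16 * 1024)  /* 16K */
  #define LINE_SIZE     32
  #define ASSOCIATIVITY 1            /* direct-mapped */
  #define NUM_SETS      (CACHE_SIZE / (LINE_SIZE * ASSOCIATIVITY))

  /* set index as computed from the virtual address */
  static unsigned set_index(uintptr_t va)
  {
      return (va / LINE_SIZE) % NUM_SETS;
  }

  int main(void)
  {
      /* same physical page, mapped at different virtual addresses
         by producer and consumer */
      uintptr_t producer_va = 0x10001000;
      uintptr_t consumer_va = 0x20003000;

      /* the index bits above the page offset differ, so the same
         physical data occupies two distinct cache lines */
      printf("producer set: %u\n", set_index(producer_va));
      printf("consumer set: %u\n", set_index(consumer_va));
      return 0;
  }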

> BTW, we are using 4KB I and D caches in the Milkymist SoC, so if we fix
> the page size to 4KB as well we'd avoid aliasing problems entirely (the
> L2 cache would be physically indexed and tagged).

Are you speaking about the MMU page size? In our experience, support
for multiple page sizes makes a huge difference in performance when
using a software-loaded TLB because each TLB miss must be resolved in
software. If the TLB has 64 entries and you pin the page size to 4K, a
massive number of TLB misses will be generated as soon as the working
set exceeds 256K of memory (64 entries * 4K pages). With Genode on
MicroBlaze, we started out with 4K pages only and added support for
the whole range of page sizes at a later stage. The performance (i.e.,
the time needed to boot a simple application scenario) improved by a
factor of 10!
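
To put a number on the effect: the amount of memory a TLB can map
without a single miss (its reach) grows linearly with the page size.
A small sketch, using the 64-entry figure from above and a made-up
set of page sizes:

  #include <stdio.h>

  int main(void)
  {
      const unsigned long tlb_entries = 64;

      /* example page sizes; the actual set depends on the MMU */
      const unsigned long page_sizes_kib[] = { 4, 64, 1024, 16384 };

      /* TLB reach = entries * page size: the working-set size
         beyond which software-handled misses start to hurt */
      for (unsigned i = 0; i < 4; i++)
          printf("page size %5lu KiB -> reach %7lu KiB\n",
                 page_sizes_kib[i], tlb_entries * page_sizes_kib[i]);
      return 0;
  }

With 4K pages the reach is only 256K, matching the numbers above;
with the full range of page sizes available, large mappings no longer
thrash the TLB.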

> To sum up: unless I understood something incorrectly, if we use a
> virtually indexed physically tagged cache with:
>   cache associativity * page size = cache size
> we can happily context switch without taking care of the cache at all
> and without unnecessary cache flushes, cache misses or CPU pipeline
> stages.

How does this hold true when processes communicate via shared memory as
outlined above?
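
For concreteness, the quoted condition can be checked mechanically:
aliasing between two mappings is ruled out exactly when all index
bits fall within the page offset, i.e., when
cache size / associativity <= page size. A minimal sketch (the 4K
figures are the ones quoted above; the 16K case is a made-up
counter-example):

  #include <stdio.h>

  /* a virtually indexed cache cannot alias if all index bits lie
     within the page offset, i.e., iff
     cache_size / associativity <= page_size */
  static int aliasing_free(unsigned long cache_size,
                           unsigned associativity,
                           unsigned long page_size)
  {
      return cache_size / associativity <= page_size;
  }

  int main(void)
  {
      printf("4K cache, direct-mapped, 4K pages: %s\n",
             aliasing_free(4096, 1, 4096) ? "no aliasing"
                                          : "may alias");
      printf("16K cache, 2-way, 4K pages: %s\n",
             aliasing_free(16384, 2, 4096) ? "no aliasing"
                                           : "may alias");
      return 0;
  }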

Cheers
Norman