On Wednesday, 2 April 2014 at 10:21:50 UTC, Ola Fosheim Grøstad
wrote:
Either way, memory stores/loads generally have at best a 3-cycle latency.

Because the CPU has to check the dirty flag of the L3 cache line in case another core has a dirty L1 line from a store to the same memory?

You don't even come close to L3 in 3 cycles. Propagating a signal
takes time. You end up with two constraints in tension, capacity
and latency: the bigger your cache, the longer the round trip.

That is why we have had 32 KB L1 caches for ages now. Making them
bigger would require increasing the response time, which lowers
performance.
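
If you want to see the size/latency trade-off on your own machine,
a dependent-load ("pointer chasing") loop is the usual trick. Below
is a minimal sketch in C, not a calibrated benchmark. The names
build_cycle, chase and ns_per_load are made up for this post, and it
assumes clock() is precise enough for a rough average. Every load
depends on the previous one, so the time per step approximates the
load-to-use latency of whichever cache level the working set fits in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Fill next[] with one random cycle over 0..n-1 (Sattolo's algorithm),
   so a chase touches every slot before it returns to its start.
   rand() is weak for very large n, but the result is always one cycle. */
static void build_cycle(size_t *next, size_t n)
{
    assert(n > 0);
    for (size_t i = 0; i < n; i++)
        next[i] = i;
    for (size_t i = n - 1; i > 0; i--) {
        size_t j = (size_t)rand() % i;   /* j in [0, i), never i itself */
        size_t tmp = next[i];
        next[i] = next[j];
        next[j] = tmp;
    }
}

/* Follow the chain for `steps` dependent loads; each address comes from
   the previous load, so the loads cannot overlap. */
static size_t chase(const size_t *next, size_t start, size_t steps)
{
    size_t p = start;
    for (size_t s = 0; s < steps; s++)
        p = next[p];
    return p;
}

/* Rough average nanoseconds per dependent load for a working set of
   `bytes` bytes. Returns -1.0 if the buffer cannot be set up. */
static double ns_per_load(size_t bytes, size_t steps)
{
    size_t n = bytes / sizeof(size_t);
    size_t *next = malloc(n * sizeof *next);
    if (next == NULL || n == 0 || steps == 0) {
        free(next);
        return -1.0;
    }
    build_cycle(next, n);

    /* One full lap to warm the caches; volatile keeps the result live. */
    volatile size_t sink = chase(next, 0, n);

    clock_t t0 = clock();
    sink = chase(next, sink, steps);
    clock_t t1 = clock();

    free(next);
    return 1e9 * (double)(t1 - t0) / CLOCKS_PER_SEC / (double)steps;
}
```

Call ns_per_load with a few working-set sizes (say 16 KB, 256 KB and
64 MB) and a large step count, then compare the results. The absolute
numbers depend on the machine and the compiler flags, but the time
per load should climb each time the working set outgrows a cache
level.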
