On Sun, May 6, 2012 at 9:09 PM, Pekon Gupta <[email protected]> wrote:
>
>> >> > What Julius and I are talking about here is the cache itself
>> >> > containing some init stage (in hw) which would loop through the
>> >> > cache tags and invalidate them on reset.
>> >
>> >
>> > [Pekon]: In the above scenario (H/W-based invalidation of caches at
>> > reset), do you plan to stall the CPU until your H/W loop is complete?
>>
>> A very pertinent question. The answer is that of course you have to.
>>
>> Again, I reiterate that I'm against putting in FSMs to do stuff which
>> is not strictly necessary. It really provides no performance gain
>> (aside from skipping an init loop at reset, a period during which we
>> can live with a bit of delay and which is not the focus of optimisation)
>> and adds complexity.
>
>
> [Pekon]: One piece of feedback here:
>
> [S/W invalidation of cache] In most systems, the CPU clock is very slow
> just after reset, because clocks are usually sourced "directly" from
> some oscillator (crystal or RC). However, once the CPU is out of reset, it
> programs the PLL(s), raises its clock speed severalfold, and then
> invalidates the cache using a S/W loop.
> Example: invalidating 1024 cache lines @ 100 MHz (10 ns period) might take
> 1024 x 10 ns = 10240 ns = 10.24 us
> (assuming back-to-back WRITES with minimum latency, as L1 caches have the
> lowest latency from the CPU)
>
> [H/W invalidation of cache]: As your H/W FSM would run before the CPU boots
> and programs the PLL, it would still be running on the slow oscillator
> clock. Thus, it might take more time to invalidate than S/W.
> Example: invalidating 1024 cache lines @ 25 MHz (40 ns period) might take
> 1024 x 40 ns = 40960 ns = 40.96 us

Good point.
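
For reference, the arithmetic in the quoted examples can be checked with a
small C sketch. It assumes, as the examples do, that one cache line is
invalidated per clock cycle with no other overhead. The function name
invalidate_time_ns is made up for illustration and is not part of any
existing code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Time in nanoseconds to invalidate `lines` cache lines at `clock_hz`.
 * Assumption (from the quoted example): exactly one line is invalidated
 * per clock cycle, back to back, with no extra latency.
 */
static uint64_t invalidate_time_ns(uint64_t lines, uint64_t clock_hz)
{
    assert(clock_hz != 0);                  /* avoid division by zero */
    return lines * 1000000000ULL / clock_hz;
}
```

With these assumptions, invalidate_time_ns(1024, 100000000) gives 10240 ns
for the S/W loop after the PLL is programmed, and
invalidate_time_ns(1024, 25000000) gives 40960 ns for an FSM running on the
slow oscillator clock.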

>
> So, I think it's better if the H/W FSM is controllable by a "self-clearing"
> software bit in the cache registers.
> Example: a write to the bit starts your FSM, and the bit is automatically
> cleared when the cache has been invalidated.
> I think this would be more useful, as:
> (a) S/W has more control over when to initiate invalidation of caches, e.g.
> after programming the PLLs
> (b) It frees the CPU to execute the next instruction in the pipeline while
> the H/W FSM is invalidating caches in parallel.

(b) is probably only true for the data cache. For the instruction cache
you'll immediately be putting a request to it for the next instruction
(and then presumably must wait for the invalidation FSM to complete).
I'm still not convinced this is a feature we want. It's a needless
complication of the hardware implementation for marginal gain.

Cheers

Julius
_______________________________________________
OpenRISC mailing list
[email protected]
http://lists.openrisc.net/listinfo/openrisc
