On Sun, Apr 22, 2012 at 7:25 AM, Jonas Bonn <jo...@southpole.se> wrote:
>
> On Sat, 2012-04-21 at 16:18 +0100, Julius Baxter wrote:
>> On Sat, Apr 21, 2012 at 4:03 PM, Matthew Hicks <firefal...@gmail.com> wrote:
>> >
>> > Agreed as it impacts other stuff least.  I also think that CBFRI
>> > should be required as well.
>>
>> Hmmm, I was going to say that the functionality of the invalidate
>> block ensures that dirty data is written back (flushed) before
>> invalidating, but I've just checked and it says in 9.2.3:
>>
>>   Modified data cache block is invalidated in all processors.
>>
>> So invalidate doesn't flush it :-S (kinda obvious from the name, though!)
>>
>
> Invalidate is primarily interesting when somebody other than the
> processor has written to memory (DMA) and you want to be sure that CPU
> actually accesses the data in memory and not some stale value in cache.
> It would be fatal if invalidate implied a write-back as that would
> potentially be destroying the data that's there...
>
>> In this case, I'd say data cache block flush is mandatory too.
>
> I'd say these functions _could_ be optional, for a very limited use
> case.  On a uniprocessor system without any devices doing DMA, your
> cache will always be coherent; other than at reset, you'd never need to
> invalidate the cache and you'd never need to flush it at all.  So if
> there are any savings to be had by skipping those registers, this is
> where it is.
>
> The question is:  what are the savings?  Is it worth it?
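
As an aside, the invalidate-versus-flush distinction above can be put in a toy model. This is only a sketch under assumptions: ToyCache, read_after_dma and dma_over_dirty_line are made-up names, memory is a plain dict, and the choice that flush also drops the line after writing it back is an assumption of the model, not something taken from the architecture spec.

```python
class ToyCache:
    """Toy write-back cache in front of a dict-based memory (illustrative only)."""

    def __init__(self, memory):
        self.memory = memory  # backing store: addr -> value
        self.lines = {}       # cached lines: addr -> [value, dirty]

    def load(self, addr):
        # On a miss, fill the line from memory; on a hit, return the cached copy.
        if addr not in self.lines:
            self.lines[addr] = [self.memory[addr], False]
        return self.lines[addr][0]

    def store(self, addr, value):
        # Write-back behaviour: only the cache is updated, the line becomes dirty.
        self.lines[addr] = [value, True]

    def invalidate(self, addr):
        # Drop the line without writing it back, even if it is dirty.
        self.lines.pop(addr, None)

    def flush(self, addr):
        # Write a dirty line back to memory, then drop it (model assumption).
        line = self.lines.pop(addr, None)
        if line is not None and line[1]:
            self.memory[addr] = line[0]


def read_after_dma(use_invalidate):
    """CPU caches a value, a DMA device then overwrites memory behind its back."""
    memory = {0x100: 1}
    cache = ToyCache(memory)
    cache.load(0x100)        # CPU now holds the old value 1 in cache
    memory[0x100] = 2        # DMA write goes straight to memory
    if use_invalidate:
        cache.invalidate(0x100)
    return cache.load(0x100)


def dma_over_dirty_line(op):
    """CPU has a dirty line, DMA writes newer data to the same address."""
    memory = {0x100: 1}
    cache = ToyCache(memory)
    cache.store(0x100, 3)    # dirty line, memory still holds 1
    memory[0x100] = 4        # DMA data arrives in memory
    getattr(cache, op)(0x100)
    return cache.load(0x100)
```

In this model, read_after_dma(False) returns the stale cached value while read_after_dma(True) returns the DMA data, and dma_over_dirty_line("flush") overwrites the DMA data with the old dirty line whereas dma_over_dirty_line("invalidate") preserves it.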

Hi Jonas

I just ran synthesis on the OR1200 in the Xilinx ML501 design (Virtex
5 FPGA). Without write-back (write-through only, no 'flush'
functionality, only invalidate), the or1200 DC module had these sizes
as reported by the Xilinx map tool:

| Module           | Slices | Slice Reg | LUTs    |
| ++or1200_dc_top  | 84/138 | 0/45      | 109/247 |
| +++or1200_dc_fsm | 54/54  | 45/45     | 138/138 |

... and with write-back (so ability to 'flush' registers, and stores
no longer always write to the bus):

| Module           | Slices | Slice Reg | LUTs    |
| ++or1200_dc_top  | 85/171 | 0/46      | 108/292 |
| +++or1200_dc_fsm | 86/86  | 46/46     | 184/184 |

So roughly 18% more LUTs in the DC module with write-back (247 -> 292;
more control logic, but only 1 more register required).

To give you a feeling of overall logic use, the OR1200 in that config
had FPU, HW integer multiplier/divider, full size caches, MMUs and its
overall size is:

| Module       | Slices | Slice Reg | LUTs   |
| +or1200_top0 | 0/3337 | 0/2571    | 0/7272 |

So in terms of overall size, you're looking at 33 more slices
(171-138) or 45 more LUTs (292-247) for write-back, so a < 1% increase.
That's on a full implementation, though; on a smaller implementation
it'd be a larger increase percentage-wise, but still probably not too
major a change. On a Spartan implementation you might still choose to
go without on area-saving grounds.
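
For anyone who wants to redo the arithmetic, here is a small sketch. It assumes the second number in each X/Y pair from the map report is the module total including submodules (which is how the subtraction above uses it), and that the or1200_top0 totals are a fair baseline for either cache configuration. The dictionary and function names are made up for illustration.

```python
# Module totals (second number of each X/Y pair) taken from the map report.
DC_WRITE_THROUGH = {"slices": 138, "slice_reg": 45, "luts": 247}
DC_WRITE_BACK = {"slices": 171, "slice_reg": 46, "luts": 292}
CORE_TOTAL = {"slices": 3337, "slice_reg": 2571, "luts": 7272}


def write_back_overhead(resource):
    """Return (extra units, % growth of DC module, % of whole core) for one resource."""
    delta = DC_WRITE_BACK[resource] - DC_WRITE_THROUGH[resource]
    pct_of_dc = 100.0 * delta / DC_WRITE_THROUGH[resource]
    pct_of_core = 100.0 * delta / CORE_TOTAL[resource]
    return delta, pct_of_dc, pct_of_core


if __name__ == "__main__":
    for resource in ("slices", "slice_reg", "luts"):
        delta, pct_dc, pct_core = write_back_overhead(resource)
        print(f"{resource:10s} +{delta:3d}  DC module +{pct_dc:.1f}%  core +{pct_core:.2f}%")
```

With these inputs the LUT growth comes out at about 18% of the DC module and well under 1% of the whole core, and the slice growth at just under 1% of the core.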

Cheers

Julius
_______________________________________________
OpenRISC mailing list
OpenRISC@lists.openrisc.net
http://lists.openrisc.net/listinfo/openrisc
