On Sun, Apr 22, 2012 at 7:25 AM, Jonas Bonn <jo...@southpole.se> wrote:
>
> On Sat, 2012-04-21 at 16:18 +0100, Julius Baxter wrote:
>> On Sat, Apr 21, 2012 at 4:03 PM, Matthew Hicks <firefal...@gmail.com> wrote:
>> >
>> > Agreed as it impacts other stuff least. I also think that CBFRI
>> > should be required as well.
>>
>> Hmmm, I was going to say that the functionality of the invalidate
>> block ensures that dirty data is written back (flushed) before
>> invalidating, but I've just checked and it says in 9.2.3:
>>
>> Modified data cache block is invalidated in all processors.
>>
>> So invalidate doesn't flush it :-S (kinda obvious from the name, though!)
>>
>
> Invalidate is primarily interesting when somebody other than the
> processor has written to memory (DMA) and you want to be sure that CPU
> actually accesses the data in memory and not some stale value in cache.
> It would be fatal if invalidate implied a write-back as that would
> potentially be destroying the data that's there...
>
>> In this case, I'd say data cache block flush is mandatory too.
>
> I'd say these functions _could_ be optional, for a very limited use
> case. On a uniprocessor system without any devices doing DMA, your
> cache will always be coherent; other than at reset, you'd never need to
> invalidate the cache and you'd never need to flush it at all. So if
> there are any savings to be had by skipping those registers, this is
> where it is.
>
> The question is: what are the savings? Is it worth it?

Hi Jonas,

I just ran synthesis on the OR1200 in the Xilinx ML501 design (Virtex 5
FPGA) and found that without write-back (write through only, no 'flush'
functionality, only invalidate) the or1200 DC module had these sizes as
reported by the Xilinx map tool:

                     |  Slices   | Slice Reg |  LUTs   |
| ++or1200_dc_top    |  84/138   |   0/45    | 109/247 |
| +++or1200_dc_fsm   |  54/54    |  45/45    | 138/138 |

... and with write-back (so ability to 'flush' registers, also always
write to bus when storing):

                     |  Slices   | Slice Reg |  LUTs   |
| ++or1200_dc_top    |  85/171   |   0/46    | 108/292 |
| +++or1200_dc_fsm   |  86/86    |  46/46    | 184/184 |

So almost 20% more LUTs with write-back (more control logic, only 1
more register required).

To give you a feeling of overall logic use, the OR1200 in that config
had FPU, HW integer multiplier/divider, full size caches, MMUs and its
overall size is:

                     |  Slices   | Slice Reg |  LUTs   |
| +or1200_top0       |  0/3337   |  0/2571   | 0/7272  |

So in terms of overall size, you're looking at (171-138) 33 slices or
(292-247) 45 LUTs more for write-back, so < 1% increase, but that's on
a full implementation. On a smaller implementation it'd be a larger
increase percentage-wise, but still probably not too major a change. On
a Spartan implementation, though, you might still choose to go without
on area-saving grounds.

Cheers

Julius
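
P.S. In case anyone wants to redo the percentages with their own map report, here is a rough Python sketch of the arithmetic above. The numbers are copied from the three tables. I'm assuming the second figure in each x/y pair is the total including submodules, which is how the sums work out (e.g. 84 + 54 = 138). The names `DC_WRITE_THROUGH`, `DC_WRITE_BACK`, `OR1200_TOP` and `area_deltas` are just made up for this example, and the top-level total is only used as a rough denominator.

```python
# Totals (own + submodules) for or1200_dc_top, taken from the map report above.
DC_WRITE_THROUGH = {"slices": 138, "slice_regs": 45, "luts": 247}
DC_WRITE_BACK = {"slices": 171, "slice_regs": 46, "luts": 292}
# Whole-core totals for or1200_top0 (FPU, mul/div, full size caches, MMUs).
OR1200_TOP = {"slices": 3337, "slice_regs": 2571, "luts": 7272}


def area_deltas(base, variant, whole):
    """Return {resource: (extra, pct_of_base, pct_of_whole)}."""
    result = {}
    for key in base:
        extra = variant[key] - base[key]
        result[key] = (
            extra,
            100.0 * extra / base[key],   # growth relative to the DC module
            100.0 * extra / whole[key],  # growth relative to the whole core
        )
    return result


if __name__ == "__main__":
    deltas = area_deltas(DC_WRITE_THROUGH, DC_WRITE_BACK, OR1200_TOP)
    for name, (extra, pct_dc, pct_top) in deltas.items():
        print(f"{name:10s} +{extra:3d}  {pct_dc:5.1f}% of DC  "
              f"{pct_top:4.2f}% of or1200_top0")
```

For a smaller core, swap in a smaller `OR1200_TOP` and the last column grows accordingly, which is the point about area-constrained parts.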