On Thu, 2010-12-30 at 13:05 +0100, Sebastien Bourdeauducq wrote:
> On Thu, 2010-12-30 at 19:57 +0800, haimag ren wrote:
> > I modified your AVR compatible soft-core(softusb_navre.v) for speed up
> > branch instruction, decode next pc directly.
>
> The wait state that you removed (STALL) is here to flush the first stage
> of the pipeline (instruction fetch) on branches, to avoid control
> hazards. It should be kept.

Ah, well, I didn't see that you had modified the PC decoder as well. So
you basically turned the 2-stage pipeline into a 1-stage pipeline, which
indeed does not need this wait state.

The problem with this approach is that typical FPGA block RAMs,
especially when made big (more than a few kB) by combining several BRAM
elements, have long setup and clock-to-output times. I implemented the
2-stage pipeline to avoid placing a complex combinatorial path between
the output (data) and the input (address) of the instruction RAM. The
result is that the clock frequency can be higher, at the expense of
having to wait one cycle on every jump instruction. I haven't measured
the difference, but it would be interesting to get some figures.

Btw, the original AVR went for this trade-off as well, probably for
similar reasons (embedded flash memory is slow).

S.
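
To make the trade-off above concrete, here is a rough back-of-the-envelope
sketch. It is not based on any measurement of softusb_navre.v: the function
names, the base cycles-per-instruction value, the jump fraction and the
clock frequencies are all hypothetical assumptions chosen only to
illustrate the arithmetic.

```python
# Simple stall model: every jump costs extra wait cycles on top of a base CPI.
# All names and example numbers are illustrative assumptions, not measurements.

def effective_mips(clock_mhz, base_cpi, jump_fraction, jump_penalty_cycles):
    """Millions of instructions per second under the stall model."""
    cpi = base_cpi + jump_fraction * jump_penalty_cycles
    return clock_mhz / cpi


def break_even_clock_ratio(base_cpi, jump_fraction, jump_penalty_cycles):
    """How much faster the stalling design must be clocked to match a
    design with no jump penalty and the same base CPI."""
    return (base_cpi + jump_fraction * jump_penalty_cycles) / base_cpi


if __name__ == "__main__":
    # Hypothetical workload: 20% of executed instructions are jumps,
    # each costing one wait cycle in the 2-stage design.
    two_stage = effective_mips(80.0, 1.0, 0.2, 1)  # assumed 80 MHz
    one_stage = effective_mips(60.0, 1.0, 0.0, 0)  # assumed 60 MHz
    print(f"2-stage: {two_stage:.1f} MIPS, 1-stage: {one_stage:.1f} MIPS")
    print(f"break-even clock ratio: {break_even_clock_ratio(1.0, 0.2, 1):.2f}")
```

Under this model the 2-stage pipeline comes out ahead whenever its
achievable clock exceeds the 1-stage clock by more than the break-even
ratio. Which side wins in practice depends on the real jump frequency of
the firmware and on the timing reports for the target FPGA.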
