On Thu, 2010-12-30 at 13:05 +0100, Sebastien Bourdeauducq wrote:
> On Thu, 2010-12-30 at 19:57 +0800, haimag ren wrote:
> > I modified your AVR-compatible soft-core (softusb_navre.v) to speed up
> > branch instructions by decoding the next PC directly.
> 
> The wait state that you removed (STALL) is here to flush the first stage
> of the pipeline (instruction fetch) on branches, to avoid control
> hazards. It should be kept.

Ah, well, I didn't see that you modified the PC decoder as well. So you
basically turned the 2-stage pipeline into a 1-stage pipeline, which
indeed does not need this wait state. The problem with this approach is
that typical FPGA block RAMs, especially when made big (> a few kB) by
combining several BRAM elements, have long setup and clock-to-output
times. I implemented the 2-stage pipeline to avoid placing a complex
combinatorial path between the output (data) and the input (address) of
the instruction RAM. The result is that the clock frequency can be
higher, at the expense of having to wait one cycle on every jump
instruction (I haven't measured the difference, but it would be
interesting to get some figures). Btw, the original AVR went for this
trade-off as well, probably for similar reasons (embedded flash memory
is slow).
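
To make the trade-off concrete, here is a rough back-of-the-envelope
model in Python. It is only a sketch under simplifying assumptions: every
instruction is counted as one cycle, the 2-stage pipeline adds exactly one
wait cycle per jump, and the function names and any numbers fed into it
are hypothetical, not measurements of softusb_navre.

```python
def exec_time_us(n_instr, jump_fraction, f_mhz, jump_penalty_cycles):
    """Estimated run time in microseconds for n_instr instructions.

    Assumes one cycle per instruction plus jump_penalty_cycles extra
    cycles for the fraction of instructions that are jumps.
    """
    cycles = n_instr * (1 + jump_fraction * jump_penalty_cycles)
    return cycles / f_mhz


def break_even_clock_ratio(jump_fraction, jump_penalty_cycles=1):
    """Ratio f_2stage / f_1stage above which the 2-stage pipeline wins."""
    return 1 + jump_fraction * jump_penalty_cycles


if __name__ == "__main__":
    # Hypothetical inputs: 20% jumps, 1-stage at 50 MHz, 2-stage at 70 MHz.
    t1 = exec_time_us(1000, 0.2, f_mhz=50, jump_penalty_cycles=0)
    t2 = exec_time_us(1000, 0.2, f_mhz=70, jump_penalty_cycles=1)
    print("1-stage:", t1, "us")
    print("2-stage:", t2, "us")
    print("break-even clock ratio:", break_even_clock_ratio(0.2))
```

Under this model the 2-stage design comes out ahead whenever the clock
speed-up it allows exceeds 1 + (fraction of jump instructions); actual
figures would have to come from synthesis results and real firmware.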

S.


_______________________________________________
http://lists.milkymist.org/listinfo.cgi/devel-milkymist.org
IRC: #milkym...@freenode
Twitter: www.twitter.com/milkymistvj
Ideas? http://milkymist.uservoice.com
