I've noticed that the Virtex 6 BRAM clock-to-output time is kind of long when 
NOT using the optional output register (Trcko_DO = 2.08 ns) as compared to 
using the optional output register (Trcko_DOA_REG = 0.75 ns).  Using the 
optional output register adds an extra cycle of latency, but the 1.33 ns of 
clock-to-output time it saves could be worth it, especially since the optional 
output register uses (or at least sounds like it uses) dedicated flip-flops in 
the BRAM rather than CLB flip-flops.
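
To put the 1.33 ns figure in context, here is a rough Python sketch of the 
arithmetic.  The two clock-to-output times are the ones quoted above; the 
clock rates in the loop are hypothetical examples chosen only for 
illustration, not measured or recommended values.

```python
# Clock-to-output times quoted above for the Virtex-6 BRAM, in ns.
T_CKO_NO_REG = 2.08    # without the optional output register
T_CKO_WITH_REG = 0.75  # with the optional output register


def saved_ns(t_no_reg=T_CKO_NO_REG, t_with_reg=T_CKO_WITH_REG):
    """Clock-to-output time recovered by enabling the output register."""
    return t_no_reg - t_with_reg


def fraction_of_period(delta_ns, clock_mhz):
    """Express a delay as a fraction of one clock period."""
    period_ns = 1000.0 / clock_mhz
    return delta_ns / period_ns


if __name__ == "__main__":
    delta = saved_ns()
    print(f"saved: {delta:.2f} ns")
    # Hypothetical fabric clock rates, for illustration only.
    for mhz in (200, 250):
        pct = 100.0 * fraction_of_period(delta, mhz)
        print(f"{mhz} MHz: {pct:.1f}% of the clock period")
```

The point is only that the saving is a sizeable fraction of the period at 
clock rates in this range, at the cost of one extra cycle of latency.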

Setting a Xilinx "Single Port RAM" block's latency to 2 will enable use of the 
optional output register.  Unfortunately, the CASPER "Shared BRAM" block does 
not have a latency setting (defaults to 1?) so the tools do not end up using 
the underlying BRAM's optional output register (even if there is a register on 
the Shared BRAM's output that could, in theory, be absorbed into the underlying 
BRAM).

It would be good to add an optional "Latency" parameter to the Shared BRAM 
block and allow the user to select "1" (current value that does not use the 
BRAM's optional output register) or "2" (new value that does use the BRAM's 
optional output register).  I think this would help ROACH2 designs meet timing 
more easily.  I will look at this in more detail to see how involved it would 
be to add this feature.  Once added, I think we'll want to set the default to 
2.  It would also be good to make sure the PPC side of the shared BRAM uses 
the optional output registers.

I think we should also be recommending that regular (i.e. non-yellow) BRAM 
blocks be set to a latency of 2 at a minimum.  Maybe we are already?

This seems like an issue for ROACH as well (though the smaller chip seems 
faster to cross, so maybe it is not as important as on ROACH2?).

Dave

