Thanks everyone. I've managed to get a modified tutorial 3 to compile
(with Jack and Dave's help) and will check its performance this
afternoon. For the mailing list record, here are some notes on getting
things running at 250MHz:
* The pfb_fir_real caused timing issues due to its adders and its
convert blocks. I changed the adder latency to 4. Then, after breaking
the mask and init script, I manually went into the adder blocks and
changed the implementation from "Use behavioural HDL" to "Pipeline for
maximum performance" and "implement using DSP48". Similarly, with the
convert blocks in the pfb, make sure "pipeline for maximum performance"
is selected. I've also set these to truncate instead of round.
* The FFT_wideband_real compiles at 250MHz - I used 6 add latency (might
be overkill), 3 mult, 3 BRAM, 3 convert, use less logic, truncate, wrap.
* The round block (under the quant0 block): I pipelined the converts as above
* The adder in the vector accumulators vacc0 and vacc1
* The counter in acc_cntrl I changed to implement using DSP48
* The counter in the pulse extenders also changed to DSP48
In addition to this, I added some pipeline delays in a few places.
The moral of the story is that DSP48s and pipelining are the key to
meeting timing. If you increase delay somewhere make sure you don't
forget to match it in other areas so everything is in sync. The timing
report gives you a good clue as to what's failing, and once you know
where to look it's not too hard to fix. Of course, you'll be using a lot
more DSP48s, so large designs will likely run out.
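
The delay-matching bookkeeping can be sketched in a few lines of Python.
This is only an illustration: the path names and latency numbers are
made up, not taken from the tutorial 3 design.

```python
# Sketch: work out how many extra clock cycles of delay each parallel
# path needs so that all paths line up again after a latency change.
# Path names and latencies are hypothetical.

def matching_delays(path_latencies):
    """Return the extra delay (in clock cycles) each path needs to
    match the total latency of the slowest path."""
    longest = max(path_latencies.values())
    return {name: longest - lat for name, lat in path_latencies.items()}

# Hypothetical example: a data path whose adder latency was raised,
# alongside sync and valid signals that bypass the adders.
paths = {
    "data": 4 + 3,  # e.g. adder latency 4 plus convert latency 3
    "sync": 2,
    "valid": 0,
}

extra = matching_delays(paths)
for name, cycles in sorted(extra.items()):
    print(f"{name}: add {cycles} cycle(s) of delay")
```

Any path that comes out non-zero needs a delay block of that length to
stay in sync with the slowest path.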
Hope that helps!
Cheers
Danny
On 07/09/2010 18:08, David MacMahon wrote:
On Sep 3, 2010, at 8:20, Jack Hickish wrote:
After a compile fails, it's worth checking the timing report in the
compile directory ..../XPS_ROACH_BASE/implementation/system.twr
Whilst a little bit cryptic, the report should at least give you some
idea of which bits of the design are causing timing failure. It
becomes reasonably clear if it's adders in the FIR, or casts in the
FFT for example.
I concur! Blindly adding additional latency will certainly change
things and maybe result in a design that can meet timing, but it can
(and often does) also result in unnecessary additional resource
utilization. Having a better understanding of where timing is failing
will lead to a more targeted solution.
I think it would be useful to track where timing problems occur across
different designs. If this points to a particular library block as
problematic across multiple designs, it would serve as justification
for additional work on that block.
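
A rough sketch of how such tracking could be automated, in Python. It
assumes the report lists each path with a "Slack ...: <value>ns" line
followed by a "Source:" line, which may not hold for every tool
version. The instance names in the sample text are invented for
illustration and are not real report output.

```python
import re
from collections import Counter

# Assumed line shapes; adjust to match the actual report.
SLACK_RE = re.compile(r"^\s*Slack[^:]*:\s*(-?\d+(?:\.\d+)?)ns")
SOURCE_RE = re.compile(r"^\s*Source:\s*(\S+)")

def tally_failing_blocks(report_text, depth=2):
    """Count paths with negative slack, grouped by the first `depth`
    levels of the source instance's hierarchical name."""
    counts = Counter()
    failing = False
    for line in report_text.splitlines():
        m = SLACK_RE.match(line)
        if m:
            failing = float(m.group(1)) < 0
            continue
        m = SOURCE_RE.match(line)
        if m and failing:
            block = "/".join(m.group(1).split("/")[:depth])
            counts[block] += 1
            failing = False
    return counts

# Invented sample text, shaped like the assumed format above.
SAMPLE = """\
Slack (setup path):     -0.412ns
  Source:               tut3_x0/pfb_fir_real/adder_1/reg_a (FF)
Slack (setup path):     -0.105ns
  Source:               tut3_x0/fft_wideband_real/convert_3/reg_a (FF)
Slack (setup path):     0.230ns
  Source:               tut3_x0/vacc0/adder/reg_a (FF)
Slack (setup path):     -0.020ns
  Source:               tut3_x0/pfb_fir_real/convert_2/reg_a (FF)
"""

counts = tally_failing_blocks(SAMPLE)
for block, n in counts.most_common():
    print(f"{block}: {n} failing path(s)")
```

Passing in the text of system.twr from each design and summing the
counters would show whether the same library block keeps turning up.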
Dave