Hi John

The FFT is a good place to optimise. The ROACH has plenty of multipliers
compared with the iBOB, so optimise for logic in the FFT unless you are very
short of multipliers and have logic to spare. Complex multiplication can be
done with either 3 multipliers and more adders (typically 5), or 4
multipliers and 2 adders. Optimising for multipliers uses the former
approach and optimising for logic the latter.
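
As a rough illustration of the two approaches, here is a small Python sketch
of the arithmetic. It is not taken from the FFT block: the function names
are made up for this example, and whether the block uses exactly this
3-multiplier factorisation is an assumption.

```python
def cmul_4mult(a, b, c, d):
    """(a + jb) * (c + jd) with 4 multiplies and 2 adds ("optimise for logic")."""
    real = a * c - b * d
    imag = a * d + b * c
    return real, imag


def cmul_3mult(a, b, c, d):
    """Same product with 3 multiplies and 5 adds ("optimise for multipliers").

    Assumed factorisation (the usual Gauss trick); hypothetical helper.
    """
    k1 = c * (a + b)   # ac + bc
    k2 = a * (d - c)   # ad - ac
    k3 = b * (c + d)   # bc + bd
    real = k1 - k3     # ac - bd
    imag = k1 + k2     # ad + bc
    return real, imag
```

Both functions return the same (real, imag) pair for the same inputs; the
second trades one multiplier for three extra adders, which is why it costs
more slices.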

Another saving comes from enabling the "Use DSP48s for adders" option
recently added to the FFT. This uses the ALUs in the ROACH's DSP48E blocks
to implement the adders for the butterfly operation and saves a significant
amount of logic in the FFT (around 1/6 of slices).

A final logic saving in the FFT can be had by choosing the latencies for
various operations judiciously. Increasing latencies generally increases the
amount of logic used but also raises the achievable clock rate, so use the
lowest latencies that still meet timing. I use the following latencies for
the FFT in a design that makes timing at just over 200MHz: BRAM latency 2,
multiplier latency 2, adder latency 1. I have experimented a little with
these, and getting the clock rate above 250MHz would require increasing the
BRAM latency to 3.

Be sure to specify the Virtex5 target architecture as well; the FFT is
constructed to allow various optimisations using this setting. Leave the
"Specify multiplier use..." option disabled.

If the above steps don't help, then consider changing other things
(accumulators etc).

Regards
Andrew

2009/11/11 John Ford <[email protected]>

> OK, so my naive port of guppi from bee2/ibob to roach failed, because I
> ran out of resources.  After Randy and I updated the 5 designs and
> combined them into one design, in my initial try at the port, I told the
> dual 2^12 PFB/FFT blocks to optimize for multipliers.  I promptly ran out
> of slices.  So this time I told it to optimize for logic.  We'll see what
> that does this time...
>
> But it occurs to me also that we have some whopping big vector
> accumulators (using the gavrt_library vector accumulator) that might
> profitably use the QDR memory vacc instead of the on-chip BRAMs, which
> might free up some room in the chip.  Also we have some reordering going
> on to pack up the packets into the format we want.  Might be able to do
> something different with that.
>
> What do you all think?  Any other re-work that I should do when porting
> from the 7.1 bee2/ibob platforms to the 10.1 (actually, 11.3 on Linux,
> since Windows ran out of memory) ROACH platforms?
>
> BTW, svn doesn't seem to have the workshop tutorial #5 in it?
>
> John
>
>
>
>
