Hi Andrew.  This was very helpful.  It almost worked. :)  It all fit in
the chip, but it failed to meet timing on the adc0_clk line and the 156
MHz clock line.  I've messed about with it some more in an attempt to make
it meet timing.

John

> Hi John
>
> The FFT is a good place to optimise for various things. The ROACH has
> plenty of multipliers in comparison to the iBOB, so be sure to optimise
> for logic in the FFT unless you are very short of multipliers and have
> logic to spare. Complex multiplication can be done with 3 multipliers and
> a lot of adders, or with 4 multipliers and 2 adders. Optimising for
> multipliers uses the former approach and optimising for logic the latter.
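
[Illustration of the trade-off described above: a minimal Python sketch of the two standard ways to form (a + jb)(c + jd). The function names are made up for this example, and the exact arrangement of adders inside the CASPER FFT block is an assumption; only the 3-multiplier versus 4-multiplier trade-off comes from the text above.]

```python
def cmul_4mult(a, b, c, d):
    """(a + jb)(c + jd) using 4 multiplies and 2 add/subtracts."""
    real = a * c - b * d
    imag = a * d + b * c
    return real, imag


def cmul_3mult(a, b, c, d):
    """Same product using 3 multiplies and 5 add/subtracts."""
    k1 = c * (a + b)   # ac + bc
    k2 = a * (d - c)   # ad - ac
    k3 = b * (c + d)   # bc + bd
    real = k1 - k3     # ac - bd
    imag = k1 + k2     # ad + bc
    return real, imag


if __name__ == "__main__":
    # Both forms give the same result on integer inputs.
    print(cmul_4mult(3, 4, 5, 6))
    print(cmul_3mult(3, 4, 5, 6))
```

[In this form the 3-multiplier version trades one multiplier for three extra adders, which is why it costs more slices when the adders are built from fabric logic.]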
>
> Another place where some optimisation can be found is to enable the "Use
> DSP48s for adders" option recently added to the FFT. This uses the ALUs in
> the DSP48E blocks in the ROACH to implement the adders for the butterfly
> operation and saves a significant amount of logic in the FFT (around 1/6
> of slices).
>
> A final logic saving in the FFT can be had by choosing the latencies for
> various operations judiciously. Increasing latencies generally increases
> the amount of logic used but also increases the possible clock rate of the
> design. I use the following latencies for the FFT in a design that makes
> timing at just over 200 MHz: BRAM latency 2, multiplier latency 2, adder
> latency 1. I have played a little with these, and increasing the clock
> rate to above 250 MHz would require increasing the BRAM latency to 3.
>
> Also be sure to specify the Virtex5 target architecture. The FFT is
> constructed to allow various optimisations using this setting. Also leave
> the "Specify multiplier use..." option disabled.
>
> If the above steps don't help, then consider changing other things
> (accumulators etc).
>
> Regards
> Andrew
>
> 2009/11/11 John Ford <jf...@nrao.edu>
>
>> OK, so my naive port of guppi from bee2/ibob to roach failed, because I
>> ran out of resources.  After Randy and I updated the 5 designs and
>> combined them into one design, in my initial try at the port, I told the
>> dual 2^12 PFB/FFT blocks to optimize for multipliers.  I promptly ran
>> out of slices.  So this time I told it to optimize for logic.  We'll see
>> what that does this time...
>>
>> But it occurs to me also that we have some whopping big vector
>> accumulators (using the gavrt_library vector accumulator) that might
>> profitably use the QDR memory vacc instead of the on-chip BRAMs, which
>> might free up some room in the chip.  Also we have some reordering going
>> on to pack up the packets into the format we want.  Might be able to do
>> something different with that.
>>
>> What do you all think?  Any other re-work that I should do when porting
>> from the 7.1 bee2/ibob platforms to the 10.1 (actually, 11.3 on Linux,
>> since Windows ran out of memory) ROACH platforms?
>>
>> BTW, svn doesn't seem to have the workshop tutorial #5 in it?
>>
>> John
>>
>>
>>
>>
>


