Hi Michael,

As one of the Xilinx timing closure documents helpfully articulates, it's
very difficult to give specific recipes for solving timing problems, other
than reading the timing report, looking at things in PlanAhead/FPGA Editor,
iterating compiles and developing some intuition.

That said, there are a few things you could try, of varying levels of
complexity:

Since you have quite a lot of channels in your design, I suspect your vacc
problems are related to the size of the bram buffer (especially if your
samples are wide). Keep in mind that a bram in a Virtex6 is a 36 bit wide x
1k deep memory block, so building a vacc to accommodate many thousands of
channels means banking together lots of brams -- this is liable to cause
fanout issues on your address/control signals. The same is probably true of
the large delays in the PFB blocks, though I'm not sure how these are
implemented in the various block versions which exist in mlib_devel.
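To put rough numbers on the banking, here is a minimal Python sketch. It
assumes every bram is used in the 36-bit x 1k-deep configuration mentioned
above and ignores any other aspect ratio the tools might pick. The function
name and the example depths/widths are purely illustrative and are not
taken from your design.

```python
import math

BRAM_WIDTH_BITS = 36     # assumed bram data width (configuration quoted above)
BRAM_DEPTH_WORDS = 1024  # assumed bram depth in that configuration


def brams_needed(depth_words, width_bits):
    """Estimate how many brams get banked together for one buffer."""
    wide = math.ceil(width_bits / BRAM_WIDTH_BITS)    # brams side by side
    deep = math.ceil(depth_words / BRAM_DEPTH_WORDS)  # brams stacked in depth
    return wide * deep


# Hypothetical buffers: (depth in words, word width in bits)
for depth, width in [(4096, 32), (8192, 32), (16384, 64)]:
    print(depth, width, brams_needed(depth, width))
```

Every bram counted here is driven from the same address and control logic,
so the count is a rough proxy for the fanout on those signals.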
Setting brams to optimize for speed and giving them at least 3 cycles of
latency may help. You could also try manually controlling the bram control
signals -- I believe the casper_library_bus/bus_single_port_ram block will
do some of this for you if you set it to optimize for speed and include some
fanout latency. Replacing the bram delay in the vacc with one of these
(plus the associated address counter logic to make the block act as a
delay) may help. Be careful to get your latencies right so the block still
functions properly.
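The latency bookkeeping can be checked with a small behavioural model
before touching the Simulink design. This is a sketch in Python, not the
CASPER block itself: it assumes a read-before-write RAM whose read latency
is modelled as extra output registers, and `ram_delay` and its parameters
are hypothetical names.

```python
from collections import deque


def ram_delay(samples, total_delay, ram_latency):
    """Delay `samples` by `total_delay` cycles using a modelled RAM + counter."""
    # The counter period must shrink by the RAM latency, otherwise the
    # block delays by more than intended.
    period = total_delay - ram_latency
    if period < 1:
        raise ValueError("total_delay must exceed ram_latency")
    mem = [0] * period
    pipe = deque([0] * ram_latency)  # models the RAM's output registers
    out = []
    addr = 0
    for x in samples:
        read = mem[addr]             # read-before-write: old contents come out
        mem[addr] = x                # then the new sample is stored
        addr = (addr + 1) % period   # free-running address counter
        pipe.append(read)
        out.append(pipe.popleft())
    return out


samples = list(range(1, 21))
delayed = ram_delay(samples, total_delay=8, ram_latency=3)
print(delayed)
```

In this model the total delay is the counter period plus the RAM latency,
so raising the bram latency from 1 to 3 means shortening the counter by 2
to keep the accumulator aligned.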
Implementing the adder in a DSP slice (if you haven't already) might also
help.

A more involved option is to manually constrain the placement of
components. Setting up pblocks in PlanAhead to place major parts of your
design might free up the compiler to do a little better with the remainder
of the logic. Adding pblocks for the PFB and the various FFT stages can be
very helpful in this regard. Or perhaps just constraining the vacc that's
causing you problems might be enough.

Once you have some constraints from PlanAhead which work, you can
auto-include them in your Simulink compiles using various methods, my
favourite being the 'UCF' yellow block, which is in the current
casper-astro mlib_devel.
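For reference, the pblock constraints PlanAhead saves into a UCF look
roughly like the text printed by the sketch below. The area group name,
instance pattern and site ranges are all hypothetical placeholders; in
practice you would copy the lines PlanAhead writes for your own floorplan
rather than generate them by hand.

```python
def pblock_ucf(name, instance, ranges):
    """Return UCF lines assigning `instance` to an area group with `ranges`."""
    lines = ['INST "%s" AREA_GROUP = "%s";' % (instance, name)]
    for site_range in ranges:
        lines.append('AREA_GROUP "%s" RANGE = %s;' % (name, site_range))
    return "\n".join(lines)


print(pblock_ucf(
    "pblock_vacc",                 # hypothetical area group name
    "my_design_x0/vacc/*",         # hypothetical instance pattern
    ["SLICE_X0Y0:SLICE_X39Y79",    # hypothetical slice range
     "RAMB36_X0Y0:RAMB36_X1Y15"],  # hypothetical bram range
))
```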

In general, I suspect that you would be well served by trying to reduce
the bram use in your model; the resource utilisation report should either
confirm or contest that this is the case. You could do this by using
alternatives to brams where blocks allow it (for example the FFT will allow
coefficients to be generated using DSPs in some cases), by reducing the
number of FIR taps, or by implementing something using QDR rather than bram
if appropriate.

The unfortunate truth is that some, none or all of these suggestions may
help, or might make things worse. I would say that in my experience
over-liberal use of pipeline/register stages throughout a design can
often do more harm than good, and a sensibly placed pblock can result in
incredible timing improvements.

Hope that helps. Please email back with any more info and/or updates -- I
suspect that if you solve this problem, documenting your method may prove
very useful for others encountering similar problems.

Cheers, and good luck!
Jack

On 30 August 2015 at 10:42, Michael D'Cruze <
michael.dcr...@postgrad.manchester.ac.uk> wrote:
>
> Hi everyone,
>
>
>
> I’m having quite a lot of problems getting a particular design to
> compile, XPS consistently reporting timing errors. The design is a
> wideband spectrometer, somewhat similar to the tutorial 3 design. I’m
> running an iADC at 1024 MHz, and the FPGA at 256 MHz. It is a
> two-polarisation design, with a 64k-point PFB and FFT (for 32k channels)
> in each polarisation. I am running the PFB and FFT blocks in
> “two-polarisation” mode, so there are just one of each block. I am using
> the Vacc block from the xBlocks repository.
>
>
>
> I should add at this point that the same design with 16k channels
> compiles successfully at this speed, and that the 32k channel design
> compiles at slower speeds (e.g. 200 MHz FPGA).
>
>
>
> The timing reports suggest negative slack in the PFB and Vacc blocks.
> I’ve tried adding various combinations of latency in each, however no
> permutations have resulted in substantial improvements. The overwhelming
> majority of the errors are reported by the Vacc block. I have also tried
> replacing the xBlocks Vacc block with the wide_bram_vacc block in 64-bit
> mode, however this results in even more timing errors (and without the
> option to adjust latencies).
>
>
>
> I’ve tried various combinations of latencies suggested previously on the
> mail archive, but again nothing has really given any improvement.
>
>
>
> Am I simply pushing such a large design too hard? Is the only option to
> slow it down or is there some strategic method I can adopt to get closer
> to timing closure? Suggestions much appreciated!
>
>
>
> Thanks
>
> Michael
