Re: [casper] Mystery timing errors

Michael D'Cruze Wed, 02 Sep 2015 03:00:05 -0700

Hi Jack,

Thanks for your suggestions! It’s taken me a little while to work through it 
all.

I did manage to get a 2-pol, 32k channel design to compile eventually. I think 
it was a combination of a) sufficient latency in the Vacc and PFB blocks (as 
you suggest) and b) splitting the two pols into two separate chains, rather 
than using one block for the PFB and FFT in biplex mode. I found the working 
settings by just iterating through latency combinations until the timing 
started to converge, but there were still several hundred errors until I split 
the design into two chains. Now, I don’t really understand why using two 
separate PFB/FFT blocks should make a difference (surely sysgen sorts out such 
detail?), but the latency settings were:

PFB: Add latency 1; Mult latency 2; BRAM latency 3; Fanout latency 1; Convert 
latency 1 (4 taps).
Vacc: Add latency 2; BRAM latency 6 (overkill?); Mux latency 0,

where DSP48 has been used wherever possible in the design.
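
For anyone wanting to repeat the search, here is a minimal Python sketch of 
enumerating latency combinations so the cheapest ones get compiled first. The 
parameter names mirror the settings above, but the search ranges are 
assumptions for illustration, and nothing here drives the toolflow itself; 
each combination would still need to be applied to the model and compiled by 
hand or by your own script.

```python
from itertools import product

# Hypothetical search ranges: the names mirror the settings listed above,
# but the candidate values are assumptions chosen for illustration.
SEARCH_SPACE = {
    "pfb_add_latency": [1, 2],
    "pfb_mult_latency": [2, 3],
    "pfb_bram_latency": [2, 3],
    "vacc_add_latency": [1, 2],
    "vacc_bram_latency": [2, 4, 6],
}


def latency_combinations(space):
    """Yield one dict per combination of latency settings."""
    names = list(space)
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))


def total_added_latency(combo):
    """Sum of all latencies, used to try the cheapest combinations first."""
    return sum(combo.values())


combos = sorted(latency_combinations(SEARCH_SPACE), key=total_added_latency)
print(len(combos), "combinations to try")
print("cheapest:", combos[0])
print("most pipelined:", combos[-1])
```

Sorting by total latency is only one possible ordering; it reflects the idea 
of adding no more pipelining than the timing report asks for.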

Now I need to see if it still works if I use more FIR taps ;-)

Best wishes
Michael

From: Jack Hickish [mailto:jackhick...@gmail.com]
Sent: 30 August 2015 21:57
To: Michael D'Cruze
Cc: casper@lists.berkeley.edu
Subject: Re: [casper] Mystery timing errors

Hi Michael,

As one of the Xilinx timing closure documents helpfully articulates, it's very 
difficult to give specific recipes for solving timing problems, other than 
reading the timing report, looking at things in PlanAhead/FPGA Editor, 
iterating compiles and developing some intuition.

That said, there are a few things you could try, of varying levels of 
complexity --

Since you have quite a lot of channels in your design, I suspect your vacc 
problems are related to the size of the bram buffer (especially if your samples 
are wide). Keep in mind that a bram in a Virtex-6 is a 36 bit wide x 1k deep 
memory block, so building a vacc to accommodate many thousands of channels 
means banking together lots of brams -- this is liable to cause fanout issues 
on your address/control signals. The same is probably true of the large delays 
in the PFB blocks, though I'm not sure how these are implemented in the various 
block versions which exist in mlib_devel.

Setting brams to optimize for speed and giving them at least 3 cycles latency 
may help. You could also try manually controlling bram control signals -- I 
believe the casper_library_bus/bus_single_port_ram block will do some of this 
for you if you set it to optimize for speed and include some fanout latency. If 
you replace the bram delay in the vacc with one of these (and associated 
address counter logic to make the block act as a delay) then perhaps this will 
help. Be careful to get your latencies right so the block still functions 
properly.

You might also find that implementing the adder in a DSP slice (if you haven't 
already) helps.
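
To put rough numbers on the bram banking point above, here is a small Python 
sketch. It assumes the 36 bit x 1k geometry quoted above and a simple model in 
which a buffer is banked in both width and depth; the tools may pick other 
aspect ratios, and the buffer shapes in the loop are hypothetical rather than 
taken from any particular design.

```python
import math

# Geometry quoted above for a single block RAM.
BRAM_WIDTH_BITS = 36
BRAM_DEPTH_WORDS = 1024


def brams_needed(depth_words, width_bits):
    """Estimate block RAMs for a buffer, banking in both width and depth."""
    wide = math.ceil(width_bits / BRAM_WIDTH_BITS)
    deep = math.ceil(depth_words / BRAM_DEPTH_WORDS)
    return wide * deep


# Hypothetical buffer shapes, for illustration only.
for depth, width in [(16 * 1024, 32), (32 * 1024, 32), (32 * 1024, 64)]:
    count = brams_needed(depth, width)
    print(f"{depth} words x {width} bits -> about {count} brams")
```

Every bram in such a bank sees the same address and write-enable signals, 
which is where the fanout on the control paths comes from.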

A more involved option is to go about manually constraining the placement of 
components. You might find that setting up pblocks in PlanAhead to place major 
parts of your design frees up the compiler to do a little better with the 
remainder of the logic. Adding pblocks for the pfb, and the various FFT stages 
can be very helpful in this regard. Or perhaps just constraining the vacc 
that's causing you problems might be enough.
Once you have some constraints from PlanAhead which work, you can auto-include 
them in your Simulink compiles using various methods, my favourite being the 
'UCF' yellow block, which is in the current casper-astro mlib_devel.
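
As an illustration of what such constraints look like, here is a Python sketch 
that writes out UCF area-group lines of the kind PlanAhead saves for a pblock. 
The pblock name, the instance pattern and the site ranges are all hypothetical; 
real instance names come from your netlist and valid ranges depend on the 
target device, so treat this as a template rather than something to paste in.

```python
def pblock_ucf(pblock_name, instance_pattern, site_ranges):
    """Return UCF lines assigning an instance to an area group with ranges."""
    lines = [f'INST "{instance_pattern}" AREA_GROUP = "{pblock_name}";']
    for site_range in site_ranges:
        lines.append(f'AREA_GROUP "{pblock_name}" RANGE = {site_range};')
    return "\n".join(lines)


# Hypothetical names and ranges, for illustration only.
ucf_text = pblock_ucf(
    "pblock_vacc",
    "my_design_vacc0*",
    ["SLICE_X0Y0:SLICE_X39Y59", "RAMB36_X0Y0:RAMB36_X1Y11"],
)
print(ucf_text)
```

Generating the text from a script is just a convenience when several blocks 
need similar constraints; exporting the constraints directly from PlanAhead 
gives the same kind of lines.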

In general, I suspect that you would be well served by trying to reduce your 
bram use in your model - the resource utilisation report can probably either 
confirm or contest that this is the case. You could do this by trying to use 
alternatives to brams where blocks allow it (for example the FFT will allow 
coefficients to be generated using DSPs in some cases), or by reducing the 
number of FIR taps, or implementing something using QDR rather than bram if 
appropriate.

The unfortunate truth is that some, none or all of these suggestions may help, 
or might make things worse. I would say that in my experience overzealous use 
of pipeline/register stages throughout a design can often do more harm than 
good, and a sensibly placed PFB can result in incredible timing improvements.

Hope that helps, please email back with any more info and/or updates -- I 
suspect if you solve this problem documenting your method may prove very useful 
for others encountering similar problems.

Cheers, and good luck!
Jack

On 30 August 2015 at 10:42, Michael D'Cruze 
<michael.dcr...@postgrad.manchester.ac.uk> wrote:
>
> Hi everyone,
>
> I’m having quite a lot of problems getting a particular design to compile, 
> with XPS consistently reporting timing errors. The design is a wideband 
> spectrometer, somewhat similar to the tutorial 3 design. I’m running an iADC 
> at 1024 MHz, and the FPGA at 256 MHz. It is a two-polarisation design, with a 
> 64k-point PFB and FFT (for 32k channels) in each polarisation. I am running 
> the PFB and FFT blocks in “two-polarisation” mode, so there is just one of 
> each block. I am using the Vacc block from the xBlocks repository.
>
> I should add at this point that the same design with 16k channels compiles 
> successfully at this speed, and that the 32k channel design compiles at 
> slower speeds (e.g. 200 MHz FPGA).
>
> The timing reports suggest negative slack in the PFB and Vacc blocks. I’ve 
> tried adding various combinations of latency in each, however no permutations 
> have resulted in substantial improvements. The overwhelming majority of the 
> errors are reported by the Vacc block. I have also tried replacing the 
> xBlocks Vacc block with the wide_bram_vacc block in 64-bit mode, however this 
> results in even more timing errors (and without the option to adjust 
> latencies).
>
> I’ve tried various combinations of latencies suggested previously on the mail 
> archive, but again nothing has really given any improvement.
>
> Am I simply pushing such a large design too hard? Is the only option to slow 
> it down or is there some strategic method I can adopt to get closer to timing 
> closure? Suggestions much appreciated!
>
> Thanks
>
> Michael
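
As a quick sanity check on the figures in the original post, the following 
Python sketch works out the clock period at each FPGA rate, the number of ADC 
samples arriving per FPGA clock, and the channel width. It uses only the 
numbers quoted in the post plus the standard assumption that a real-input FFT 
of N points yields N/2 channels across half the sample rate; the variable 
names are made up for the example.

```python
ADC_RATE_HZ = 1024e6       # iADC sample rate from the post
FFT_POINTS = 64 * 1024     # 64k-point PFB/FFT from the post


def clock_period_ns(clock_hz):
    """Clock period in nanoseconds for a given clock frequency."""
    return 1e9 / clock_hz


samples_per_clock = ADC_RATE_HZ / 256e6
channels = FFT_POINTS // 2                      # real-input FFT: N/2 channels
channel_width_hz = (ADC_RATE_HZ / 2) / channels

for fpga_hz in (256e6, 200e6):
    period = clock_period_ns(fpga_hz)
    print(f"{fpga_hz / 1e6:.0f} MHz -> {period:.3f} ns per clock")

print("ADC samples per 256 MHz clock:", samples_per_clock)
print("channels:", channels)
print("channel width (Hz):", channel_width_hz)
```

The gap between the two clock periods is roughly a nanosecond per path, which 
gives a feel for how much slack the slower compile has to spare.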
