Hey Andres,

Just a quick follow up, since I've been playing some FFT timing issue games myself --

I don't know which mlib_devel fork you are using, but in some you'll find that rounding is implemented with the casper convert block, with pipelined fabric adder cores. My experience is that whilst these pipelined cores might help meet timing, the compiler does not optimize them effectively. (E.g. there's one convert which essentially involves an adder where one input is zero, but it's still compiled to a logic chain if the adder is implemented in a core.) I found that with a couple of 4096pt FFTs, I saved ~6000 slices (on ROACH 2) by switching back to the Xilinx cast blocks.

I've made a few mods to the FFT and convert blocks to allow:
1) Your own choice of cast block implementations, using either the casper block in behavioural, Fabric core or DSP core modes, or the Xilinx cast.
2) Different choices of implementations in the twiddle blocks and butterfly blocks (whose casts are different sizes).
3) Different choices of adder/convert latencies in the twiddle blocks and butterflies. This allows you to tweak the latencies depending on which adders you want to implement in DSP.

I've also added a "UCF" yellow block, which allows you to add custom UCF constraints (e.g. those generated by PlanAhead) into a design, by supplying an additional UCF file path within Simulink.

On the off chance any of this is useful to you or anyone else, it's in
https://github.com/jack-h/mlib_devel

Cheers,
Jack

On 27 November 2013 17:26, Andres Alvear <[email protected]> wrote:
> Hello guys,
>
> I'm so sorry to be confused to explain my project, right now have 2 spectrometers implemented in the ROACH 1 with the FPGA Virtex-5 xc5vsx95tff1136-1, with 2 ADC083000 boards. Now answering your question I'm thinking that i've a lot of failed reports, but all of them have the same pattern I mean the design have been compiled using logic extensively.
> The next are the results from ADC @ 500 MHz and the clock of the FPGA @ 125 MHz compilation:
>
> Device Utilization Summary:
>
>    Number of BUFGs                          8 out of 32      25%
>    Number of DCM_ADVs                       4 out of 12      33%
>    Number of DSP48Es                      384 out of 640     60%
>    Number of ILOGICs                      111 out of 800     13%
>    Number of External IOBs                190 out of 640     29%
>       Number of LOCed IOBs                190 out of 190    100%
>    Number of OLOGICs                       19 out of 800      2%
>    Number of RAMB18X2s                    103 out of 244     42%
>    Number of RAMB36_EXPs                   80 out of 244     32%
>    Number of Slices                     14584 out of 14720   99%
>    Number of Slice Registers            53504 out of 58880   90%
>       Number used as Flip Flops         53504
>       Number used as Latches                0
>       Number used as LatchThrus             0
>    Number of Slice LUTS                 44326 out of 58880   75%
>    Number of Slice LUT-Flip Flop pairs  56278 out of 58880   95%
>
> Timing summary:
> ---------------
>
> Timing errors: 0  Score: 0  (Setup/Max: 0, Hold: 0)
>
> Constraints cover 801808 paths, 129 nets, and 158113 connections
>
> Design statistics:
>    Minimum period:   8.332ns (Maximum frequency: 120.019MHz)
>    Maximum net delay:   2.652ns
>
> I just have one way to get a successful compilation with two spectrometers using a 4096 points of pipeline-FFT each without check the option “DSP48E adders in butterfly” and low latencies like, “add latency=1”, “mult latency=2”, “BRAMs=2”, “convert latency=1”, “input latency=0” n “latency between biplexes and fft_direct=0”. The FIR filter work with their complete stuff.
>
> One of my specifics objectives of my project consist in increasing the bandwidth of each spectrometer from 500 MHz to 1500 MHz without losing spectral resolution, I mean increasing the number of channels proportionally to the increase of the bandwidth, so I need to increase from 2048 channels to 4096 channels too.
>
> For this reason I go ahead to use PlanAhead using the results showing above. After floorplaned my design like Ryan showed in his report using plan ahead to generate a ucf file, which then I placed in data/ system.ucf.
> and then re ran the tools for edk ise bitgen. This stage work successful for me.
>
> My constraints were ok because didn't get timing errors.
>
> I think that I've a conceptual error because I don't know why the compilation generate timing groups with this timing constraint. In this part I need your attention I don't know if I need to delete them or just modify them putting more timing groups or increasing their speed or something like that. Instead of configuring a clock pin to connect directly into an internal clock tree, that pin can be used to drive a special hard-wired function (block) called a clock manager that generates a number of daughter clocks. So the logic says us that we need to use the ADC's external signal clocks to propagate into the whole FPGA using the DCMs to generate daughter clocks used to drive internal clock trees or output pins.
>
> How can i do that? I'm going to attach my constraint file to check it out.
>
> Dan, indeed I want to get Simulink designs to run above 300MHz on a ROACH 1, so I'm thinking that is almost mandatory to manually constrain placement of primitives on the FPGA fabric. I really thank Ryan for sending his memo, but I want to say to him that I'm using it to understand this kind of works since a couple of weeks. Therefore, I want to say that my goals are: make the speed optimization to closing timing on an FPGA design with the ADCs working at 3 GSPS and of course the clock of the FPGA @375 MHz, so I hope to meet my constraints to clock up to 375 MHz bit by bit using plan ahead.
>
> Best regards.
>
> Cheers!
>
> Andres
>
>
> 2013/11/26 Jack Hickish <[email protected]>
>>
>> Hi Andres,
>>
>> That system.twr you sent me doesn't appear to have any timing errors (though it looks like it's compiled for an ADC clock of 500 MHz and FPGA clock of 125 MHz). Do you have the report of a failed compile, which will give some indication of what parts of the design are causing problems.
>>
>> Your ucf file looks ok -- I don't know how well the pblock constraints you have will actually work, but it seems like you've got the right idea. The ADC period constraint you have is for a 400 MHz FPGA clock -- i.e. a 1600 MHz ADC clock. Is this really what you wanted? Note that the constraint is for the clock received by the adc, which is 1/4 the rate of the ADC sampling clock.
>>
>> Cheers
>>
>> Jack
>>
>> On 26 November 2013 19:10, Andres Alvear <[email protected]> wrote:
>> > Hello Jack I'm Andres Alvear student of Electrical Engineering from Chile.
>> > Thanks Jack for your quick answer. I attached the file!
>> >
>> > I mean that unchecked the options leaving EDK ISE Bitgen and then I re-ran getting a boffile (like the picture). I'm sorry I detected the problem it was in the synthesizer that synchronize the ADCs with their clock rate. I resolve that configuring the Synthesizer setup again.
>> >
>> > However, I'm happy if you check out my system.twr. and do you know how I can constrains to the system run at 400MHz? What do you think about my Global Timing Constrains? Specifically what are your thoughts about my timing groups that were generated from casper_xps toolflow compilation? Are they all right?
>> >
>> > I'm not pretty sure if my constraints are working.
>> >
>> > Cheers!
>> >
>> > Andres
>> >
>> >
>> > 2013/11/26 Jack Hickish <[email protected]>
>> >>
>> >> Hey Andres,
>> >>
>> >> Just because I'm a little confused -- you say you re-ran EDK and got a boffile, but timing constraints weren't met -- have you disabled the check for timing closure the toolflow does? Usually you wouldn't get a successful compile from a design that didn't meet timing.
>> >>
>> >> If you attach your timing report (implementation/system.twr) I'm happy to have a look. 54MHz is very slow, which makes me think there's something not quite right going on....
>> >>
>> >> Cheers,
>> >>
>> >> Jack
>> >>
>> >> On 26 November 2013 17:04, Andres Alvear <[email protected]> wrote:
>> >> > Thanks Ryan,
>> >> > I have just generated my first .bof after re running the tools for EDK/ISE/Bitgen successfully but I have not been able to view any speed optimization so far as you may see in my results in the attached picture.
>> >> > After the compilation my design ran with a clock rate of 54MHz reaching 216MHz of Bandwidth on each spectrometer. The results obtained from the constraint generated in the floorplanning process were introduced in the "system.ucf' file that was located in the following folders:
>> >> > /opt/workspace/spectrometer_dctrl_op/XPS_ROACH_base/data
>> >> > /opt/workspace/spectrometer_dctrl_op/XPS_ROACH_base/implementation
>> >> > In the "data" folder I removed system.ucf and system.ucf.bac and then just put my version (with the floorplan) of "system.ucf" in its place. Then, in the "implementation" folder I replaced the "system.ucf" file with my version. Finally, I opened the simulink design and then re-ran it with just EDK/ISE/Bitgen. I had a successful compilation with a new .bof file. This one is working in the ROACH 1. However, my timing constrains were not met.
>> >> > I'm going to attach my constrains to see if you have some idea the possible problems. Given my constrain file attached, what values would you put in the constrains so that the system run at 400MHz? What do you think about my Global Timing Constrains? Specifically what are your thoughts about my timing groups that were generated from casper_xps toolflow compilation? Are they all right?
>> >> >
>> >> > Cheers
>> >> >
>> >> > Andres Alvear
>> >> >
>> >> >
>> >> > 2013/11/22 Ryan Monroe <[email protected]>
>> >> >>
>> >> >> Hey Andres, my strategy has generally been to use plan ahead to generate a ucf file, which I then place in data/ system.ucf. then re run the tools for edk ise bitgen.
>> >> >>
>> >> >> Works consistently for me
>> >> >>
>> >> >> On Nov 22, 2013 11:31 AM, "Andres Alvear" <[email protected]> wrote:
>> >> >>>
>> >> >>> Hi everyone,
>> >> >>>
>> >> >>> I'm working on Speed Optimization with PlanAhead, I've a Simulink design of a Spectrometer of 2048-channels and 2 ADCs ADC083000 to 1GSPS in interleaved mode, and I want to meet a time optimization increasing the bandwidth to 1GHz from the actual 500MHz and of course increase the numbers of channels at least to 4096, but with the conventional tool flow is impossible.
>> >> >>>
>> >> >>> First thing I told the system I wanted it to go to at 250 MHz, but my actual clock rate is about 120MHz too low!! However the system is working stable until 125MHz, so I can setup the ADC clock rate to 500MHz to have 1GSPS getting a 500MHz of bandwidth to each ADC.
>> >> >>>
>> >> >>> So I have been working on PlanAhead in a Floorplanning optimization the hardware implemented in the FPGA Virtex-5 SX95T, but after make the floorplanning edit my constraint file like Ryan Monroe say in his last memo.
>> >> >>> I got a 23% of Speed optimization from 120MHz to 148MHz, but I need meet time at least to 200MHz. However I have problems generating functional borph executables, and I'm hoping someone can help me figure out why. Since I'm targeting high speeds.
>> >> >>> This one is the error from Borph when I try to run from a ssh session:
>> >> >>>
>> >> >>> root@roach:/boffiles# ./system_2.bof
>> >> >>>
>> >> >>> -bash: ./system_2.bof: Input/output error
>> >> >>>
>> >> >>> Then in a ipython 2.7 terminal to check if you managed to connect to your ROACH:
>> >> >>>
>> >> >>> In [9]: fpga.is_connected()
>> >> >>>
>> >> >>> Out[9]: True
>> >> >>>
>> >> >>> Let's set the bitstream running using the progdev() command:
>> >> >>>
>> >> >>> In [10]: fpga.progdev('system_2.bof') <-----------generated from mkbof
>> >> >>>
>> >> >>> Out[10]: 'ok'
>> >> >>>
>> >> >>> See the ROACH and the leds not blinking. I placed these ones to see the working of my design, but these both not blinking at all: led0_sync, led1_new_acc.
>> >> >>>
>> >> >>>
>> >> >>> Do you think that I am in the right the way? Does anyone know something about these problems?
>> >> >>>
>> >> >>>
>> >> >>> Cheers!
>> >> >>>
>> >> >>> Andres Alvear
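
A note on the clock arithmetic that runs through this thread: the constrained FPGA clock is 1/4 of the ADC sampling clock, so a 400 MHz FPGA clock implies a 1600 MHz ADC clock, and the 8.332 ns minimum period in the timing report corresponds to about 120 MHz. The Python sketch below only reproduces that arithmetic. The function names are hypothetical, the net name adc0clk_p is a placeholder rather than a name taken from anyone's system.ucf, and the constraint text uses the generic UCF NET/TIMESPEC PERIOD form, so it should be checked against the file the toolflow actually generates.

```python
# Sketch only: these helpers are hypothetical and not part of the CASPER toolflow.

ADC_CLOCK_DIVIDER = 4  # per the thread: constrained FPGA clock = ADC sampling clock / 4


def fpga_clock_mhz(adc_clock_mhz):
    """FPGA clock (MHz) implied by a given ADC sampling clock (MHz)."""
    return adc_clock_mhz / float(ADC_CLOCK_DIVIDER)


def period_ns(freq_mhz):
    """Clock period in nanoseconds for a frequency in MHz."""
    return 1000.0 / freq_mhz


def max_freq_mhz(min_period_ns):
    """Maximum frequency (MHz) for a minimum period in nanoseconds."""
    return 1000.0 / min_period_ns


def speedup_percent(old_mhz, new_mhz):
    """Relative clock-rate improvement, in percent."""
    return 100.0 * (new_mhz - old_mhz) / old_mhz


def ucf_period_constraint(net_name, adc_clock_mhz):
    """Format a generic UCF PERIOD constraint for the divided ADC clock.

    net_name is a placeholder; use the net name from your own system.ucf.
    """
    period = period_ns(fpga_clock_mhz(adc_clock_mhz))
    return (
        'NET "{0}" TNM_NET = "{0}";\n'
        'TIMESPEC "TS_{0}" = PERIOD "{0}" {1:.3f} ns HIGH 50%;'
    ).format(net_name, period)


if __name__ == "__main__":
    # 1600 MHz ADC clock -> 400 MHz FPGA clock -> 2.5 ns period
    print(fpga_clock_mhz(1600), period_ns(fpga_clock_mhz(1600)))
    # 1500 MHz ADC clock -> 375 MHz FPGA clock (the 375 MHz target in the thread)
    print(fpga_clock_mhz(1500))
    # Minimum period from the timing report, converted back to a frequency
    print(round(max_freq_mhz(8.332), 3))
    # Floorplanning gain quoted in the thread: 120 MHz -> 148 MHz
    print(round(speedup_percent(120, 148), 1))
    print(ucf_period_constraint("adc0clk_p", 1600))
```

Running the same functions with the clock you really intend to use is a quick way to confirm that the period in the ucf file matches the ADC clock the synthesizer is set to.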

