I think this makes a lot of sense, and now that you mention it I
recall having similar experiences. With distributed RAM, the coefficient
memory (a few Kbytes) is built out of ~16x1 bit memory elements, so the
address logic has to fan out to all of those tiny elements. The BRAM
(I assume) has the address logic embedded in it. Generally for large
designs I run low on BRAMs, so I have to enable distributed memory for a
few blocks and use BRAMs for the rest.
Glenn
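
(Editorial sketch, not part of Glenn's message.) The fanout argument above
can be made concrete with a rough count of how many 16x1 memory elements one
coefficient ROM would need in distributed RAM. The figures are the ones
quoted further down the thread (2^13 points, 2^3 parallel paths, 7 bit
coefficients). The function name, the assumption that each ROM holds
points/paths coefficients, and the assumption that every element sees the
low address bits are illustrative only; the real mapping is up to the
synthesis tool.

```python
import math


def distributed_rom_primitives(depth, width, lut_depth=16):
    """Rough count of lut_depth x 1 memory elements for a depth x width ROM.

    Hypothetical helper: assumes a plain tiling of the ROM into
    lut_depth-deep, 1-bit-wide elements with no tool optimizations.
    """
    return math.ceil(depth / lut_depth) * width


# Figures quoted in the thread.
points = 2**13      # PFB size
paths = 2**3        # parallel input paths
coeff_bits = 7      # coefficient width

# Assumption: each coefficient ROM holds points / paths entries.
depth_per_rom = points // paths

primitives = distributed_rom_primitives(depth_per_rom, coeff_bits)
print(depth_per_rom, primitives)
```

Under these assumptions a single 1024 x 7 ROM tiles into several hundred
small elements, each of which needs the low-order address bits, which is
the same order as the fanout of 450 in the timing report. A BRAM of the
same size would present one address port instead.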

On Fri, Nov 7, 2008 at 1:24 PM, John Ford <[email protected]> wrote:
>>> Hi John,
>>>
>>> Do you have the coefficients in slices rather than BRAMs? That might
>>> be contributing to the high address fanout.
>>
>> Yes, the "use distributed memory cells for coeffs" is checked.  I'll try
>> changing it.  It takes forever to build, but I'll start one and leave it
>> go.
>
> This made a significant difference, so much so that I don't trust it.  It
> took only a couple of hours to build, and used *many* fewer slices...
>
> I'll keep you posted.  (btw, the "use distributed memory cells" is the
> default.)
>
>>
>> (note to the list: I sent Henry the mdl file, instead of sending it to the
>> list as it's 4.5 MB.  Let me know if you want a look at it.)
>>
>> John
>>
>>
>>> Thanks,
>>> Henry
>>>
>>> John Ford wrote:
>>>>> Hi John,
>>>>> I've definitely run into the same issue, but haven't had a chance to
>>>>> tackle it yet. It could be a pain, but I would try to hack the design
>>>>> to change the PFB coefficients by one LSB or something in one location
>>>>> so they can't be optimized together. It might also help to try and
>>>>> change whether the coefficient ROMs are implemented in BRAM or
>>>>> distributed memory. If it comes down to it, you may have to add a
>>>>> delay to one section of the PFB so that the coefficients are off by
>>>>> one time step (the optimizer doesn't seem to be smart enough to figure
>>>>> that one out), and then add delays elsewhere to get things back in
>>>>> step later.
>>>>> It seems like there has to be a way to get that optimization turned
>>>>> off, but I haven't figured it out.
>>>>
>>>> Thanks.  These are some good suggestions.  The error is in the counter
>>>> that addresses the coefficients, so maybe I can do as Jason suggests and
>>>> add in some delay between the counter and the rest of the system inside
>>>> the PFB block.  Maybe I can trick it into thinking the counters are
>>>> different, somehow...
>>>>
>>>> John
>>>>
>>>>
>>>>
>>>>> Glenn
>>>>>
>>>>> On Thu, Nov 6, 2008 at 10:07 AM, John Ford <[email protected]> wrote:
>>>>>>> Create more coefficient generators and sprinkle them around inside the
>>>>>>> PFB?
>>>>>>>
>>>>>>> Sorry, I don't have Simulink with me, so can't offer more concrete
>>>>>>> advice. But if the problem is with the fanout, why not create separate
>>>>>>> sources with smaller fanouts?
>>>>>> There are separate coefficient generators for every path in the library
>>>>>> block, but they seem to get optimized out during the build.  That's the
>>>>>> trick, I guess, that I don't know how to get around...
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> John
>>>>>>
>>>>>>> Jason
>>>>>>>
>>>>>>>
>>>>>>> On 06 Nov 2008, at 15:46, John Ford wrote:
>>>>>>>
>>>>>>>> Does anyone have any suggestions for fixing this error?  It seems due
>>>>>>>> entirely to the large fanout (450!) in the coefficient generator.
>>>>>>>>
>>>>>>>> We're stuck on this for building our 4K channel 800 MHz pulsar machine.
>>>>>>>> It is introducing errors into the spectrum.
>>>>>>>>
>>>>>>>> John
>>>>>>>>
>>>>>>>>
>>>>>>>>> Hi all.  Here's a timing error that appears when building a large
>>>>>>>>> (2^3 parallel paths, 2^13 points, 4 taps, 7 bit coefficients, 7 bit
>>>>>>>>> input, 14 bit output) PFB.  The timing wizard's suggestions are below.
>>>>>>>>> Note that the coeff. generator counter seems to be used for all coeff
>>>>>>>>> generator counters in the entire PFB's complement of coeff generators.
>>>>>>>>>
>>>>>>>>> How can I tell the toolset not to optimize out all the other counters,
>>>>>>>>> or alternatively, how can I introduce a delay to fix this problem?
>>>>>>>>> I've attached the twx file, but from the timing wizard:
>>>>>>>>>
>>>>>>>>> This path has a net
>>>>>>>>> "b2_gdsp_u1_8k_800_a_nb_xsg_core_config/
>>>>>>>>> b2_gdsp_u1_8k_800_a_nb_xsg_core_config/
>>>>>>>>> top/pfb_fir_real_4k_ch/pol1_in1_first_tap/pfb_coeff_gen/counter_q(4)"
>>>>>>>>> with a high fanout of 450.
>>>>>>>>>
>>>>>>>>> High fanout suggestions.
>>>>>>>>>
>>>>>>>>> -Duplicate the net source and direct the synthesis tools not to remove
>>>>>>>>> duplicate logic.
>>>>>>>>>
>>>>>>>>> -Use a specific net fanout control on the problem net if allowed by the
>>>>>>>>> synthesis tool.
>>>>>>>>>
>>>>>>>>> Click Next button if this suggestion does not help.
>>>>>>>>>
>>>>>>>>> Thanks for any help!
>>>>>>>>>
>>>>>>>>> John
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>
>>
>
>
>
>
