Speaking of multipliers, I was wondering what speed we are targeting for the
floatmult25, or for the entire FPGA in general? I have managed to get a
3-stage version working at <9ns according to the tools (targeted for the
3S1500), but I'm not sure how accurate the auto-generated timing constraints
are. The synthesizer seems a bit quirky to me: it does not respond as I
expect to changes I make (for example, why do changes I make in stage1
affect the critical path, which is in stage2?). I'm used to working on
full-custom ASICs where I control everything, so I don't find the
synthesizer very intuitive :\

Back to the XP10: if it doesn't have hard multipliers, we can make our own :)
But again, I'm not sure how well that works out on FPGAs.
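
As a rough illustration of what "making our own" multiplier involves, here is a minimal behavioural sketch in Python, assuming unsigned operands and a 16-bit width (both are assumptions, as are the function names). It forms one partial-product row per multiplier bit and sums the rows in a pairwise adder tree. It models behaviour only and says nothing about timing or LUT usage on either part.

```python
def partial_products(a: int, b: int, width: int) -> list[int]:
    """Return one row per bit of b: a shifted left by i if bit i of b is set, else 0."""
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    return [(a << i) if (b >> i) & 1 else 0 for i in range(width)]


def adder_tree(rows: list[int]) -> tuple[int, int]:
    """Sum the rows pairwise; return (total, number of adder levels used)."""
    levels = 0
    while len(rows) > 1:
        if len(rows) % 2:
            rows = rows + [0]  # pad to an even count
        rows = [rows[i] + rows[i + 1] for i in range(0, len(rows), 2)]
        levels += 1
    return rows[0], levels


def soft_multiply(a: int, b: int, width: int = 16) -> tuple[int, int]:
    """Unsigned multiply built from partial products and an adder tree."""
    return adder_tree(partial_products(a, b, width))


if __name__ == "__main__":
    product, levels = soft_multiply(1234, 5678)
    print(product, levels)
```

Each adder level is a natural place to put a pipeline register, so the number of levels gives a first guess at how many stages a fabric multiplier might need.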

On Mon, August 13, 2007 5:13 pm, Timothy Normand Miller said:
> On 8/13/07, Mark <[EMAIL PROTECTED]> wrote:
>> Timothy Normand Miller wrote:
>>> instructions.  Also, in response to his question, I'm targeting the
>>> 3S4000 because it's convenient.  In the real design, we'll target the
>>>  XP10, which is a little slower.  Either way, this tells us basically
>>>  what we need to know.
>>> 
>> Doesn't that make this whole discussion moot?  The XP10 doesn't have 
>> hard multipliers, as near as I can tell.  Regardless, the architecture 
>> and timing aren't necessarily even remotely similar (well, sure,
>> they're both island-style FPGAs using 4-LUTs... but that's still a big
>> design space).
> 
> Ugh.  You're right.  I thought it had dedicated multipliers, but Howard
> can't find any reference to that in the spec.
> 
> There's no sense in trying to move the nanocontroller into the Xilinx, 
> because the nanocontroller's also responsible for controlling DMA.
> 
>> 
>> I'd be wary of putting too much stock in XST's timing estimates, anyhow.
>>  Until you've got post-PAR timing, don't bank on it.
> 
> The timing numbers I provided are post-PAR, but as you say, regarding the
> multipliers, the point is moot.  We need to rethink that whole thing.
> 
> Should we do some early-SPARC-style multiplier stepping instructions? I'm
> not sure we can without 4-operand instructions.
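
To make the operand question concrete, here is a minimal sketch of a generic shift-and-add multiply step in Python. It is written in the spirit of a multiply-step instruction and is not the exact SPARC semantics. The function names and the 16-bit width are assumptions for illustration.

```python
def mul_step(acc: int, multiplier: int, multiplicand: int, width: int) -> tuple[int, int]:
    """One step: conditionally add, then shift the acc:multiplier pair right by one bit."""
    if multiplier & 1:
        acc += multiplicand  # acc may briefly need width + 1 bits
    # The low bit of acc moves into the top bit of the multiplier register.
    multiplier = (multiplier >> 1) | ((acc & 1) << (width - 1))
    acc >>= 1
    return acc, multiplier


def multiply(a: int, b: int, width: int = 16) -> int:
    """Unsigned multiply using `width` successive multiply steps."""
    mask = (1 << width) - 1
    acc, multiplier = 0, b & mask
    for _ in range(width):
        acc, multiplier = mul_step(acc, multiplier, a & mask, width)
    # High half of the product ends up in acc, low half in multiplier.
    return (acc << width) | multiplier


if __name__ == "__main__":
    print(multiply(1234, 5678))
```

Each step reads three values (accumulator, multiplier, multiplicand) and writes two, which is where the pressure on operand count comes from unless some of that state lives in an implicit register.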
> 
> Another option would be to switch to the out-of-band approach.  Write the
> operands to the multiplier via the I/O space, and X clock cycles later, you
> can grab the product.
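
The out-of-band idea can be modelled in a few lines of Python. This is only a behavioural sketch: the register addresses, the class name, and the 3-cycle latency are all hypothetical, and the real I/O map and multiplier latency are not specified in this thread.

```python
class OutOfBandMultiplier:
    """Toy model of a multiplier reached through I/O space (hypothetical register map)."""

    OPERAND_A = 0x0  # write: first operand
    OPERAND_B = 0x1  # write: second operand, starts the multiply
    PRODUCT = 0x2    # read: result, valid once the latency has elapsed

    def __init__(self, latency: int = 3):
        self.latency = latency  # the "X clock cycles" of the text (assumed value)
        self.a = 0
        self.busy = 0
        self.product = 0
        self._pending = 0

    def write(self, addr: int, value: int) -> None:
        if addr == self.OPERAND_A:
            self.a = value
        elif addr == self.OPERAND_B:
            self._pending = self.a * value
            self.busy = self.latency
        else:
            raise ValueError(f"not a writable register: {addr:#x}")

    def tick(self) -> None:
        """Advance one clock cycle."""
        if self.busy:
            self.busy -= 1
            if self.busy == 0:
                self.product = self._pending

    def read(self, addr: int) -> int:
        if addr != self.PRODUCT:
            raise ValueError(f"not a readable register: {addr:#x}")
        if self.busy:
            raise RuntimeError("product not ready yet")
        return self.product


if __name__ == "__main__":
    m = OutOfBandMultiplier(latency=3)
    m.write(m.OPERAND_A, 6)
    m.write(m.OPERAND_B, 7)
    for _ in range(3):
        m.tick()  # the controller can do other work during these cycles
    print(m.read(m.PRODUCT))
```

In this model the controller is free to run unrelated instructions during the wait, which is the main attraction over stepping instructions. Real hardware would need either a fixed, documented latency or a ready flag rather than an exception.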
> 
>> Finally, I don't think you mentioned the speed grade and package you're
>>  using for the Spartan or the Lattice part.  (Both are on the board, 
>> right?  I'm going off 
>> http://wiki.duskglow.com/tiki-index.php?page=OGD1+components+guide.)
> 
> For Xilinx, it's -5.  I think we also picked the fastest XP10.
> 
>> Your timing numbers will depend on that, too (even in XST's output, I 
>> believe).  Have those aspects been specified yet?  I mean, at least the
>>  package is presumably known.
> 
> Well, I wanted a ballpark sense of what was worst in the design, and for
> that I doubt it'll make a lot of difference which device we target.  When
> it comes down to shaving off the last few nanoseconds, then it'll matter a
> lot.  What we want is a controller that's reasonably efficient across
> multiple architectures anyhow.
> 
> --
> Timothy Normand Miller
> http://www.cse.ohio-state.edu/~millerti
> Open Graphics Project
> _______________________________________________
> Open-graphics mailing list
> [email protected]
> http://lists.duskglow.com/mailman/listinfo/open-graphics
> List service provided by Duskglow Consulting, LLC (www.duskglow.com)
> 
> 
