You might want to configure your mailer to automatically break lines at column 75.

On 8/14/07, Farhan Mohamed Ali <[EMAIL PROTECTED]> wrote:
> Speaking of multipliers, i was wondering what speed we are targeting for the
> floatmult25 or the entire FPGA in general? I have managed to get a 3 stage
> version working at <9ns according to the tools (targeted for 3S1500), but i'm
> not sure how accurate the auto generated timing constraints are. Seems a bit
> quirky to me, it does not respond as i expect to changes i make (as in, why
> do changes i make in stage1 affect the critical path which is in stage2, and
> weird stuff like that). I'm used to working on full custom ASICs where i
> control everything, so i don't find the synthesizer to be very intuitive :\
>

It's trying to do the P&R automatically, and it's using simulated
annealing to do it. It's an optimization problem using randomization.
So to begin with, what you get isn't deterministic. On top of that,
since there's competition for resources, changing something in one
place will affect everything else. It can be frustrating sometimes.
We find that when we're on the edge of being able to meet timing,
we have to run P&R several times before it gives us what we want.

> Back to the XP10, if it doesn't have hard multipliers, we can make our own :)
> But again i'm not sure how well that works out for FPGAs.
>

Yeah. It's not worth the extra logic to fully pipeline it (nor could
we keep a 32-stage multiplier pipeline fully fed). We could have
separate logic that would run in parallel. Or we could have special
mult-stepping instructions. With the latter, partial multiplies can
be optimized to take fewer cycles.

Here's what I'm thinking.... If we had a stand-alone multstep
instruction, it would need four operands:

(1) an accumulator,
(2) the multiplicand,
(3) one bit from the multiplier, to determine whether or not the
    multiplicand is added to the accumulator, and
(4) a loop counter, from which to compute the multiplicand left-shift
    and which bit to take from the multiplier.

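A rough Python model of the four-operand multstep described above
might look like this. It is only a sketch under assumptions that
aren't settled anywhere in this thread: a 32-bit datapath, a result
truncated to 32 bits, and stepping from the least significant bit
upward. The names multstep, multiply, and MASK32 are made up for
illustration.

```python
MASK32 = 0xFFFFFFFF  # assumption: 32-bit registers, result truncated


def multstep(acc, multiplicand, bit, count):
    """One hypothetical multiply step.

    acc          -- accumulator (source and target)
    multiplicand -- value to be conditionally added
    bit          -- the single multiplier bit selected for this step
    count        -- loop counter, used here as the left-shift amount
    """
    if bit:
        acc = (acc + (multiplicand << count)) & MASK32
    return acc


def multiply(multiplicand, multiplier, steps=32):
    """Software loop standing in for a run of multstep instructions."""
    acc = 0
    for count in range(steps):
        # The counter also picks which multiplier bit to use.
        bit = (multiplier >> count) & 1
        acc = multstep(acc, multiplicand, bit, count)
    return acc
```

Calling multiply(1000, 3, steps=2) models the partial-multiply case:
when the multiplier is known to fit in 2 bits, only 2 steps are
needed instead of 32.
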
Now, I don't like the idea of adding extra state. What if we want to
add the ability to handle interrupts? But we can tinker with the
idea: Have one special instruction whose job is to load the counter
and the multiplier. The step instruction would have the accumulator
(as both a source and the target) and the multiplicand. Each step
would step the counter, shift the multiplier, and add (or not,
depending on the bit from the multiplier) the shifted multiplicand to
the accumulator.

That puts a shifter in line with an adder, though, so maybe we want
to load the multiplier and multiplicand in the first instruction (so
they're each shifted by 1 every cycle) and then specify the counter
and accumulator in the step? We'll have to work out the permutations.

BTW, the only reason to do this is that without it, a multiply would
take at least 4 times longer due to the overhead of explicit shifts
and branches. Do we care?

-- 
Timothy Normand Miller
http://www.cse.ohio-state.edu/~millerti
Open Graphics Project
