On Sat, August 18, 2007 9:44 pm, Farhan Mohamed Ali said:
> On Sat, August 18, 2007 8:03 pm, Timothy Normand Miller said:
>> On 8/18/07, Farhan Mohamed Ali <[EMAIL PROTECTED]> wrote:
>>> Attached is the radix-4 multiplier. Since it was easy to make it
>>> signed, i just went with that. Adding support to select
>>> signed/unsigned is also easy. Can someone with the Lattice tools try
>>> synthesizing this? I don't have it installed on my laptop as i'm
>>> running out of space. On xilinx i get just under 7.2ns, which is the
>>> delay through the 33 bit adder/subtracter. Takes 17 cycles to
>>> complete a 32x32 multiply.
>>
>> This is cool stuff. For one thing, I think I need to read up on some
>> of the Verilog 2001 syntax. I learned Verilog in 1999, so I'm a bit
>> behind the times and could benefit from some things that would at least
>> save some typing.
>>
>> Anyhow, what I think would be fun is to try out a variety of designs
>> and compare them. Different approaches will take different numbers of
>> clock cycles, require different amounts of logic area, and have
>> different maximum clock rates. A wide exploration of this space could
>> be of academic interest. Perhaps a journal would be interested in a
>> submission on this. Or perhaps we're repeating work already done, but
>> it still might be nice to offer some reference implementations under
>> GPL and/or LGPL with known characteristics for certain FPGAs.
>>
>> On the other hand, we should avoid getting TOO distracted. The
>> nanocontroller is something someone's bound to want to incorporate
>> into another design, and of course, it's in our main path for OGA1.
>>
> I could try a radix-8 version, which will further cut down the number of
> clock cycles to 9 or 10. It should not take much more hardware. The
> adder will have to be a bit longer though (about 35 bits i think), and
> the lookup table will be larger, so that will increase the critical path
> delay but hopefully only slightly.
>

I thought about it some more, and I think I made a mistake in my earlier
estimate. Radix-8 shifts 3 bits at a time, so a 32-bit multiply would
take at least 11 cycles (32/3, rounded up). Add one or two more cycles
for set-up, which is more complicated since the partial product 3a has
to be generated. It also needs an extra 34-bit register to store 3a.
IMO it is not worth increasing the hardware and the critical path just
to save about 4 cycles over radix-4.
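
For anyone who wants to check the cycle counts, here is a rough
behavioral sketch in Python. It is not the attached Verilog and does not
describe how the actual design is built. It assumes standard Booth
recoding with one digit retired per clock, and the single set-up cycle
for radix-4 is only a guess that happens to match the 17 cycles quoted
above. The names booth_digits, booth_multiply and digit_set are made up
for illustration.

```python
import math


def to_signed(value, bits):
    """Interpret the low `bits` bits of `value` as a two's-complement number."""
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def booth_digits(b, width, bits_per_step):
    """Recode a signed `width`-bit multiplier into Booth digits, LSB first.

    bits_per_step=2 is radix-4, bits_per_step=3 is radix-8.
    """
    steps = math.ceil(width / bits_per_step)
    total_bits = steps * bits_per_step
    # Two's-complement bits of b, sign-extended to a whole number of steps,
    # with the implicit 0 appended below the LSB.
    u = (b & ((1 << total_bits) - 1)) << 1
    digits = []
    for i in range(steps):
        # Each group overlaps the previous one by a single bit.
        group = (u >> (i * bits_per_step)) & ((1 << (bits_per_step + 1)) - 1)
        digits.append(to_signed(group >> 1, bits_per_step) + (group & 1))
    return digits


def booth_multiply(a, b, width=32, bits_per_step=2):
    """Return (a * b, number of add/subtract iterations)."""
    digits = booth_digits(b, width, bits_per_step)
    product = 0
    for i, d in enumerate(digits):
        # One shifted multiple of `a` is added or subtracted per iteration.
        product += (d * a) << (i * bits_per_step)
    return product, len(digits)


def digit_set(bits_per_step):
    """All digit values the recoding can produce for a given step size."""
    return sorted({to_signed(g >> 1, bits_per_step) + (g & 1)
                   for g in range(1 << (bits_per_step + 1))})


# Assumed set-up overhead: 1 cycle for radix-4, 1 or 2 for radix-8.
radix4_cycles = booth_multiply(-7, 13, 32, 2)[1] + 1
radix8_cycles_min = booth_multiply(-7, 13, 32, 3)[1] + 1
radix8_cycles_max = booth_multiply(-7, 13, 32, 3)[1] + 2
```

Under those assumptions radix-4 comes out at 16 iterations plus set-up,
and radix-8 at 11 iterations plus set-up. digit_set(2) only contains
0, +/-1 and +/-2, which are plain shifts and negations of a, while
digit_set(3) also contains +/-3, which is why 3a = a + 2a has to be
computed and stored up front.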
_______________________________________________
Open-graphics mailing list
[email protected]
http://lists.duskglow.com/mailman/listinfo/open-graphics
List service provided by Duskglow Consulting, LLC (www.duskglow.com)
