On 2007-08-16, Timothy Normand Miller wrote: > On 8/16/07, Petter Urkedal <[EMAIL PROTECTED]> wrote: > > There was a couple of bugs in the code I submitted, so here is a fixed > > version which is also extended to 32x16->48 multiply. It runs in 16 > > cycles to produce a partial result which is fixed up with a 32 bit add. > > Well, it LOOKS nice, but I don't understand it. That's purely my own > ignorance. Would you mind breaking it down and explaining it in > detail? > > This reminds me a bit of this really fast counter I designed. Does > the meaning of the product bits spread out over time? Or is > everything meaningful every cycle?
Of course, I shouldn't leave you and others to decipher the code. I kind of assumed there were not many simple solutions to the problem of implementing a serial adder, so this was a textbook case, but after a bit of searching on the net, there seems to be some options. But, the others I've seen don't have that good locality. So, to the explanation... The second "Serial by Parallel Booth Multipliers" in http://www.andraka.com/multipli.htm is basically the same wiring as my code, the difference being that we read out the result early and post-process it with the ALU. Consider x and y to be vectors of bits, and let |x| and |y| be their sizes. Then draw a grid of |x| rows and |y| columns. Label them starting with (x0, y0) on the upper left corner. Now, note that each point in the grid corresponds to a 1-bit product. Points which fall on the same diagonal line in direction upwards-to-the-right have the same exponent. Now, the multiplier processes the grid column by column from left to right. The s vector in the code is the current sum. The c is the carry bits. Note that as the multiplier switch to the next column to the right, the exponent corresponding to a given cell of the hardware increases by one. Therefore each cell is wired as {c[i], s[i - 1]} <= (x_r[i] & y_r[0]) + c[i] + s[i]; where we see that a given s-bit transferred to the next lower cell, thus being propagated "along" the aforementioned diagonal, whereas the carry bit stays on the current cell, which on the next cycle has 1 higher exponent. The lowest s-bit is shifted into the v vector which forms the lower 16 bits. When the multiplier has finished the last column, we could carry on until all bits in s and c are zero, shifting the full result into v, but it's it's straight forward to see that the only work left is to add the carry bits to s which yields the upper 32 bits of the result. _______________________________________________ Open-graphics mailing list [email protected] http://lists.duskglow.com/mailman/listinfo/open-graphics List service provided by Duskglow Consulting, LLC (www.duskglow.com)
