On 2007-08-16, Timothy Normand Miller wrote:
> On 8/16/07, Petter Urkedal <[EMAIL PROTECTED]> wrote:
> > There was a couple of bugs in the code I submitted, so here is a fixed
> > version which is also extended to 32x16->48 multiply.  It runs in 16
> > cycles to produce a partial result which is fixed up with a 32 bit add.
> 
> Well, it LOOKS nice, but I don't understand it.  That's purely my own
> ignorance.  Would you mind breaking it down and explaining it in
> detail?
> 
> This reminds me a bit of this really fast counter I designed.  Does
> the meaning of the product bits spread out over time?  Or is
> everything meaningful every cycle?

Of course, I shouldn't leave you and others to decipher the code.  I
kind of assumed there were not many simple solutions to the problem of
implementing a serial adder, so this was a textbook case, but after a
bit of searching on the net, there seems to be some options.  But, the
others I've seen don't have that good locality.  So, to the
explanation...

The second "Serial by Parallel Booth Multipliers" in
http://www.andraka.com/multipli.htm is basically the same wiring as my
code, the difference being that we read out the result early and
post-process it with the ALU.

Consider x and y to be vectors of bits, and let |x| and |y| be their
sizes.  Then draw a grid of |x| rows and |y| columns.  Label them
starting with (x0, y0) on the upper left corner.  Now, note that each
point in the grid corresponds to a 1-bit product.  Points which fall on
the same diagonal line in direction upwards-to-the-right have the same
exponent.

Now, the multiplier processes the grid column by column from left to
right.  The s vector in the code is the current sum.  The c is the carry
bits.  Note that as the multiplier switch to the next column to the
right, the exponent corresponding to a given cell of the hardware
increases by one.  Therefore each cell is wired as

            {c[i], s[i - 1]} <= (x_r[i] & y_r[0]) + c[i] + s[i];

where we see that a given s-bit transferred to the next lower cell, thus
being propagated "along" the aforementioned diagonal, whereas the carry
bit stays on the current cell, which on the next cycle has 1 higher
exponent.  The lowest s-bit is shifted into the v vector which forms the
lower 16 bits.

When the multiplier has finished the last column, we could carry on
until all bits in s and c are zero, shifting the full result into v, but
it's it's straight forward to see that the only work left is to add the
carry bits to s which yields the upper 32 bits of the result.
_______________________________________________
Open-graphics mailing list
[email protected]
http://lists.duskglow.com/mailman/listinfo/open-graphics
List service provided by Duskglow Consulting, LLC (www.duskglow.com)

Reply via email to