In response to the lack of multiplier logic on the XP10, I've considered
a completely serial multiplier. We already have some good ideas in this
thread, and the radix-4 approach in particular sounds promising, but in
the spirit of investigating alternatives:

The attached multiplier takes two 16-bit operands, one of which is
processed serially. It can be used in two ways: either we run it for 32
cycles and extract the final result serially, or, better, we run it for
16 cycles and post-process the output with the ALU adder, as shown in
the attached test module.

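The 16-cycle mode works because the multiplier leaves the upper half of
the product as two words, sum bits and carry bits, which one ordinary
addition resolves (the usual carry-save arrangement). Below is a minimal
stand-alone illustration of that identity. The module csa_demo and its
signal names are made up for this example and are not part of the
attachment.

```verilog
// Illustration only; csa_demo and its signals are hypothetical names.
module csa_demo;
    reg  [15:0] a, b, c;

    // One full adder per bit position, with no carry chain between positions.
    wire [15:0] sum   = a ^ b ^ c;
    wire [15:0] carry = (a & b) | (a & c) | (b & c);

    // A single carry-propagating add recovers the ordinary result.
    wire [17:0] resolved = sum + {carry, 1'b0};
    wire [17:0] direct   = a + b + c;

    initial begin
        a = 16'hBEEF; b = 16'h1234; c = 16'hFFFF;
        #1;
        if (resolved === direct)
            $display("ok: %0d", direct);
        else
            $display("MISMATCH: %0d vs %0d", resolved, direct);
        $finish;
    end
endmodule
```

In the multiplier the same split is kept across clock cycles, so no
carry ever has to ripple through more than one bit position per cycle.
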
I think the serial multiplier has some nice properties. All internal
connections are very local. Further, it has only one level of LUTs if
we can eliminate the "start" condition, either by using reset logic if
that's allowed, or by making sure the internal state goes to zero before
issuing a new multiply. This is also my main question to the experts:
given such short and local chains of combinational logic, could it be
possible to run it at twice the clock speed of the CPU? That could give
us an 8-cycle 16x16->32 multiplier quite cheaply in terms of gate count.

If we choose to post-process the result with the ALU adder, we may as
well make the parallel-processed operand 32 bits wide. I haven't looked
too closely at this, but I think it's possible to reuse the same logic
for 8x32, 16x32, and 32x32 by sampling the result after 8, 16, or 32
cycles, respectively, and doing simple bit shifts of the partial result
before we feed it to the adder.

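A rough sketch of how the 32-bit variant might look, under the
assumption that the same structure is simply widened. The names
mul32ser_helper and mul32ser_demo, the port list, and the test values
are hypothetical. After n cycles the low n product bits sit in the top
n bits of v, and the pending sum and carry words have weight 2^n, so
the result is x * y[n-1:0] whatever the upper bits of y are.

```verilog
// Hypothetical 32-bit widening of the serial multiplier; not the attachment.
module mul32ser_helper(clock, start, x, y, v_o, s_o, c_o);
    input clock, start;
    input [31:0] x, y;
    output [31:0] v_o;  // finished low bits, newest at v_o[31]
    output [30:0] s_o;  // pending sum bits
    output [31:0] c_o;  // pending carry bits
    reg [31:0] x_r, c, v;
    reg [30:0] y_r, s;
    integer i;

    always @(posedge clock) begin
        if (start) begin
            x_r <= x;
            y_r <= y[31:1];
            s <= x[31:1] & {31{y[0]}};
            c <= 0;
            v <= {x[0] & y[0], 31'b0};
        end else begin
            {c[0], v[31]} <= (x_r[0] & y_r[0]) + c[0] + s[0];
            v[30:0] <= v[31:1];
            for (i = 1; i < 31; i = i + 1)
                {c[i], s[i - 1]} <= (x_r[i] & y_r[0]) + c[i] + s[i];
            {c[31], s[30]} <= (x_r[31] & y_r[0]) + c[31];
            y_r <= {1'b0, y_r[30:1]};
        end
    end

    assign v_o = v;
    assign s_o = s;
    assign c_o = c;
endmodule

module mul32ser_demo;
    reg clock, start;
    reg [31:0] x, y;
    wire [31:0] v, c;
    wire [30:0] s;
    wire [31:0] hi = {1'b0, s} + c;  // the single ALU add

    mul32ser_helper mul(clock, start, x, y, v, s, c);

    initial clock = 0;
    always #5 clock = !clock;

    task run;  // n clock cycles: one with start high, n - 1 with start low
        input integer n;
        begin
            start = 1;
            @(posedge clock); #1;
            start = 0;
            repeat (n - 1) @(posedge clock);
            #1;
        end
    endtask

    task check;  // sample after n cycles and compare with x * y[n-1:0]
        input integer n;
        reg [63:0] got, want;
        begin
            run(n);
            got  = ({32'b0, hi} << n) | (v >> (32 - n));
            want = x * (y & ((64'd1 << n) - 1));
            if (got === want) $display("%0dx32 ok", n);
            else $display("%0dx32 MISMATCH: %h vs %h", n, got, want);
        end
    endtask

    initial begin
        x = 32'hDEADBEEF;
        y = 32'hCAFEF00D;
        check(8);
        check(16);
        check(32);
        $finish;
    end
endmodule
```

The shifts in check are the "simple bit-shifts": v is moved down by
32 - n and the adder output is moved up by n. For 8x32 the full product
is 40 bits wide, so a 32-bit destination would keep only the low 24
bits of hi.
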
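module mul16x16ser_helper(clock, start, x, y, za_o, zb_o);
    input clock, start;
    input [15:0] x, y;
    output [31:0] za_o;
    output [15:0] zb_o;

    reg [15:0] x_r;  // parallel operand
    reg [14:0] y_r;  // remaining bits of the serial operand, LSB first
    reg [14:0] s;    // sum bits of the pending upper part
    reg [15:0] c;    // carry bits of the pending upper part
    reg [15:0] v;    // finished low-order product bits, shifted in at the top
    integer i;

    always @(posedge clock) begin
        if (start) begin
            // First cycle: load the operands and the partial product x * y[0].
            x_r <= x;
            y_r <= y[15:1];
            s <= x[15:1] & {15{y[0]}};
            c <= 0;
            v <= {x[0] & y[0], 15'b0};
        end else begin
            // Add x * y_r[0] in carry-save form and shift right by one bit.
            // The column-0 sum gets its own assignment so that it is
            // evaluated at two bits; inside a concatenation it would be
            // self-determined at one bit and the carry into c[0] dropped.
            {c[0], v[15]} <= (x_r[0] & y_r[0]) + c[0] + s[0];
            v[14:0] <= v[15:1];
            for (i = 1; i < 15; i = i + 1)
                {c[i], s[i - 1]} <= (x_r[i] & y_r[0]) + c[i] + s[i];
            {c[15], s[14]} <= (x_r[15] & y_r[0]) + c[15];
            y_r <= {1'b0, y_r[14:1]};
        end
    end

    assign za_o = {1'b0, s, v};
    assign zb_o = c;
endmodule
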
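module test();
    reg clock, start;
    reg [15:0] x, y;
    wire [31:0] za;
    wire [15:0] zb;

    always #5 clock <= !clock;

    mul16x16ser_helper mul(clock, start, x, y, za, zb);

    // Post-processing step: one 32-bit add resolves the pending carries.
    wire [31:0] z = za + {zb, 16'b0};

    initial begin
        $monitor("%d za = %d, zb = %d, z = %d", $time, za, zb, z);
        clock <= 1;
        start <= 1;
        x <= 60000;
        y <= 60002;
        // Release start just after a rising edge (edges fall on multiples
        // of 10) so that it does not change in the same time step as the
        // clock.
        #1001;
        start <= 0;
        // 15 more rising edges (1010 .. 1150) complete the 16 cycles.
        // 60000 * 60002 = 3600120000, which should be the last z printed.
        #150;
        $finish;
    end
endmodule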