If the clock ratio isn't 2x, 1/2x, or some other power of two, it's basically impossible to do in the middle of a pipeline. Also, keep in mind that we reuse this multiplier for integer left shift and integer multiply, so we'll keep it relatively busy, and we can also clock-gate it.
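To make the reuse point concrete, here is a minimal sketch of how a left shift can ride on the same multiplier: shifting left by n is the same as multiplying by 2**n, so the shift path only needs a one-hot second operand. The module name, the 32-bit operand width, the unsigned behavioral multiply, and the `enable` and `is_shift` ports are all assumptions made for illustration; they are not taken from the posted 35x35 design. The clock enable here only stands in for real clock gating.

```verilog
// Hypothetical sketch: names, widths, and the single-cycle behavioral
// multiply are assumptions for illustration, not the posted design.
module shift_or_multiply_sketch(
    input             clock,
    input             enable,     // clock enable: P holds its value when low
    input             is_shift,   // 1 = A << shamt, 0 = A * B
    input      [31:0] A,
    input      [31:0] B,
    input      [4:0]  shamt,
    output reg [63:0] P);

    // A left shift by n equals a multiply by 2**n, so the shift case
    // feeds a one-hot value into the shared multiplier.
    wire [31:0] one_hot    = 32'b1 << shamt;
    wire [31:0] multiplier = is_shift ? one_hot : B;

    always @(posedge clock) begin
        if (enable) begin
            // Operands are extended to the 64-bit width of P, so this is
            // the full unsigned 32x32 product.
            P <= A * multiplier;
        end
    end

endmodule
```

With `is_shift` high, `P[31:0]` carries the shifted value and `P[63:32]` carries the bits shifted out. In the real pipeline the same operand mux would sit in front of the four-stage multiplier, with its latency unchanged.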
On Mon, Jan 14, 2013 at 5:04 PM, Troy Benjegerdes <[email protected]> wrote:

> When I start thinking about bits per joule (or multiplies per joule), I
> start wondering if we can run the multiplier(s) on a separate clock from
> everything else, and be able to scale the speed up and down depending on
> some software algorithms that know if this particular multiply is in the
> critical path for some other computation, or if it's just a bulk-parallel
> multiply where total energy matters more than time-to-answer?
>
> On Mon, Jan 14, 2013 at 10:46:03AM -0500, Timothy Normand Miller wrote:
> > Where I have used these, the worst part is the wire delay from logic to
> > the multiplier block and back again. I have often had to add extra
> > registers in inputs and outputs just to get rid of those delay
> > bottlenecks.
> >
> > On Sun, Jan 13, 2013 at 7:17 PM, André Pouliot <[email protected]> wrote:
> >
> > > The multiplier blocks in FPGAs are rather fast, so running them at
> > > twice or 4 times the clock speed could be possible. In an ASIC they
> > > would actually slow down the design because of the logic depth.
> > >
> > > On 2013-01-13 18:52, Timothy Normand Miller wrote:
> > >
> > >> The multipliers are probably going to be the biggest performance
> > >> bottleneck in the design. Depending on what blocks are available we
> > >> might be able to pipeline it more deeply in order to get higher
> > >> frequency. As it is, it's fully pipelined at whatever frequency an
> > >> 18x18 multiplier will allow.
> > >>
> > >> On Sun, Jan 13, 2013 at 5:31 PM, "Ing. Daniel Rozsnyó"
> > >> <[email protected]> wrote:
> > >>
> > >>     I know that this is a generic multiplier, but in practice, would
> > >>     that map 1:1 to logic gates, or would it be possible to multiply
> > >>     the i/o frequency locally by 4 times (e.g. 1GHz -> 4GHz) to
> > >>     achieve a one clock delay multiply?
> > >>
> > >>     Daniel
> > >>
> > >>     On 01/13/2013 09:46 PM, Timothy Normand Miller wrote:
> > >>
> > >>         // TODO: Actually use clock enables
> > >>
> > >>         module four_stage_signed_35x35_multiply(
> > >>             input clock,
> > >>             input [34:0] A,
> > >>             input [34:0] B,
> > >>             output reg [69:0] P);
> > >>
> > >>         // Pipeline stage 0: Perform all multiplies
> > >>         wire [35:0] p0a, p2a, p3a;
> > >>         wire [33:0] p1a;
> > >>         MULT18X18S mul0 (.C(clock), .CE(1'b1), .R(1'b0), .P(p0a),
> > >>             .A(A[34:17]), .B(B[34:17]));
> > >>         MULT18X18S mul1 (.C(clock), .CE(1'b1), .R(1'b0), .P(p1a),
> > >>             .A({1'b0, A[16:0]}), .B({1'b0, B[16:0]}));
> > >>         MULT18X18S mul2 (.C(clock), .CE(1'b1), .R(1'b0), .P(p2a),
> > >>             .A(A[34:17]), .B({1'b0, B[16:0]}));
> > >>         MULT18X18S mul3 (.C(clock), .CE(1'b1), .R(1'b0), .P(p3a),
> > >>             .A({1'b0, A[16:0]}), .B(B[34:17]));
> > >>
> > >>         // Pipeline stage 1: Sum middle terms
> > >>         reg [35:0] p0b, p2b;
> > >>         reg [33:0] p1b;
> > >>         always @(posedge clock) begin
> > >>             p0b <= p0a;
> > >>             p1b <= p1a;
> > >>             p2b <= p2a + p3a;
> > >>         end
> > >>
> > >>         // Pipeline stage 2: Lower half of final sum
> > >>         wire [34:0] wlower_a, wlower_b, wupper_a, wupper_b;
> > >>         assign {wupper_a, wlower_a} = {p0b, p1b};
> > >>         assign {wupper_b, wlower_b} = {{17{p2b[35]}}, p2b, {17{1'b0}}};
> > >>         reg [34:0] upper_a, upper_b;
> > >>         reg [35:0] lower_sum;
> > >>         always @(posedge clock) begin
> > >>             lower_sum <= wlower_a + wlower_b;
> > >>             upper_a <= wupper_a;
> > >>             upper_b <= wupper_b;
> > >>         end
> > >>
> > >>         // Pipeline stage 3: Upper half of final sum, with carry in
> > >>         wire [35:0] upper_sum = {upper_a, 1'b1} + {upper_b, lower_sum[35]};
> > >>         always @(posedge clock) begin
> > >>             P[34:0] <= lower_sum[34:0];
> > >>             P[69:35] <= upper_sum[35:1];
> > >>         end
> > >>
> > >>         endmodule
> > >>
> > >>
> > >>         // synthesis translate_off
> > >>         module MULT18X18S(
> > >>             input C,
> > >>             input CE,
> > >>             input R,
> > >>             output reg [35:0] P,
> > >>             input [17:0] A,
> > >>             input [17:0] B);
> > >>
> > >>         wire signed [17:0] a, b;
> > >>         assign a = A;
> > >>         assign b = B;
> > >>
> > >>         wire signed [35:0] p;
> > >>         assign p = a * b;
> > >>
> > >>         always @(posedge C) begin
> > >>             if (R) begin
> > >>                 P <= 0;
> > >>             end else
> > >>             if (CE) begin
> > >>                 P <= p;
> > >>             end
> > >>         end
> > >>
> > >>         endmodule
> > >>         // synthesis translate_on
>
> --
> --------------------------------------------------------------------------
> Troy Benjegerdes 'da hozer' [email protected]
>
> Someone asked me why I work on this free (http://www.fsf.org/philosophy/)
> software & hardware (http://q3u.be) stuff and not get a real job.
> Charles Shultz had the best answer:
>
> "Why do musicians compose symphonies and poets write poems? They do it
> because life wouldn't have any meaning for them if they didn't. That's why
> I draw cartoons. It's my life." -- Charles Shultz

--
Timothy Normand Miller, PhD
Assistant Professor of Computer Science, Binghamton University
http://www.cs.binghamton.edu/~millerti/
Open Graphics Project
_______________________________________________
Open-graphics mailing list
[email protected]
http://lists.duskglow.com/mailman/listinfo/open-graphics
List service provided by Duskglow Consulting, LLC (www.duskglow.com)
