Well, why not clock rates of 0.5x, 1x, and 2x? (The nominal design for the thermal envelope of the package is 1x; whatever is in the critical path can go to 2x if a nearby multiplier that is not in the critical path can drop to 0.5x.)
I'm thinking this may be serious overengineering unless we have distributed power conversion and voltage regulators to go along with this scheme (the slower-clocked areas get lower voltage).

All that being said, power-of-2 clock multipliers might be helpful for ASIC vs. FPGA design flexibility.

On Mon, Jan 14, 2013 at 07:38:26PM -0500, Timothy Normand Miller wrote:
> If it's not something like 2x or 1/2x or other power of two, it's basically
> impossible to do in the middle of a pipeline.  Also, keep in mind that we
> reuse this for integer left shift and integer multiply, so we'll keep it
> relatively busy, and we can also clock-gate.
>
> On Mon, Jan 14, 2013 at 5:04 PM, Troy Benjegerdes <[email protected]> wrote:
>
> > When I start thinking about bits per joule (or multiplies per joule), I
> > start wondering if we can run the multiplier(s) on a separate clock from
> > everything else, and be able to scale the speed up and down depending on
> > some software algorithms that know if this particular multiply is in the
> > critical path for some other computation, or if it's just a bulk-parallel
> > multiply where total energy matters more than time-to-answer?
> >
> > On Mon, Jan 14, 2013 at 10:46:03AM -0500, Timothy Normand Miller wrote:
> > > Where I have used these, the worst part is the wire delay from logic to
> > > the multiplier block and back again.  I have often had to add extra
> > > registers in inputs and outputs just to get rid of those delay
> > > bottlenecks.
> > >
> > > On Sun, Jan 13, 2013 at 7:17 PM, André Pouliot <[email protected]> wrote:
> > >
> > > > The multiplier blocks in FPGAs are rather fast, so running them at
> > > > twice or 4 times the clock speed could be possible. In an ASIC they
> > > > would actually slow down the design because of the logic depth.
> > > >
> > > > On 2013-01-13 18:52, Timothy Normand Miller wrote:
> > > >
> > > >> The multipliers are probably going to be the biggest performance
> > > >> bottleneck in the design.  Depending on what blocks are available we
> > > >> might be able to pipeline it more deeply in order to get higher
> > > >> frequency.  As it is, it's fully pipelined at whatever frequency an
> > > >> 18x18 multiplier will allow.
> > > >>
> > > >> On Sun, Jan 13, 2013 at 5:31 PM, "Ing. Daniel Rozsny?"
> > > >> <[email protected]> wrote:
> > > >>
> > > >>     I know that this is a generic multiplier, but in practice, would
> > > >>     that map 1:1 to logic gates, or would it be possible to multiply
> > > >>     the i/o frequency locally by 4 times (e.g. 1GHz -> 4GHz) to
> > > >>     achieve a one clock delay multiply?
> > > >>
> > > >>     Daniel
> > > >>
> > > >>     On 01/13/2013 09:46 PM, Timothy Normand Miller wrote:
> > > >>
> > > >>         // TODO: Actually use clock enables
> > > >>
> > > >>         module four_stage_signed_35x35_multiply(
> > > >>             input clock,
> > > >>             input [34:0] A,
> > > >>             input [34:0] B,
> > > >>             output reg [69:0] P);
> > > >>
> > > >>         // Pipeline stage 0: Perform all multiplies
> > > >>         wire [35:0] p0a, p2a, p3a;
> > > >>         wire [33:0] p1a;
> > > >>         MULT18X18S mul0 (.C(clock), .CE(1'b1), .R(1'b0), .P(p0a),
> > > >>             .A(A[34:17]), .B(B[34:17]));
> > > >>         MULT18X18S mul1 (.C(clock), .CE(1'b1), .R(1'b0), .P(p1a),
> > > >>             .A({1'b0, A[16:0]}), .B({1'b0, B[16:0]}));
> > > >>         MULT18X18S mul2 (.C(clock), .CE(1'b1), .R(1'b0), .P(p2a),
> > > >>             .A(A[34:17]), .B({1'b0, B[16:0]}));
> > > >>         MULT18X18S mul3 (.C(clock), .CE(1'b1), .R(1'b0), .P(p3a),
> > > >>             .A({1'b0, A[16:0]}), .B(B[34:17]));
> > > >>
> > > >>         // Pipeline stage 1: Sum middle terms
> > > >>         reg [35:0] p0b, p2b;
> > > >>         reg [33:0] p1b;
> > > >>         always @(posedge clock) begin
> > > >>             p0b <= p0a;
> > > >>             p1b <= p1a;
> > > >>             p2b <= p2a + p3a;
> > > >>         end
> > > >>
> > > >>         // Pipeline stage 2: Lower half of final sum
> > > >>         wire [34:0] wlower_a, wlower_b, wupper_a, wupper_b;
> > > >>         assign {wupper_a, wlower_a} = {p0b, p1b};
> > > >>         assign {wupper_b, wlower_b} = {{17{p2b[35]}}, p2b, {17{1'b0}}};
> > > >>         reg [34:0] upper_a, upper_b;
> > > >>         reg [35:0] lower_sum;
> > > >>         always @(posedge clock) begin
> > > >>             lower_sum <= wlower_a + wlower_b;
> > > >>             upper_a <= wupper_a;
> > > >>             upper_b <= wupper_b;
> > > >>         end
> > > >>
> > > >>         // Pipeline stage 3: Upper half of final sum, with carry in
> > > >>         wire [35:0] upper_sum = {upper_a, 1'b1} + {upper_b, lower_sum[35]};
> > > >>         always @(posedge clock) begin
> > > >>             P[34:0] <= lower_sum[34:0];
> > > >>             P[69:35] <= upper_sum[35:1];
> > > >>         end
> > > >>
> > > >>         endmodule
> > > >>
> > > >>
> > > >>         // synthesis translate_off
> > > >>         module MULT18X18S(
> > > >>             input C,
> > > >>             input CE,
> > > >>             input R,
> > > >>             output reg [35:0] P,
> > > >>             input [17:0] A,
> > > >>             input [17:0] B);
> > > >>
> > > >>         wire signed [17:0] a, b;
> > > >>         assign a = A;
> > > >>         assign b = B;
> > > >>
> > > >>         wire signed [35:0] p;
> > > >>         assign p = a * b;
> > > >>
> > > >>         always @(posedge C) begin
> > > >>             if (R) begin
> > > >>                 P <= 0;
> > > >>             end else
> > > >>             if (CE) begin
> > > >>                 P <= p;
> > > >>             end
> > > >>         end
> > > >>
> > > >>         endmodule
> > > >>         // synthesis translate_on
> > > >>
> > > >>
> > > >>         --
> > > >>         Timothy Normand Miller, PhD
> > > >>         Assistant Professor of Computer Science, Binghamton University
> > > >>         http://www.cs.binghamton.edu/~millerti/
> > > >>         Open Graphics Project

-- 
--------------------------------------------------------------------------
Troy Benjegerdes                'da hozer'                [email protected]

Someone asked me why I work on this free (http://www.fsf.org/philosophy/)
software & hardware (http://q3u.be) stuff and not get a real job.
Charles Schulz had the best answer:

"Why do musicians compose symphonies and poets write poems? They do it
because life wouldn't have any meaning for them if they didn't. That's why
I draw cartoons. It's my life." -- Charles Schulz
_______________________________________________
Open-graphics mailing list
[email protected]
http://lists.duskglow.com/mailman/listinfo/open-graphics
List service provided by Duskglow Consulting, LLC (www.duskglow.com)
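
For anyone who wants to sanity-check the arithmetic in the quoted four_stage_signed_35x35_multiply module without a simulator, here is a minimal Python sketch of the same decomposition: split each 35-bit signed operand into a signed upper 18 bits and an unsigned lower 17 bits, form the four partial products, then add in 35-bit halves with a carry between them. The function names (to_signed, split_multiply_35x35) are made up for this illustration and are not from the thread; the sketch models only the arithmetic, not the pipeline registers or timing.

```python
import random

MASK17 = (1 << 17) - 1
MASK35 = (1 << 35) - 1


def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def split_multiply_35x35(a, b):
    """Multiply two signed 35-bit integers using four 18x18 partial products.

    Hypothetical software model of the quoted Verilog; arithmetic only.
    """
    # a = a_hi * 2**17 + a_lo, with a_hi signed (18 bits), a_lo unsigned (17 bits)
    a_hi, a_lo = a >> 17, a & MASK17
    b_hi, b_lo = b >> 17, b & MASK17

    # Stage 0: the four multiplies (mul0..mul3 in the Verilog)
    p0 = a_hi * b_hi
    p1 = a_lo * b_lo
    p2 = a_hi * b_lo
    p3 = a_lo * b_hi

    # Stage 1: sum the middle terms
    mid = p2 + p3

    # Stage 2: lower 35 bits of {p0, p1} + (mid << 17), keeping the carry
    x = (p0 << 34) + p1
    y = mid << 17
    lower_sum = (x & MASK35) + (y & MASK35)
    carry = lower_sum >> 35

    # Stage 3: upper 35 bits, with carry in from the lower half
    upper = ((x >> 35) + (y >> 35) + carry) & MASK35

    return to_signed((upper << 35) | (lower_sum & MASK35), 70)


if __name__ == "__main__":
    lo, hi = -(1 << 34), (1 << 34) - 1
    for _ in range(10000):
        a, b = random.randint(lo, hi), random.randint(lo, hi)
        assert split_multiply_35x35(a, b) == a * b
    print("all random cases match")
```

Because the lower 17 bits are zero-extended to 18 bits before going into the signed 18x18 blocks, only the upper halves carry the sign, which is why the middle terms can be summed and sign-extended as one value.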
