Consider the use cases and the prevalence of multiplies.

On Tue, Jan 22, 2013 at 1:49 AM, Troy Benjegerdes <[email protected]> wrote:

> Well, why not clock rates of 0.5x, 1x, and 2x? (Where the nominal design for
> the thermal envelope of the package is 1x, and then whatever is in the
> critical path can go to 2x if a nearby multiplier that is not can go to 0.5x.)
>
> I'm thinking this may be serious overengineering unless we have distributed
> power conversion and voltage regulators to go along with this scheme (the
> slower-clocked areas get lower voltage).
>
> All that being said, power-of-2 clock multipliers might be helpful for
> ASIC vs. FPGA design flexibility.
>
> On Mon, Jan 14, 2013 at 07:38:26PM -0500, Timothy Normand Miller wrote:
> > If it's not something like 2x or 1/2x or another power of two, it's
> > basically impossible to do in the middle of a pipeline. Also, keep in
> > mind that we reuse this for integer left shift and integer multiply, so
> > we'll keep it relatively busy, and we can also clock-gate.
> >
> > On Mon, Jan 14, 2013 at 5:04 PM, Troy Benjegerdes <[email protected]> wrote:
> >
> > > When I start thinking about bits per joule (or multiplies per joule), I
> > > start wondering if we can run the multiplier(s) on a separate clock from
> > > everything else, and be able to scale the speed up and down depending on
> > > some software algorithms that know if this particular multiply is in the
> > > critical path for some other computation, or if it's just a bulk-parallel
> > > multiply where total energy matters more than time-to-answer?
> > >
> > > On Mon, Jan 14, 2013 at 10:46:03AM -0500, Timothy Normand Miller wrote:
> > > > Where I have used these, the worst part is the wire delay from logic
> > > > to the multiplier block and back again. I have often had to add extra
> > > > registers on inputs and outputs just to get rid of those delay
> > > > bottlenecks.
> > > >
> > > > On Sun, Jan 13, 2013 at 7:17 PM, André Pouliot <[email protected]> wrote:
> > > >
> > > > > The multiplier blocks in FPGAs are rather fast, so running them at
> > > > > twice or 4 times the clock speed could be possible. In an ASIC they
> > > > > would actually slow down the design because of the logic depth.
> > > > >
> > > > > On 2013-01-13 18:52, Timothy Normand Miller wrote:
> > > > >
> > > > >> The multipliers are probably going to be the biggest performance
> > > > >> bottleneck in the design. Depending on what blocks are available, we
> > > > >> might be able to pipeline it more deeply in order to get higher
> > > > >> frequency. As it is, it's fully pipelined at whatever frequency an
> > > > >> 18x18 multiplier will allow.
> > > > >>
> > > > >> On Sun, Jan 13, 2013 at 5:31 PM, "Ing. Daniel Rozsnyó" <[email protected]> wrote:
> > > > >>
> > > > >>     I know that this is a generic multiplier, but in practice, would
> > > > >>     that map 1:1 to logic gates, or would it be possible to multiply
> > > > >>     the I/O frequency locally by 4 times (e.g. 1GHz -> 4GHz) to
> > > > >>     achieve a one-clock-delay multiply?
> > > > >>
> > > > >>     Daniel
> > > > >>
> > > > >>     On 01/13/2013 09:46 PM, Timothy Normand Miller wrote:
> > > > >>
> > > > >>         // TODO: Actually use clock enables
> > > > >>
> > > > >>         module four_stage_signed_35x35_multiply(
> > > > >>             input clock,
> > > > >>             input [34:0] A,
> > > > >>             input [34:0] B,
> > > > >>             output reg [69:0] P);
> > > > >>
> > > > >>         // Pipeline stage 0: Perform all multiplies
> > > > >>         // Upper halves are signed 18-bit; lower halves are 17 bits,
> > > > >>         // zero-extended so the signed multiplier treats them as unsigned.
> > > > >>         wire [35:0] p0a, p2a, p3a;
> > > > >>         wire [33:0] p1a;
> > > > >>         MULT18X18S mul0 (.C(clock), .CE(1'b1), .R(1'b0), .P(p0a),
> > > > >>             .A(A[34:17]), .B(B[34:17]));
> > > > >>         MULT18X18S mul1 (.C(clock), .CE(1'b1), .R(1'b0), .P(p1a),
> > > > >>             .A({1'b0, A[16:0]}), .B({1'b0, B[16:0]}));
> > > > >>         MULT18X18S mul2 (.C(clock), .CE(1'b1), .R(1'b0), .P(p2a),
> > > > >>             .A(A[34:17]), .B({1'b0, B[16:0]}));
> > > > >>         MULT18X18S mul3 (.C(clock), .CE(1'b1), .R(1'b0), .P(p3a),
> > > > >>             .A({1'b0, A[16:0]}), .B(B[34:17]));
> > > > >>
> > > > >>         // Pipeline stage 1: Sum middle terms
> > > > >>         reg [35:0] p0b, p2b;
> > > > >>         reg [33:0] p1b;
> > > > >>         always @(posedge clock) begin
> > > > >>             p0b <= p0a;
> > > > >>             p1b <= p1a;
> > > > >>             p2b <= p2a + p3a;
> > > > >>         end
> > > > >>
> > > > >>         // Pipeline stage 2: Lower half of final sum
> > > > >>         // Both 70-bit addends are split into 35-bit halves; the middle
> > > > >>         // term is sign-extended and shifted left by 17.
> > > > >>         wire [34:0] wlower_a, wlower_b, wupper_a, wupper_b;
> > > > >>         assign {wupper_a, wlower_a} = {p0b, p1b};
> > > > >>         assign {wupper_b, wlower_b} = {{17{p2b[35]}}, p2b, {17{1'b0}}};
> > > > >>         reg [34:0] upper_a, upper_b;
> > > > >>         reg [35:0] lower_sum;
> > > > >>         always @(posedge clock) begin
> > > > >>             lower_sum <= wlower_a + wlower_b;
> > > > >>             upper_a <= wupper_a;
> > > > >>             upper_b <= wupper_b;
> > > > >>         end
> > > > >>
> > > > >>         // Pipeline stage 3: Upper half of final sum, with carry in
> > > > >>         // The appended LSBs (1 and the lower carry) generate a carry
> > > > >>         // into bit 1 exactly when lower_sum[35] is set.
> > > > >>         wire [35:0] upper_sum = {upper_a, 1'b1} + {upper_b, lower_sum[35]};
> > > > >>         always @(posedge clock) begin
> > > > >>             P[34:0] <= lower_sum[34:0];
> > > > >>             P[69:35] <= upper_sum[35:1];
> > > > >>         end
> > > > >>
> > > > >>         endmodule
> > > > >>
> > > > >>
> > > > >>         // synthesis translate_off
> > > > >>         // Behavioral simulation model of the MULT18X18S primitive
> > > > >>         module MULT18X18S(
> > > > >>             input C,
> > > > >>             input CE,
> > > > >>             input R,
> > > > >>             output reg [35:0] P,
> > > > >>             input [17:0] A,
> > > > >>             input [17:0] B);
> > > > >>
> > > > >>         wire signed [17:0] a, b;
> > > > >>         assign a = A;
> > > > >>         assign b = B;
> > > > >>
> > > > >>         wire signed [35:0] p;
> > > > >>         assign p = a * b;
> > > > >>
> > > > >>         always @(posedge C) begin
> > > > >>             if (R) begin
> > > > >>                 P <= 0;
> > > > >>             end else
> > > > >>             if (CE) begin
> > > > >>                 P <= p;
> > > > >>             end
> > > > >>         end
> > > > >>
> > > > >>         endmodule
> > > > >>         // synthesis translate_on
>
> --
> --------------------------------------------------------------------------
> Troy Benjegerdes          'da hozer'          [email protected]
>
> Someone asked me why I work on this free (http://www.fsf.org/philosophy/)
> software & hardware (http://q3u.be) stuff and not get a real job.
> Charles Shultz had the best answer:
>
> "Why do musicians compose symphonies and poets write poems? They do it
> because life wouldn't have any meaning for them if they didn't. That's why
> I draw cartoons. It's my life." -- Charles Shultz

-- 
Timothy Normand Miller, PhD
Assistant Professor of Computer Science, Binghamton University
http://www.cs.binghamton.edu/~millerti/
Open Graphics Project
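
The arithmetic in the quoted four_stage_signed_35x35_multiply module can be sanity-checked without a simulator. The following is only a sketch in Python, not part of the original design: the function names (to_signed, split_operand, multiply_35x35) are made up for illustration, and the model ignores pipeline timing entirely. It mirrors the same steps: split each 35-bit operand into a signed upper 18 bits and an unsigned lower 17 bits, form the four partial products, sum the middle terms, and then add the two 70-bit words as two 35-bit halves with a carry between them.

```python
MASK17 = (1 << 17) - 1
MASK35 = (1 << 35) - 1
MASK36 = (1 << 36) - 1
MASK70 = (1 << 70) - 1


def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def split_operand(x):
    """Split a signed 35-bit value into (signed upper 18 bits, unsigned lower 17 bits)."""
    x35 = x & MASK35
    return to_signed(x35 >> 17, 18), x35 & MASK17


def multiply_35x35(a, b):
    """Bit-level model of the four-multiplier decomposition (no pipeline timing)."""
    a_hi, a_lo = split_operand(a)
    b_hi, b_lo = split_operand(b)

    # Stage 0: the four 18x18 signed products (low halves are zero-extended).
    p0 = a_hi * b_hi          # weight 2^34
    p1 = a_lo * b_lo          # weight 2^0, always fits in 34 bits
    p2 = a_hi * b_lo          # weight 2^17
    p3 = a_lo * b_hi          # weight 2^17

    # Stage 1: sum the two middle terms (fits in 36 signed bits).
    mid = p2 + p3

    # Stage 2: build the two 70-bit addends and add the lower 35 bits.
    word_a = ((p0 & MASK36) << 34) | p1      # {p0b, p1b}
    word_b = (mid << 17) & MASK70            # sign-extended middle term << 17
    lower_sum = (word_a & MASK35) + (word_b & MASK35)   # 36 bits with carry

    # Stage 3: add the upper 35 bits plus the carry out of the lower half.
    carry = lower_sum >> 35
    upper_sum = ((word_a >> 35) + (word_b >> 35) + carry) & MASK35

    return to_signed((upper_sum << 35) | (lower_sum & MASK35), 70)
```

For any a and b in the signed 35-bit range, multiply_35x35(a, b) should equal a * b; for example, multiply_35x35(-3, 5) returns -15. Comparing the model against Python's own multiplication at the corner values (most negative, most positive, -1, 0, 1) is a cheap way to convince yourself that zero-extending the low halves and sign-extending the middle sum is enough to make the signed case work.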
_______________________________________________ Open-graphics mailing list [email protected] http://lists.duskglow.com/mailman/listinfo/open-graphics List service provided by Duskglow Consulting, LLC (www.duskglow.com)
