Re: Best way to carry on 2-input architecture?

2014-08-20 Thread Niels Möller
"Wesley W. Terpstra" writes:
>> But maybe you can do a circuit for mulhi which is significantly simpler
>> than for a widening multiplication, because you only need to collect the
>> carry out from the low partial products? Something like a Wallace tree
>> where you drop the low output bit from a

Re: Best way to carry on 2-input architecture?

2014-08-20 Thread Niels Möller
"Wesley W. Terpstra" writes:
> I'm going to retract my out-of-hand rejection of this idea.

Depends on the size, I guess. For 64x64, my example scheme would reduce
the number of AND gates from 4096 to about half. Maybe that's not worth
the effort, since there's also going to be quite a lot of add

Re: Best way to carry on 2-input architecture?

2014-08-20 Thread Wesley W. Terpstra
On Wed, Aug 20, 2014 at 10:59 AM, Niels Möller wrote:
> One could build it out of smaller blocks, say we have combinatorial 8x8
> multipliers. We then need 64 of those, in sequence or parallel, and a
> bunch of adders. Or we could use Karatsuba, to do a 64x64 using 27 8x8
> multiplies and a bunch
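The block counts quoted above (64 schoolbook versus 27 Karatsuba 8x8 multiplies for a 64x64 product) can be checked with a small software model. This is only a sketch and not a circuit description: the function name `karatsuba` and the `counter` argument are made up for illustration, and it assumes the subtractive Karatsuba variant so that every base multiply really has 8-bit operands.

```python
def karatsuba(a, b, bits, counter):
    """Multiply two unsigned `bits`-bit integers, counting 8x8 base multiplies.

    `counter` is a one-element list used as a mutable tally (illustrative only).
    """
    if bits <= 8:
        counter[0] += 1          # one 8x8 block multiplier used
        return a * b
    half = bits // 2
    mask = (1 << half) - 1
    a0, a1 = a & mask, a >> half
    b0, b1 = b & mask, b >> half
    lo = karatsuba(a0, b0, half, counter)
    hi = karatsuba(a1, b1, half, counter)
    # Subtractive variant: |a1 - a0| and |b1 - b0| still fit in `half` bits,
    # so the third multiply does not need a wider block.
    da, db = a1 - a0, b1 - b0
    sign = -1 if (da < 0) != (db < 0) else 1
    mid = sign * karatsuba(abs(da), abs(db), half, counter)
    # a1*b0 + a0*b1 = a1*b1 + a0*b0 - (a1 - a0)*(b1 - b0)
    cross = hi + lo - mid
    return (hi << (2 * half)) + (cross << half) + lo


counter = [0]
a, b = 0xFFFFFFFFFFFFFFFF, 0x0123456789ABCDEF
product = karatsuba(a, b, 64, counter)
schoolbook_blocks = (64 // 8) ** 2   # 8 x 8 grid of 8x8 partial products
```

Three levels of splitting (64 to 32 to 16 to 8) with three multiplies each give 3^3 = 27, against 8 * 8 = 64 blocks for the schoolbook arrangement. The model says nothing about the cost of the extra adders and subtractors.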

Re: Best way to carry on 2-input architecture?

2014-08-20 Thread Wesley W. Terpstra
On Wed, Aug 20, 2014 at 10:59 AM, Niels Möller wrote:
> I've been thinking that if you want the full two-word result of a
> multiply, and compute each part separately using mulhi and mullo, the
> mulhi operation will have to compute the full product anyway, discarding
> the low half, and then the
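To make the quoted point concrete, here is a minimal sketch of a 64-bit mulhi built from 32-bit halves. The name `mulhi64` and the choice of 64-bit words are assumptions for illustration. All four partial products are needed, but the low ones only reach the result through the carry into the high word.

```python
MASK32 = (1 << 32) - 1


def mulhi64(a, b):
    """High 64 bits of the 128-bit product of two unsigned 64-bit integers."""
    a0, a1 = a & MASK32, a >> 32
    b0, b1 = b & MASK32, b >> 32
    p00 = a0 * b0
    p01 = a0 * b1
    p10 = a1 * b0
    p11 = a1 * b1
    # The low partial products contribute only a carry: sum everything that
    # lands in bits 32..63 and keep what spills over into bit 64 and above.
    mid = (p00 >> 32) + (p01 & MASK32) + (p10 & MASK32)
    return p11 + (p01 >> 32) + (p10 >> 32) + (mid >> 32)


def mullo64(a, b):
    """Low 64 bits of the same product."""
    return (a * b) & ((1 << 64) - 1)
```

Calling `mulhi64` and `mullo64` on the same operands recomputes `p00`, `p01` and `p10` twice, which is the duplicated work the quoted text is concerned about.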

Re: Best way to carry on 2-input architecture?

2014-08-20 Thread Niels Möller
"Wesley W. Terpstra" writes:
> On Sun, Aug 17, 2014 at 5:05 PM, Torbjörn Granlund wrote:
>> For multiply, d = (a * b + c) mod B (B being the word base) and d = [(a
>> * b + c) / B] are very useful.
>
> I see the benefit to the mulhadd variant (=> no hardware division).
> I'm not convinced by the
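The two multiply-add forms quoted above can be modelled as follows. This is a sketch under assumptions: B = 2^64 is picked arbitrarily, `mulhadd` is the name used in the thread, and `mulladd` and `mul_1` are hypothetical names introduced here. Since a * b + c <= (B-1)^2 + (B-1) = (B-1) * B < B^2, the result always fits in two words.

```python
W = 64            # assumed word size in bits
B = 1 << W        # word base


def mulladd(a, b, c):
    """Low word: (a * b + c) mod B."""
    return (a * b + c) % B


def mulhadd(a, b, c):
    """High word: floor((a * b + c) / B)."""
    return (a * b + c) // B


def mul_1(limbs, b):
    """Multiply a little-endian list of words by a single word b.

    Returns (result_limbs, carry_out). The carry from each step is fed in
    as the addend c of the next step, so no separate add is needed.
    """
    carry = 0
    out = []
    for a in limbs:
        out.append(mulladd(a, b, carry))
        carry = mulhadd(a, b, carry)
    return out, carry


def to_int(limbs):
    """Helper for checking: little-endian words to a Python integer."""
    return sum(x << (W * i) for i, x in enumerate(limbs))
```

For example, `mul_1([B - 1, B - 1], B - 1)` returns two result words plus a carry word that is itself below B, so the carry chain never needs more than one word.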