On 7/25/07, André Pouliot <[EMAIL PROTECTED]> wrote:
> Hello,
>
> Here is the first version of the float25 multiplier. The floats are
> based on the IEEE-754 specification but with a reduced mantissa to fit
> the hardware on the Spartan-3. It still doesn't have a test bench; I
> still need to learn how to write one in Verilog and install a simulator.
> But it does pass synthesis for a Spartan-3; the resources used are 1
> multiplier, 110 flip-flops, and 47 LUTs. The post-synthesis results are
> what I expected for the logic.

As for simulators, I suggest Icarus because it's free.

Now, do you need help with writing a test bench in terms of knowing
how to write behavioral Verilog?  Or do you need suggestions on how to
come up with test numbers to input?

For the former, I suspect we have a few examples checked into our SVN.
Otherwise, I can give you something to get started with.  I think the
test environments for the PCI controller and the memory controller are
in SVN, and you should be able to use those to figure out how to set up
clocks and stuff.

If it's the latter, I would suggest writing a C program to output
Verilog code.  To start with, I'd write a task in Verilog that takes
the inputs and the expected output (coded in hex or whatever).  Have the
task set the inputs to the multiplier, wait the pipeline length, and
then compare the output of the multiplier against the expected value.  So
your test code would look vaguely like this:

initial begin
   // do reset or whatever
   // ...

   test_mult('h42987, 'hab76346, 'h3697863);
   test_mult('hbfe63547, 'h48957348, 'h23476248);
   // ... more generated code...
end

task test_mult;
input [24:0] ina, inb, outc;
begin
    mult_input_a = ina;
    mult_input_b = inb;
    pe; pe; pe; pe;   // wait the pipeline length (4 stages)
    if (mult_output != outc) begin
        $display("... something about a mismatch ...");
    end
end
endtask

task pe;
begin
    @(posedge clock);
end
endtask


(Note that my numbers are bogus.)
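
To make the C generator idea concrete, here is a rough sketch, not a
definitive reference model.  It assumes the float25 layout from the
module quoted below (sign in bit 24, exponent in bits 23:16 with bias
127, mantissa in bits 15:0) and only handles normalized operands.  The
names float25_mul_model and format_test_mult are made up for this
example, and the sized 25'h literals are my choice, not something the
test bench requires.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed float25 layout (from the quoted module):
   bit 24 = sign, bits 23:16 = exponent (bias 127), bits 15:0 = mantissa. */
#define F25_MASK 0x1FFFFFFu

/* Bit-level model of the multiplier for normalized operands only
   (exponent 1..254).  The mantissa is truncated, underflow is flushed
   to zero and overflow saturates to infinity. */
uint32_t float25_mul_model(uint32_t a, uint32_t b)
{
    assert((a & ~F25_MASK) == 0 && (b & ~F25_MASK) == 0);

    uint32_t sign = ((a ^ b) >> 24) & 1u;
    int ea = (int)((a >> 16) & 0xFF);
    int eb = (int)((b >> 16) & 0xFF);
    assert(ea >= 1 && ea <= 254 && eb >= 1 && eb <= 254);

    /* Restore the hidden 1 and multiply: 17 x 17 bits gives 33 or 34 bits. */
    uint64_t prod = (uint64_t)(0x10000u | (a & 0xFFFFu))
                  * (0x10000u | (b & 0xFFFFu));

    int exp;
    uint32_t mant;
    if (prod >> 33) {                 /* product in [2, 4): shift one extra bit */
        exp  = ea + eb - 126;
        mant = (uint32_t)(prod >> 17) & 0xFFFFu;
    } else {                          /* product in [1, 2) */
        exp  = ea + eb - 127;
        mant = (uint32_t)(prod >> 16) & 0xFFFFu;
    }

    if (exp <= 0)        { exp = 0;   mant = 0; }   /* round to zero */
    else if (exp >= 255) { exp = 255; mant = 0; }   /* round to infinity */

    return (sign << 24) | ((uint32_t)exp << 16) | mant;
}

/* Format one generated test_mult call into buf; returns snprintf's result. */
int format_test_mult(char *buf, size_t n, uint32_t a, uint32_t b)
{
    return snprintf(buf, n, "   test_mult(25'h%07x, 25'h%07x, 25'h%07x);",
                    (unsigned)a, (unsigned)b,
                    (unsigned)float25_mul_model(a, b));
}
```

Looping over hand-picked or random normalized bit patterns and printing
each buffer gives lines that can be pasted into the initial block above.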



>
> It's a 4-stage multiplier. The inputs aren't latched before beginning the
> bit manipulation; the assumption is that the previous module will
> latch its output data.

This is common practice.

>
> The first stage checks whether each mantissa is normalized by testing
> the exponent. The sign bit is also calculated in that stage, and the
> incoming signals are split into the different fields that
> compose them.
>
> The second stage is where the actual calculation takes place: addition of the
> exponents and multiplication of the mantissas.
>
> The third stage normalizes the result depending on the mantissa
> product. It selects which part of the mantissa to keep
> and corrects the exponent field, since there is an offset in the
> exponent to compensate for.
>
> In the fourth stage the value is rounded to zero or infinity if the exponent
> falls below 1 or is bigger than 254.
>
> The part that could be improved is the fourth stage with the rounding.
> There is no support for handling unnormalized numbers except by
> rounding them to zero. The result of the multiplication can't produce
> a NaN or an unnormalized number.

This is what we need for the GPU!

More comments below.

>
>
>
> /*-----------------------------------------------------------------------------
> File name : float25Mult.v
> Description : A floating point multiplier based on the IEEE-754 float format.
> The mantissa is a 16-bit field, the exponent is an 8-bit field, plus 1 sign bit.
> The multiplier produces correct results with normalised values. Denormalised
> values are also calculated correctly, but the output is not well handled.
> If the exponent goes under zero the value is rounded to zero. If the exponent
> has a value of 255 or more the result is rounded to infinity.
>
> Author : André Pouliot
> Created : 2007/05/25
> Modified : 2007/05/25
> -----------------------------------------------------------------------------*/
>
> //module float25 multiplication
> module floatmult25 (
> clk,
> floatA,
> floatB,
> floatResult
> );
>
> //Port definition
> input           clk;
> input[24:0]     floatA;
> input[24:0]     floatB;
> output[24:0]    floatResult;
>
> wire            clk;
> wire[24:0]      floatA;
> wire[24:0]      floatB;
> wire[24:0]      floatResult;

These wire declarations are redundant with the input/output
declarations above; ports are already wires by default.

>
> //internal signal
> reg             signStg1;
> reg             normaliseBitA;
> reg             normaliseBitB;
> reg[7:0]        exponentAStg1;
> reg[7:0]        exponentBStg1;
> reg[15:0]       mantissaAStg1;
> reg[15:0]       mantissaBStg1;
>
> reg             signStg2;
> reg[8:0]        exponentStg2;
> reg[33:0]       mantissaStg2;
>
> reg             signStg3;
> reg[9:0]        exponentStg3;
> reg[15:0]       mantissaStg3;
>
> reg             signStg4;
> reg[7:0]        exponentStg4;
> reg[15:0]       mantissaStg4;
>
> //---------------------
> //Begin logic
> //---------------------
>
> //First stage: evaluate whether each value is normalised and split the bits
> //into independent fields
> always @(posedge clk)
> begin : Stage1
>   signStg1 <= floatA[24]^floatB[24];
>   exponentAStg1 <= floatA[23:16];
>   exponentBStg1 <= floatB[23:16];
>   mantissaAStg1 <= floatA[15:0];
>   mantissaBStg1 <= floatB[15:0];
>   normaliseBitA <= |floatA[23:16];
>   normaliseBitB <= |floatB[23:16];
> end
>
> //second stage multiplication and addition of the mantissa and exponent
>
> always @(posedge clk)
> begin : Stage2
>   signStg2 <= signStg1;
>   exponentStg2 <= exponentAStg1 + exponentBStg1;
>   mantissaStg2 <= {normaliseBitA,mantissaAStg1}*{normaliseBitB,mantissaBStg1};
> end

At some point, I had gotten confused by the IEEE spec.  I know that
normalized represents 1.mantissa, but is unnormalized 0.mantissa or
(0.mantissa<<1) ?
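
For reference, IEEE-754 gives a denormal the value 0.mantissa * 2^(1 -
bias), that is, an exponent field of 0 is treated as if it were 1.  That
is the same number as (0.mantissa<<1) with the exponent field left at 0.
A small C sketch of that rule applied to the float25 layout, assuming
bias 127 and a 16-bit mantissa; float25_value and two_to_the are
hypothetical helper names, and infinity/NaN are left out:

```c
#include <stdint.h>

/* 2^e as a double, by repeated scaling so no math library is needed. */
double two_to_the(int e)
{
    double r = 1.0;
    for (; e > 0; e--) r *= 2.0;
    for (; e < 0; e++) r *= 0.5;
    return r;
}

/* Value of a float25 bit pattern under IEEE-754 rules (bias 127).
   Exponent 255 (infinity/NaN) is not handled in this sketch. */
double float25_value(uint32_t bits)
{
    int exp = (int)((bits >> 16) & 0xFF);
    double frac = (double)(bits & 0xFFFFu) / 65536.0;   /* 0.mantissa */
    double mag;

    if (exp == 0)
        mag = frac * two_to_the(1 - 127);               /* 0.m * 2^-126 */
    else
        mag = (1.0 + frac) * two_to_the(exp - 127);     /* 1.m * 2^(e-127) */

    return ((bits >> 24) & 1u) ? -mag : mag;
}
```

With this rule the largest denormal sits just below the smallest
normalized value, so there is no gap or overlap at the boundary.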

>
> //Stage 3 mantissa select for reforming the data for next stage
> //and exponent adjust depending on mantissa result.
> always @(posedge clk)
> begin : Stage3
>   signStg3 <= signStg2;
>   if (mantissaStg2[33]) begin
>     exponentStg3 <= exponentStg2 - 126;
>     mantissaStg3 <= mantissaStg2[32:17];
>   end else begin
>     exponentStg3 <= exponentStg2 - 127;
>     mantissaStg3 <= mantissaStg2[31:16];
>   end
> end

I decided to work this out for myself, ignoring unnormalized numbers.  Once
you add the hidden 1 to the mantissa, the largest operand you can get is
0x1FFFF.  The smallest is 0x10000.  So, the largest product is 0x3FFFC0001,
which is 34 bits, and the smallest is 0x100000000, which is 33 bits.  So it
looks like you have it right!
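
The same arithmetic as a quick C check, in case anyone wants to verify
it.  This is only a sketch; bit_length and significand_product are names
I made up for the example.

```c
#include <stdint.h>

/* Number of bits needed to represent x (0 for x == 0). */
int bit_length(uint64_t x)
{
    int n = 0;
    while (x) {
        n++;
        x >>= 1;
    }
    return n;
}

/* Product of two 17-bit significands: hidden 1 plus 16 mantissa bits each. */
uint64_t significand_product(uint32_t mant_a, uint32_t mant_b)
{
    return (uint64_t)(0x10000u | (mant_a & 0xFFFFu))
         * (0x10000u | (mant_b & 0xFFFFu));
}
```

The two extremes are significand_product(0xFFFF, 0xFFFF) and
significand_product(0, 0); bit 33 of the product is set only in the
34-bit case, which is what stage 3 tests.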

I think perhaps you do more with unnormalized numbers than you need
to.  Have you considered treating them all as zero in input?  You
might eliminate some logic.
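
A sketch in C of what treating them as zero on input could mean, under
the assumption that any operand with an exponent field of 0 counts as a
signed zero.  The function names are hypothetical.  From my reading of
the quoted stages (worth confirming in simulation), a zero operand gives
a zero mantissa product, but the exponent sum can still land in 1..254,
so the result probably needs to be forced to zero explicitly.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if the exponent field (bits 23:16) is 0: a zero or a denormal. */
bool float25_is_zero_or_denormal(uint32_t x)
{
    return ((x >> 16) & 0xFFu) == 0;
}

/* Flush a denormal operand to zero, keeping only its sign bit (bit 24). */
uint32_t float25_flush_input(uint32_t x)
{
    return float25_is_zero_or_denormal(x) ? (x & 0x1000000u) : x;
}

/* With flushed inputs, the product is zero whenever either operand is. */
bool float25_product_is_zero(uint32_t a, uint32_t b)
{
    return float25_is_zero_or_denormal(a) || float25_is_zero_or_denormal(b);
}
```

In hardware this corresponds to reusing the two normalise bits from
stage 1: if either is 0, later stages can output zero without looking at
the mantissa product.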

>
> //Stage 4 Rounding to zero or infinite before output.
> always @(posedge clk)
> begin : Stage4
>   signStg4 <= signStg3;
>   if (exponentStg3[9] || exponentStg3 == 0) begin //if negative or zero, round to zero
>     exponentStg4 <= 8'h00;
>     mantissaStg4 <= 16'h0000;
>   end else if(exponentStg3[8] || exponentStg3 == 255) begin
>     exponentStg4 <= 8'hFF;
>     mantissaStg4 <= 16'h0000;
>   end else begin
>     exponentStg4 <= exponentStg3;
>     mantissaStg4 <= mantissaStg3;
>   end
> end
>
> assign floatResult[24] = signStg4;
> assign floatResult[23:16] = exponentStg4;
> assign floatResult[15:0] = mantissaStg4;
>
> endmodule
>


-- 
Timothy Normand Miller
http://www.cse.ohio-state.edu/~millerti
Open Graphics Project
_______________________________________________
Open-graphics mailing list
[email protected]
http://lists.duskglow.com/mailman/listinfo/open-graphics
List service provided by Duskglow Consulting, LLC (www.duskglow.com)
