[Qemu-devel] Issues in conversion to half precision number.

Gaurav Sharma Mon, 11 Aug 2014 02:26:30 -0700

Hi,
While converting single-precision float values to half precision for ARM,
the code appears to generate incorrect values in some scenarios. The
function involved is:


"inline uint32_t perform_round16(iss_info *iss, uint32_t sign, int16_t exp,
uint32_t frac, FPRounding rounding)"

[Case 1]
1. In the ARM specs, the result on overflow depends on overflow_to_inf:
if N != 16 || fpcr.AHP == '0' then // Single, double or IEEE half precision
    if biased_exp >= 2^E - 1 then
        result = if overflow_to_inf then FPInfinity(sign) else FPMaxNormal(sign);
        FPProcessException(FPExc_Overflow, fpcr);
        error = 1.0; // Ensure that an Inexact exception occurs

In QEMU, we always return infinity:
>> return packFloat16(zSign, 0x1f, 0);
When overflow_to_inf is false, we need to return FPMaxNormal instead, which is:
>> return float_num16(sign, 0x1e, 0x3ff);
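To illustrate the selection I have in mind, here is a minimal sketch. It is not the QEMU code: pack_half, half_overflow_result and the round_mode_t enum are hypothetical names used only for this example. The mapping from rounding mode to overflow_to_inf follows the FPRound pseudocode in the ARM specs (round to nearest always overflows to infinity, the directed modes only when rounding away from zero, round to zero never).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical rounding-mode enum, standing in for FPRounding. */
typedef enum {
    ROUND_TIEEVEN,
    ROUND_POSINF,
    ROUND_NEGINF,
    ROUND_ZERO
} round_mode_t;

/* Assemble an IEEE half-precision bit pattern: 1 sign bit,
 * 5 exponent bits, 10 fraction bits. */
static uint16_t pack_half(uint16_t sign, uint16_t exp, uint16_t frac)
{
    return (uint16_t)((sign << 15) | (exp << 10) | frac);
}

/* Result to return once overflow has been detected. */
static uint16_t half_overflow_result(bool sign, round_mode_t mode)
{
    bool overflow_to_inf;

    switch (mode) {
    case ROUND_TIEEVEN:
        overflow_to_inf = true;
        break;
    case ROUND_POSINF:
        overflow_to_inf = !sign;   /* only positive values reach +inf */
        break;
    case ROUND_NEGINF:
        overflow_to_inf = sign;    /* only negative values reach -inf */
        break;
    default:                       /* ROUND_ZERO */
        overflow_to_inf = false;
        break;
    }

    return overflow_to_inf ? pack_half(sign, 0x1f, 0)       /* infinity  */
                           : pack_half(sign, 0x1e, 0x3ff);  /* MaxNormal */
}
```

For example, a positive overflow under round to zero would give 0x7bff (65504.0) rather than 0x7c00 (+infinity).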

[Case 2]
1. From the ARM specs:
if round_up then
    int_mant = int_mant + 1;
    if int_mant == 2^F then // Rounded up from denormalized to normalized
        biased_exp = 1;
    if int_mant == 2^(F+1) then // Rounded up to next exponent
        biased_exp = biased_exp + 1; int_mant = int_mant DIV 2;

result = sign : biased_exp<N-F-2:0> : int_mant<F-1:0>;

[QEMU]
if (exp < -10) {
    return float_num16(sign, 0, 0);
}

This path always returns a signed zero, so the increment from rounding up
seems to be lost in this scenario.
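As a rough sketch of the behaviour I would expect from the pseudocode, the function below rounds a small value to half precision with the round-up step applied before packing. It is not the function quoted above. The name round_small_to_half, the rnd_t enum, and the input convention (unbiased exponent e plus a 24-bit significand with the implicit 1 at bit 23) are my own assumptions for illustration, and overflow handling is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical rounding-mode enum, standing in for FPRounding. */
typedef enum { RND_TIEEVEN, RND_POSINF, RND_NEGINF, RND_ZERO } rnd_t;

/* Input value is mant24 * 2^(e - 23), with bit 23 of mant24 set.
 * Overflow is NOT handled here; keep e well below 15. */
static uint16_t round_small_to_half(bool sign, int e, uint32_t mant24,
                                    rnd_t mode)
{
    const int F = 10;            /* fraction bits in half precision */
    const int min_exp = -14;     /* smallest normal exponent        */

    int biased_exp = e - min_exp + 1;
    if (biased_exp < 0) {
        biased_exp = 0;
    }

    /* Shift that reduces the significand to F fraction bits; denormal
     * results need the extra (min_exp - e) bits shifted out as well. */
    int shift = 23 - F;
    if (biased_exp == 0) {
        shift += min_exp - e;
    }
    if (shift > 40) {
        shift = 40;              /* everything is already shifted out */
    }

    uint64_t m = mant24;
    uint64_t int_mant = m >> shift;
    uint64_t rem = m & ((1ULL << shift) - 1);   /* discarded bits */
    uint64_t half = 1ULL << (shift - 1);        /* error == 0.5   */

    bool round_up;
    switch (mode) {
    case RND_TIEEVEN:
        round_up = rem > half || (rem == half && (int_mant & 1));
        break;
    case RND_POSINF:
        round_up = rem != 0 && !sign;
        break;
    case RND_NEGINF:
        round_up = rem != 0 && sign;
        break;
    default:                     /* RND_ZERO */
        round_up = false;
        break;
    }

    if (round_up) {
        int_mant++;
        if (int_mant == (1u << F)) {        /* denormal -> normal */
            biased_exp = 1;
        }
        if (int_mant == (1u << (F + 1))) {  /* next exponent      */
            biased_exp++;
            int_mant >>= 1;
        }
    }

    return (uint16_t)(((unsigned)sign << 15) |
                      ((unsigned)biased_exp << 10) |
                      (unsigned)(int_mant & 0x3ff));
}
```

With this sketch, a positive value slightly above 2^-25 in round-to-nearest, or any tiny positive non-zero value when rounding toward +infinity, comes out as the smallest denormal 0x0001 instead of zero.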

Please let me know if more data points are required.

Thanks,
Gaurav
