Re: [HACKERS] Re: [Oledb-dev] double precision error with pg linux server, but not with windows pg server

Shachar Shemesh Tue, 22 May 2007 07:18:38 -0700

Tom Lane wrote:
> Okay, I spent some time googling this question, and I can't find any
> suggestion that any ARM variant uses non-IEEE-compliant float format.
> What *is* real clear is that depending on ARM model and a run time (!)
> CPU endianness flag, there are three or four different possibilities
> for the endianness of the data, including a PDP-endian-like alternative
> in which the order of the high and low words is at variance with the
> order of bytes within the words.  (Pardon me while I go vomit...)
>   
Welcome to the wonderful world of embedded CPUs. These buggers will do
ANYTHING, and I do mean anything, in order to squeeze a little more
performance with a little less power consumption, while keeping the end
price tag under 10$. The ARM9, for example, can switch, on the fly,
between 32 and 16 bit machine language in order to save a few bytes in
code size and gain a few MIPS in execution speed.


As an amusing side note, I have heard a claim that the only reason we
need endianity at all is because the Europeans didn't understand that
Arabic is written from right to left. In Arabic you read "17" as "seven
and ten", which means that it is already little endian. Just one
request, please don't quote this story without also mentioning that this
story is wrong, and that 1234 is said, in Arabic, as "one thousand two
hundred four and thirty".

Mixed endianity is usually relic of a 16bit processor that was enhanced
to 32bit. The parts that were atomic before would be big endian, but the
parts that the old CPU required to do in separate operations are stored
low to high.
> So
> I would concur with a patch that ensures that this is what happens
> on the different ARM variants ... though I'll still be interested
> to see how you make that happen given the rather poor visibility
> into which model and endianness we are running on.
>   
You do it semantically. Attached is the outline for the code (I can form
a patch only after we agree where it should go)
I should note a few things:
On IEEE platforms, the code will, of course, translate to/from the same
format. This can be verified by the dump at the end.
I have tested the code on several numbers, and it does work for normal
and for denormalized numbers. I have not tested whether the detection
whether we should generate one or the other actually works, so there may
be an off by one there.
The are a few corner cases that are not yet handled. Two are documented
(underflow and rounding on denormalized numbers). There is one
undocumented, of overflow.
The IEEE -> native code is not yet written, but I think it should be
fairly obvious how it will look once it is.
There is also a function in the code called "calcsize". It's the
beginning of a function to calculate the parameters for the current
platform, again, without knowing the native format. I was thinking of
putting it in the "configure" test, except, of course, the platforms we
refer to are, typically, ones for which you cross compile. See below.

Comments welcome.
> PS: Of course this does not resolve the generic issue of what to do
> with platforms that have outright non-IEEE-format floats.  But at the
> moment I don't see evidence that we need reach that issue for ARM.
>   
The code above does detect when the float isn't being precisely
represented by the IEEE float. We could have another format for those
cases, and distinguish between the cases on import by testing its size.
> PPS: I'm sort of wondering if the PDP-endian business doesn't afflict
> int8 too on this platform.
>   
It's likely. I would say that a configure test would be the best way to
test it, but I suspect that most programs for ARM are cross compiled.
I'm not sure how to resolve that. Maybe if there's a way to
automatically test what gets into memory when you let the compiler
create the constant 0123456789abcdef. At least for smaller than 8 bytes,
the "hton" functions SHOULD do the right thing always.

I COULD go back to my source (he's on vacation until Sunday anyways),
but I'll throw in a guess. Since the ARMs (at least the 7 and the 9) are
not 64 bit native, it's compiler dependent. There are two main compilers
for the ARM, with one of them being gcc. That's, more or less, where my
insights into this end.

Shachar

#include <stdio.h>
#include <limits.h>
#include <math.h>
#include <assert.h>

// What type would we be working on?
#if 1

// Double
#define TYPE double
#define FRAC_BITS 52
#define EXP_BITS 11
#define EXP_BIAS 1023

#else

// Float
#define TYPE float
#define FRAC_BITS 23
#define EXP_BITS 8
#define EXP_BIAS 127

#endif

union fp {
   TYPE flt;
   struct {
      unsigned long low;
      unsigned long high;
   } i;
   unsigned long long l;
   struct {
      unsigned long long int frac:FRAC_BITS;
      unsigned long long int exp:EXP_BITS;
      unsigned long long int sign:1;
   } fp;
};

void dumpnum( TYPE n )
{
   union fp val;
   val.flt=n;
   val.fp.sign=0;
   val.fp.exp=0x7ff;
   val.fp.frac=12;

   printf("%g %08x%08x\n", val.flt, val.i.high, val.i.low );
   printf("Sign: %d, exp: %d(%d), frac: %013x\n", val.fp.sign, val.fp.exp, val.fp.exp-EXP_BIAS, val.fp.frac );
}

int calcsize(volatile TYPE v)
{
   /* Find out the current mantissa, exponent and other attributes size */

   /* Find out the number of bits in the mantissa */
   int numbits=0;

   for( v=0; v!=2; numbits++ )
   {
      v/=2;
      v+=1;

      printf("v %.20g\n", v);
   }

   printf("Number of bits in the mantissa: %d\n", numbits );

   return 0;
}

int main()
{
   TYPE v=0;
   assert(__FLT_RADIX__==2); // We can get around this limitation, but not easilly

   union fp res;

   printf("%.20g\n", v);

   // Copy the sign bit over. signbit is only defined in C99
   res.fp.sign=signbit(v)!=0?1:0;

   // Explicitly check for NaN and Infinity. fpclassify is C99 only
   switch(fpclassify(v)) {
   case FP_NAN:
      res.fp.exp=~0;
      res.fp.frac=1;
      break;
   case FP_INFINITE:
      res.fp.exp=~0;
      res.fp.frac=0;
      break;
   case FP_ZERO:
      res.fp.exp=0;
      res.fp.frac=0;
      break;
   default:
      // This is a number. We don't want to assume that denormalized numbers on the platform will also be
      // denormalized in IEEE and vice versa, so we detect it in a different way
      {
         int exp;
         TYPE frac=frexp( v, &exp );

         // Accumolate the mantissa here
         unsigned long long mantissa=0;
         int currentbit=FRAC_BITS-1;

         int denormal=0;

         // Is the number in the denormalized area?
         if( exp<-EXP_BIAS ) {
            denormal=1;
            currentbit-=-(EXP_BIAS-1)-exp;

            if( currentbit<0 )
               // We have an underflow here!
               // XXX What to do?
               return 1;

            res.fp.exp=0;
         } else {
            // We need a leading 1, we have a leading zero
            frac*=__FLT_RADIX__;
            frac-=1;
            exp--;
            res.fp.exp=exp+EXP_BIAS;
         }

         while(frac!=0 && currentbit>=0) {
            frac*=2; // We no longer work with the native radix
            if( frac>=1 ) {
               frac-=1;
               mantissa|=1ll<<currentbit;
            }

            currentbit--;
         }

         if( frac!=0 ) {
            // We failed to provide an accurate representation. Round the result

            // Will the rounding overflow?
            if( ~mantissa==0 ) {
               mantissa=0; // XXX - bug here in case original is denormalized
               exp++;
            } else {
               mantissa+=1;
            }
         }

         res.fp.frac=mantissa;
      }
      break;
   }

   {
      union fp old;

      old.flt=v;

      printf("Original: sign %d, exp %d, frac %014llx %.20g\n", old.fp.sign, old.fp.exp, old.fp.frac, old.flt );
      printf("New: sign %d, exp %d, frac %014llx %.20g\n", res.fp.sign, res.fp.exp, res.fp.frac, res.flt );
   }


   return 0;
}

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
       choose an index scan if your joining column's datatypes do not
       match

Re: [HACKERS] Re: [Oledb-dev] double precision error with pg linux server, but not with windows pg server

Reply via email to