> -----Original Message----- > From: Tom Lane [mailto:[EMAIL PROTECTED] > Sent: Thursday, December 15, 2005 6:24 PM > To: Dann Corbit > Cc: Qingqing Zhou; Greg Stark; Jim C. Nasby; Luke Lonergan; Neil Conway; > Bruce Momjian; pgsql-hackers@postgresql.org > Subject: Re: [HACKERS] Which qsort is used > > "Dann Corbit" <[EMAIL PROTECTED]> writes: > > Radix sort can also be made completely generic by using a callback > > function. > > The function gives back n-bits at a time, from the most significant bits > > to the least significant. > > That's mighty handwavy --- it assumes that the datatype permits a simple > breakdown into small pieces that can be sorted lexicographically.
It's not so hard. For fixed length character strings, the mapping is just the character. For integers the mapping is obvious [msb to lsb or lsb to msb of the integer, depending on whether you are doing msd or lsd radix sort]. For intel floating point, the transformation is: #include <assert.h> #include "inteltyp.h" uint32 float2key(float f) { uint32 sign, mant, mask; assert(sizeof(float) == sizeof(uint32)); mant = *(uint32 *) & f; /* Load float as array of bits */ sign = mant & SB_MASK32; /* Isolate the leading sign bit */ mant ^= SB_MASK32; /* Invert the sign bit, making + > - */ mask = sign - (sign >> 31); /* Either 0 or 0x7fffffff */ mant ^= mask; /* Invert exp and mant if negative */ return mant; } uint64 double2key(double d) { uint64 sign, mant, mask; assert(sizeof(double) == sizeof(uint64)); mant = *(uint64 *) & d; /* Load float as array of bits */ sign = mant & SB_MASK64; /* Isolate the leading sign bit */ mant ^= SB_MASK64; /* Invert the sign bit, making + > - */ mask = sign - (sign >> 63); /* Either 0 or 0x7fffffffffffffff */ mant ^= mask; /* Invert exp and mant if negative */ return mant; } Where the contents of inteltyp.h are as follows: /* ** Typdefs for Intel formats. ** See keyxfrm.c for usage. */ typedef unsigned long uint32; #define SB_MASK32 0x80000000UL #ifdef _MSC_VER typedef unsigned __int64 uint64; typedef __int64 sint64; #define SB_MASK64 0x8000000000000000ui64 #else typedef unsigned long long uint64; typedef long long sint64; #define SB_MASK64 0x8000000000000000ULL #endif extern uint32 float2key(float f); uint64 double2key(double d); ======================================================= After the above transformation, you just use the same buckets as for integers. In general, the creation of the mapping function is no more difficult than the creation of a comparison function. > Seems > to me that's a much stronger requirement than assuming that you can tell > which of two whole values is smaller. What's worse, it needs to be the > same number of pieces for every value, which makes it hard to deal with > variable-length data. No. The number of pieces is irrelevant. And you can use MSD radix sort for variable length data. > > So, for char string, and a radix of 256, you just return the first char, > > then the second char, etc. to get the classical radix sort. > > Uh, no, you'd need to work right-to-left, after having padded all the > strings to the same length somehow. Unless you use MSD radix sort (which is usually better anyway). > regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly