> -----Original Message-----
> From: Tom Lane [mailto:[EMAIL PROTECTED]
> Sent: Thursday, December 15, 2005 6:24 PM
> To: Dann Corbit
> Cc: Qingqing Zhou; Greg Stark; Jim C. Nasby; Luke Lonergan; Neil
Conway;
> Bruce Momjian; pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Which qsort is used
> 
> "Dann Corbit" <[EMAIL PROTECTED]> writes:
> > Radix sort can also be made completely generic by using a callback
> > function.
> > The function gives back n-bits at a time, from the most significant
bits
> > to the least significant.
> 
> That's mighty handwavy --- it assumes that the datatype permits a
simple
> breakdown into small pieces that can be sorted lexicographically. 

It's not so hard.  For fixed length character strings, the mapping is
just the character.  For integers the mapping is obvious [msb to lsb or
lsb to msb of the integer, depending on whether you are doing msd or lsd
radix sort].  For intel floating point, the transformation is:

#include <assert.h>
#include "inteltyp.h"

uint32
float2key(float f)
{
    uint32          sign,
                    mant,
                    mask;

    assert(sizeof(float) == sizeof(uint32));
    mant = *(uint32 *) & f;     /* Load float as array of bits */
    sign = mant & SB_MASK32;    /* Isolate the leading sign bit */
    mant ^= SB_MASK32;          /* Invert the sign bit, making + > - */
    mask = sign - (sign >> 31); /* Either 0 or 0x7fffffff */
    mant ^= mask;               /* Invert exp and mant if negative */
    return mant;
}

uint64
double2key(double d)
{
    uint64          sign,
                    mant,
                    mask;

    assert(sizeof(double) == sizeof(uint64));
    mant = *(uint64 *) & d;     /* Load float as array of bits */
    sign = mant & SB_MASK64;    /* Isolate the leading sign bit */
    mant ^= SB_MASK64;          /* Invert the sign bit, making + > - */
    mask = sign - (sign >> 63); /* Either 0 or 0x7fffffffffffffff */
    mant ^= mask;               /* Invert exp and mant if negative */
    return mant;
}

Where the contents of inteltyp.h are as follows:
/*
** Typdefs for Intel formats.
** See keyxfrm.c for usage.
*/

typedef unsigned long uint32;
#define SB_MASK32 0x80000000UL

#ifdef _MSC_VER
typedef unsigned __int64 uint64;
typedef __int64 sint64;
#define SB_MASK64 0x8000000000000000ui64
#else
typedef unsigned long long uint64;
typedef long long sint64;
#define SB_MASK64 0x8000000000000000ULL
#endif
extern uint32   float2key(float f);
uint64          double2key(double d);

=======================================================
After the above transformation, you just use the same buckets as for
integers.

In general, the creation of the mapping function is no more difficult
than the creation of a comparison function.

> Seems
> to me that's a much stronger requirement than assuming that you can
tell
> which of two whole values is smaller.  What's worse, it needs to be
the
> same number of pieces for every value, which makes it hard to deal
with
> variable-length data.

No.  The number of pieces is irrelevant.  And you can use MSD radix sort
for variable length data.
 
> > So, for char string, and a radix of 256, you just return the first
char,
> > then the second char, etc. to get the classical radix sort.
> 
> Uh, no, you'd need to work right-to-left, after having padded all the
> strings to the same length somehow.

Unless you use MSD radix sort (which is usually better anyway).

>                       regards, tom lane 

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
       subscribe-nomail command to [EMAIL PROTECTED] so that your
       message can get through to the mailing list cleanly

Reply via email to