Re: quicksort and compiler optimization

Fergus Henderson Fri, 8 May 1998 20:21:54 +1000
On 07-May-1998, S. Alexander Jacobson <[EMAIL PROTECTED]> wrote:
> On 07-May-1998, Ralf Hinze <[EMAIL PROTECTED]> wrote:
> > > o it traverses the list twice,
> > > o it has a space leak (xs is alive until the second traversal is done),
> 
> Is it necessary that the compiler do this?  It strikes me that when the
> compiler encounters a function that requires multiple comprehensions of
> the same list, it should be smart enough to consolidate them into a single
> loop (which would also avoid the space leak).  Is my intuition correct
> here?

This optimization would probably be a win most of the time, so
yes, ideally compilers would do things like that.  Currently however
those sort of optimizations are past the limits of what most Haskell
compilers will do.

Note that consolidating multiple passes into single passes is not
always a win.  For example, if your machine has 10 available registers,
and each pass uses 8 of them, then combining the two passes may mean
that some variables can longer fit in registers, which may lead to
things being less efficient.

> > > o it uses (++) to catenate the results of the recursive calls (note that
> > >   (++) takes time proportional to the length of its first argument).
> 
> This also seems wierd.  Concatenation is a frequent enough operation on
> lists.  Give such frequency, shouldn't the internal representation of a
> list include a pointer to its last element just to ensure that
> concatenation is constant time?

That sort of concatenation is destructive; you can't do that if
there are still references to the original list.
Current Haskell compilers don't do destructive update optimization.

Also, many functional language implementations for simplicity
assume that all values are one word long, and handle values
of more than one word by "boxing" them, i.e. allocating
a record on the heap and representing the value as a pointer
to that record.  For such implementations, multi-word representions
may require a lot more heap allocation.

> I don't know ... what the official name for
> this algorthm is

I think you're talking about "radix sort".

> ... It requires that the items
> being sorted be recursively categorizable
...
> As I write this, I realize I have no idea how to think about this in
> Haskell.  Does the Haskell type system support this?  How do I describe
> this tree data structure and maintain a list of leaves accross the
> datastructure?  

How about using something like the following?

        typeclass Index i => Radixsortable t i where
                decompose :: t -> Maybe (i, t)

        data Bit = Zero | One
        instance Radixsortable Int Bit where
                decompose :: Index i => t -> Maybe (i, t)
                decompose x = if x = 0 then Nothing else (bottom_bit, rest)
                        where bottom_bit = if even x then Zero else one
                              rest = int div 2

        instance Radixsortable String Char where
                decompose s = if s == "" then Nothing else (first_char, rest)
                        where first_char = ...
                              rest = ...

Actually in general what you would really like is to use existential typing,
i.e. something like

        typeclass Radixsortable t where
                decompose :: t -> Maybe (i, [t2])

except that the type variables `i' and `t2' would be existentially
quantified type variables:

                decompose :: t -> Maybe (some i . Index i => i,
                                         [ some t2 . Decomposable t2 => t2 ])

-- 
Fergus Henderson <[EMAIL PROTECTED]>  |  "I have always known that the pursuit
WWW: <http://www.cs.mu.oz.au/~fjh>  |  of excellence is a lethal habit"
PGP: finger [EMAIL PROTECTED]        |     -- the last words of T. S. Garp.
Re: quicksort and compiler optimization

Reply via email to