Re: quicksort and compiler optimization

Adrian Hey Fri, 08 May 1998 08:45:14 +0100 (BST)
On Thu 07 May, S. Alexander Jacobson wrote:
> On 07-May-1998, Ralf Hinze <[EMAIL PROTECTED]> wrote:
> > > o it traverses the list twice,
> > > o it has a space leak (xs is alive until the second traversal is done),
> 
> Is it necessary that the compiler do this?  It strikes me that when the
> compiler encounters a function that requires multiple comprehensions of
> the same list, it should be smart enough to consolidate them into a single
> loop (which would also avoid the space leak).  Is my intuition correct
> here?

My intuition tells me the same comment could generally be made for any
pair of expressions that have common sub expressions. In a strict language
such potential optimisations often seem to crop up. I'm not sure what the
implications of lazy semantics are for any implementation of such
optimisations, but I'm sure I could work it out given enough time :-)

> 
> > > o it uses (++) to catenate the results of the recursive calls (note that
> > >   (++) takes time proportional to the length of its first argument).
> 
> This also seems wierd.  Concatenation is a frequent enough operation on
> lists.  Give such frequency, shouldn't the internal representation of a
> list include a pointer to its last element just to ensure that
> concatenation is constant time?  I realize that you can't point to the end
> of a lazy list, but concatenation is not relevant to lazy lists.

If I understand your suggestion properly, this would be very difficult to
implement in a lazy language, because the tail of a list may be (or contain)
a reducible expression (of type List x), not an actual List. (Did I say
that right?). Also, destructive update of the last 'Cons' cell in a
(non empty) list would result in loss of referential transparency, and would
be unsafe unless the compiler could verify that the list was unique.
(I can think of several more reasons why this approach to implementing
++ would be frought with problems, but I'll refrain from boring you with
the details.)

My own opinion is that despite its apparent triviality (to someone used
to non-functional languages) ++ should be avoided if at all possible
because it is quite an expensive operation. Especially so in a lazy
language, because expressions such as xs++ys where ys is empty cannot
be immediately simplified to xs. Instead (the 'Cons cells' of) xs will be
completely duplicated before ys is reduced (and its is discovered
that ys == []).
 
I suppose you could define your own version of ++ to avoid this
problem..

infixr 5 ++!
(++!) :: [a] -> [a] -> [a]
xs ++! [] = xs
xs ++! ys = xs ++ ys

but because this is now strict in its second argument, it won't always work
as a substitute for ++. The obvious example which comes to mind being..

cycle1 xs = let circle = xs ++! circle
            in  circle

This will now get stuck in an endless loop, whether or not xs == [].

Regards
-- 
Adrian Hey
Re: quicksort and compiler optimization

Reply via email to