Efficient derived orderings ...?

haskell-request 5 Nov 91 16:57
Original-Via: uk.ac.ox.prg; Tue, 5 Nov 91 16:46:19 GMT

I'm a little bit bothered about the approach to defining orderings on
elements in the way suggested by the Haskell report and by the definitions
in the standard prelude.  I'm sending this to the Haskell list in the hope
that somebody can suggest a simple optimisation/transformation which
eliminates the problems below.

PROBLEMS
--------
First of all, let me say that this is only important because of the
need for derived instances of Ord ... if we didn't have that in mind,
a Haskell implementation could use any method at all to implement the
functions in the prelude, so long as the semantics were preserved.

To illustrate the basic problem, consider the definition of trees on
page 142, figure 19 of the Haskell report:

    data Tree a  =  Leaf a  |  Tree a :^: Tree a

One of the lines in the definition of the derived instance of Ord is:

    u :^: v  <=  x :^: y    =   u<x  ||  (u==x && v<=y)

The trouble is that this definition potentially requires two traversals
of the branches u and x of the two trees concerned -- one for the comparison
u<x and, assuming that fails, one for the comparison u==x.  What's worse
is that these traversals may be repeated further times when comparing nodes
higher up in the tree (i.e. closer to the root).  The same problems will
occur with other derived instances (and similarly structured user defined
instances too no doubt).

Another related problem is in the default definitions of max and min:

    max x y | x >= y    =  x
            | y >= x    =  y
            | otherwise =  error "max{PreludeCore}:no ordering relation"

This time, two traversals of the data structure are required, in order
to provide a sensible definition when the ordering concerned is partial.
However, in practice, I suspect that most orderings that are likely to
be used will actually be total and it seems rather a shame to incur the
penalty of multiple traversals. 

Another example where the problem arises is in the default definition for
(<) in terms of (<=) and (/=):
 
    x < y    =    x<=y  &&  x/=y

To my way of thinking, the problem is caused by the fact that, whilst
comparing two objects, none of the functions (<=), (<), (>=) or (>)
return sufficient information to completely determine the order
relationship between them.

A SOLUTION
----------
Here is a sketch of one potential solution to the problems above.  The
basic principle used here is not a new idea, and I'm not suggesting that
the formulation I've given is the best one for the job.  I just wanted
to put forward a concrete suggestion to get the ball rolling.

The result of a comparison is an element of type:

    data Order  =  LT  |  EQ  |  GT  |  NR

and the comparison function  comp :: Ord a => a -> a -> Order -> Order
is intended to be defined so that:

    comp x y o  |  "x is less than y"          =  LT
                |  "x is equal to y"           =  EQ
                |  "x is greater than y"       =  GT
                |  "x and y are incomparable"  =  NR

The type class Ord can be defined by:

    class Eq a => Ord a where
        comp                  ::  a -> a -> Order -> Order
        (<=), (<), (>=), (>)  ::  a -> a -> Bool
        max, min              ::  a -> a -> a

        (<=)    =  select True  True  False False
        (<)     =  select True  False False False
        (>=)    =  select False True  True  False
        (>)     =  select False False True  False

        max x y =  select y y x (error "no ordering") x y
        min x y =  select x x y (error "no ordering") x y

    select lt eq gt nr x y
        = case comp x y EQ of LT -> lt;  EQ -> eq;  GT -> gt;  NR -> nr  

And derived instances might now contain lines such as:

    comp (Leaf n)       (Leaf m)      = comp n m
    comp (u :^: v)      (x :^: y)     = comp u x . comp v y
    comp (C x1 ... xn)  (C y1 ... yn) = comp x1 y1 . ... . comp xn yn
    etc...

This approach eliminates each of the problems described above (and works
just as well if we're dealing with partial orders, without damage to the
case where only total orders are involved). 


THERE'S NO SUCH THING AS A FREE LUNCH
-------------------------------------
There has to be a catch of course.  Here it is (and perhaps you can find
other flaws?):

How can we ensure that the definition of equality using comp, and
corresponding to the use of ``equals'' in ``less than or equal'' (<=),

i.e.        (==) = select False True False False

agrees with a user defined equality operator?  I don't think we can unless
we restrict our use of this implementation to types where both Eq and Ord
are derived, in which case the above definition for equality gives the
standard derived equality.

Come to think of it, this is also a problem with the current approach
in Haskell (and another instance of the ``what can I assume about the
properties of a member function'' problem raised in recent articles):
If I give an explicit definition of the instance Eq t for some type t
but use a deriving clause to obtain an instance of Ord t, can I assume
that:
                  x == y    iff    x<=y && y<=x   ?

Perhaps a general solution here would be to change the language so that
if Ord is derived for a particular type then Eq must also be derived?

[P.S. (Being pedantic) There's a minor typo in the report at the bottom
of p.135:   True > False == True  should not parse correctly because both
operators are declared as infix 4 (although the meaning is quite clear).]

Any comments?
Mark
Efficient derived orderings ...?

Reply via email to