Changes to the Haskell 1.3 Prelude The following changes have been proposed (or accepted) for Haskell 1.3. * Reorganize the Ord class * Add succ and diff to Enum * Add new class "Bounded" * Add strictness annotation to Complex and Ratio * Use Int in take, drop and splitAt * Add replicate, lookup, curry and uncurry * Move functions into libraries * Non-overloaded versions of PreludeList functions * Numeric Issues * Simplify lex * Add undefined * Monad Class Changes to Ord In Haskell 1.2, two comparisons are required to do a "three way branch": if x == y then ... else if x < y then ... else ... Even a standard two way branch can be inefficient - here's the default definition of "<" in the standard prelude: x < y = x <= y && x /= y Instead of defining a <= operator which returns just two values, it is almost as easy to define an operator which returns three different values: case compare x y of EQ -> ... LT -> ... GT -> ... The constructors EQ , LT ,and GT belong to a new type: Ordering. In addition to this efficiency problem, many uses of Ord such as sorting or operations on ordered binary trees assume total ordering. The compare operation formalizes this concept: it can not return a value which indicates that its arguments are unordered. Programmers are free to define a class for partial orderings; here, we simply state that Ord is reserved for total orderings. Proposed Changes: * Add a new type: data Ordering = LT | EQ | GT deriving (Eq,Ord,Ix,Enum,Bounded,Text) * Delete comment in definition of class Ord which explains how to define min and max for both total and partial orders. * Change definition of Ord to class Ord a where compare :: a -> a -> Ordering (<), (<=), (>=), (>):: a -> a -> Bool max, min :: a -> a -> a -- circular default definition: -- either <= or compare must be explicitly provided x < y = compare x y == LT x <= y = compare x y /= GT x > y = compare x y == GT x >= y = compare x y /= LT compare x y | x == y = EQ | x <= y = LT | otherwise = GT max x y = case compare x y of LT -> x _ -> y min x y = case compare x y of LT -> y _ -> x * Change definitions of Ord instances in PreludeCore. At present, Ord instances define the "<=" method. These should be deleted and replaced by definitions of the "compare" method. * Add this sentence to Appendix E: "The operator compare is defined so at to compare its arguments lexicographically (with earlier constructors in the datatype declaration counting as smaller than later ones) returning LT, EQ and GT (respectively) as the first argument is strictly less than, equal to and strictly greater than the second argument (respectively)." The methods >, >=, <, <= could be removed from Ord and turned into ordinary overloaded functions. For efficiency, these could be specialized; the GHC specialize pragma allows an explicit definition of a function at a particular overloading: Specialize (<=) :: Int -> Int -> Bool = primLeInt Add succ and diff to Enum Haskell 1.2 provides very limited facilities for operating on enumerations. The following elementary operations must be implemented in an obscure and inefficient manner, if at all: * Get the next value in enumeration: (\ x -> head [x..]) * Get the previous value in enumeration: no reasonable way * Get the n'th value in enumeration: [C0..] !! (n - 1) (where C0 is first in enumeration) * Find where a value occurs in an enumeration: lookup (zip [C0..] [0..]) x Proposed changes: * Add two new methods to Enum: succ :: Int -> a -> a diff :: a -> a -> Int Informally, given an enumeration: data T = C0 | C1 | ... Cm we have: diff Ci Cj = i - j succ x Ci | 0 <= i+x && i+x For example, given the datatype and function: data Colour = Red | Orange | Yellow | Green | Blue | Indigo | Violet toColour :: Int -> Colour toColour i = succ i Red we would have: toColour 0 = Red toColour 1 = Orange ... toColour 6 = Violet * Change definitions of Enum instances: instance Enum Char where succ = primCharSucc diff = primCharDiff enumFrom = boundedEnumFrom maxChar enumFromThen = boundedEnumFromThen minChar maxChar boundedEnumFrom hi x | x * Change description of derived instances of Ix for enumerations and Enum: Given the enumeration: data c => T u1 ... uk = K1 | ... | Kn deriving (C1,...Cm) succ i Cj returns C(i+j) if 0 Enum a where enumFrom = boundedEnumFrom Kn enumFromThen = boundedEnumFromThen K1 Kn and the derived Ix instance is defined by: class (Ord a) => Ix a where range (l,u) = [l..u] index (l,u) i | inrange (l,u) i = diff i u inRange (l,u) i = l Notes: * The types of succ and diff could be changed to: succ :: Integer -> a -> a diff :: a -> a -> Integer The only purpose in doing so would be to allow correct operation across the full range of integers. *The definitions of boundedEnumFrom[Then] are carefully chosen to avoid overflow. In particular, to avoid: [True..] = True : error "succ{PreludeCore}: result out of range" [maxChar..] = maxChar : error "succ{PreludeCore}: result out of range" This problem was present in Haskell 1.2 for the types Int, Float and Double and could be avoided for Int using boundedEnumFrom instead of numericEnumFrom and for Float and Double using similarly defined functions. * The functionality of diff is almost provided by index. The test for an upper bound is distracting. * There are overflow and rounding problems associated with Integer and floating point numbers. These do not affect the old Enum methods, which can be implemented as before, but may lead to overflow. The Bounded class Many built-in types and user-defined types in class Ord have a natural minimum and maximum value. Haskell 1.2 uses several ad-hoc methods for accessing these values. For example, the names minInt, maxInt and minChar, maxChar are defined in the prelude and implicitly used in appendix E. (Our proposed changes involving succ and diff would make those uses explicit.) It would be much more regular to define a new class Bounded whose instances have a minimum and maximum value. This also allows the functions ord and chr (currently restricted to Chars) to be generalized to work for any bounded enumeration type. Proposed changes: * Add class Bounded: class (Ord a) => Bounded a where minBound, maxBound :: a -- forall x::a. minBound * Add instances for all appropriate datatypes: instance Bounded Char where minBound = '\0' maxBound = '\255' instance Bounded Int where minBound = primMinInt maxBound = primMaxInt instance Bounded Float where minBound = primMaxFloat -- new primitive maxBound = primMinFloat -- new primitive (negation of primMaxFloat?) instance Bounded Double where minBound = primMaxFloat -- new primitive maxBound = primMinFloat -- new primitive (negation of primMaxFloat?) instance Bounded a => Bounded (Maybe a) where minBound = Nothing maxBound = Just maxBound * Add deriving clauses or (), tuples, and Bool. * Changes to appendix E: Instances may be derived for any enumeration. For a type of the form data T = C1 | C2 | ... | Cm the derived instance is (generated by `deriving Bounded') instance Bounded T where minBound = C1 maxBound = Cm Instances may also be derived for single-constructor datatypes (in particular, tuples). For a type of the form: data c => T u1 ... uk = C t1 ... tm where each of t1 ... tm are instances of Bounded, the derived instances are instance c => Bounded (T u1 ... uk) where minBound = C minBound ... minBound maxBound = C maxBound ... maxBound * Replace the functions ord and chr with the functions: fromEnum :: (Bounded a, Enum a) => a -> Int fromEnum x = diff x (minBound x) toEnum :: (Bounded a, Enum a) => Int -> a toEnum x = succ (minBound x) x Notes: Adding a new standard class is a larger change than simply adding a new method to an existing class - so it's natural to wonder why. The primary reason is that the (unbounded) type Integer is an instance of both Ix and Enum but not of Bounded. One possible solution is to define minBound and maxBound to be (runtime) errors for type Integer and add the methods minBound and maxBound to Enum instead. (They should be renamed minEnum and maxEnum if this is done.) This has the problem that it defers detection of a (rare?) type error to runtime. Another possible solution is to remove Integer from Enum. (This would force Integer to be removed >from Ix as well). This is quite a radical change but probably wouldn't break any code. It isn't possible to provide Bounded instances for Float and Double using Haskell 1.2 primitives. Easily fixed. The names fromEnum and toEnum are misleading since their types involve both Enum and Bounded. We couldn't face writing fromBoundedEnum and toBoundedEnum. Suggestions welcome. Strictness annotations in Ratio and Complex Augustsson's FPCA'94 paper demonstrates very significantly lower memory requirements when these annotations are added to Complex. Adding similar annotations to Ratio should have a similar effect. Whereas changing the strictness of lists or integers would break many programs, only the most contrived programs will be affected by this change. Proposed changes: Change definitions of Complex and Ratio to: data (RealFloat a) => Complex a = !a :+ !a deriving (Eq, Binary, Text) data (Integral a) => Ratio a = !a :% !a deriving (Eq, Binary) Using Int in list operations The Haskell 1.2 prelude forces the use of Integer rather than the more efficient Int for the very common operations take, drop and splitAt. There are only two circumstances where the overhead of using Integer's is worthwhile: * To pass conformance tests such as: head (take (toInteger maxInt * 2) [1..]) == 1 * When dealing with very large datasets (more than 2^29 elements) Users of these very large datasets are probably already using implementations where maxInt == 2^31-1 or maxInt == 2^63-1 where the problem is less likely to occur. The old definitions can be placed in a library should problems arise. Proposed changes: Change the types of take, drop and splitAt to: take, drop :: Int -> [b] -> [b] splitAt :: Int -> [b] -> ([b]. [b]) Add genericTake, genericDrop and genericSplitAt (with old signatures and definitions) to a library. New functions: replicate, lookup, curry and uncurry The following functions are deemed sufficiently useful that they should be added to the Haskell Prelude. Proposed Changes: Add replicate and lookup to PreludeList: replicate :: Int -> a -> [a] replicate n x = take n . repeat x lookup :: Eq a => a -> [(a,b)] -> Maybe b lookup key [] = Nothing lookup key ((x,y):xys) | key == x = Just y | otherwise = lookup key xys Add curry and uncurry to Prelude: curry :: ((a, b) -> c) -> a -> b -> c curry f x y = f (x,y) uncurry :: (a -> b -> c) -> (a, b) -> c uncurry f (x, y) = f x y Notes: Adding functions in the Prelude rather than a library forces the automatic import of the name into every Haskell program. This may break existing code which already uses these names or prevent a programmer from using these names for their own functions. (This is why we chose the name replicate instead of copy.) It also expands the "vocabulary" of Haskell - adding an extra name which Haskell programmers can be expected to know. In previous discussion, the name assoc was preferred to lookup to avoid stealing a useful name from the user. The library proposal also defines LibFiniteMap.lookup and LibHashTable.lookup - we feel that stealing the one name for three purposes is reasonable. An alternative definition for uncurry is: uncurry f ~(x, y) = f x y This ensures that: curry (uncurry f) = f uncurry (curry f) = f Is this important? Moving functions to libraries A large prelude helps the programmer by providing a wide range of pre-written functions to use and combine. It also hinders the programmer by making the language larger to learn, preventing the programmer from using prelude names for their own purposes and slowing down compilation. It is possible to have our cake and eat it by simply moving rarely used functions into standard libraries (the language size remains the same but the "core language" gets smaller). The following changes are based on counting how many times certain functions are used in a large body of Haskell programs (i.e. the Glasgow Haskell repository). Proposed changes: The following functions will be moved into libraries. The exact library names have not yet been chosen. isAscii, isControl, isPrint, isSpace, isUpper, isLower, isAlpha, isDigit, toUpper, toLower, gcd, lcm, ^, ^^, scanl, scanl1, scanr, scanr1, sums, products, zip[3-7], unzip[3-7], transpose. The functionality in class Binary is being rethought. All support for binary I/O will be in a library. A number of prelude modules will be moved to libraries in their entirety: PreludeArray, PreludeComplex, and PreludeRatio. Non-overloaded entries to overloaded functions While the choice of an Eq instance for types such as Int and Bool are obvious, there may be more than one way to perform these operations on more complex datatypes. For example, string comparison may be either case sensitive or case insensitive. A type can have only one style of comparison defined in instances for Eq or Ord. This would normally present no problems, since the user can use some other function instead of == or < if needed. However, many prelude functions use == directly. We propose to supply an alternative version of every prelude function which uses a specific class method such as ==. These will pass the function as a parameter instead of using one defined in an associated instance. We use the suffix "By" to name these alternative functions. Proposed changes: Add generalised versions of the Eq/Ord-overloaded functions in PreludeList: nubBy, elemBy, notElemBy, minimumBy, maximumBy, deleteFirstsBy, deleteBy, lookupBy For example: nubBy :: (a -> a -> Bool) -> [a] -> [a] nubBy eq [] = [] nubBy eq (x:xs) = x : nubBy (filter (not . eq x) xs) The original functions will be changed to use new non-overloaded functions. For example: nub :: Eq a => [a] -> [a] nub = nubBy (==) Notes: The name deleteFirstsBy (as a pronunciation for \\) is not ideal. We welcome any suggestions. These functions will not be used as heavily as the existing functions and so could be placed in a separate library (LibList). However, we need a consistent rule about where to put the "fooBy" functions --- and "in the same module" is the simplest. Though we have found most of these functions useful in our own programming, we haven't used all of them. We considered it more important to be consistent (by adding all of them) than to try to guess which were the most useful. Numeric Issues Haskell 1.2 provides no way to detect IEEE arithmetic values such as "NaN" or "Infinity". This seemed reasonable since Haskell 1.2 (and 1.3) doesn't require IEEE arithmetic. However, there are no Haskell implementations on non-IEEE hardware (we have heard rumors of a Cray implementation "soon") and a few programmers do want to be able to perform "serious" numerical work in Haskell. Providing everything numerical analysts could ask for requires a large amount of work but a few simple additions would make it possible to perform a few basic operations. Proposed Changes: Add methods isNaN, isInfinity, isDenormalized and isIEEE to RealFloat: isNaN, isInfinity, isDenormalized, isIEEE :: RealFloat a => a -> Bool In non-IEEE implementations, these all return False. In IEEE implementations, isIEEE returns True and all others return appropriate responses. Add line to definition of showFloat: showFloat x = if isNaN x then showString "<NaN>" else if x == 0 then ... -- remainder of function is unchanged Notes: These changes do not preclude a library for full IEEE compatibility. Such a library is under development. There is also a need for versions of showFloat which provide control over the format used and the number of digits displayed (like the E,F and G format controls in C's printf function). These will be provided in a library. Simplified Lex The lex function is very inefficient and overly complex. Some of this complexity is necessary to ensure that almost any value that is printed out using standard Text instance can be read back in. (Exceptions: functions and infinite data structures.) However, lex is able to parse several tokens that are never produced by these Text instances. Proposed Changes: Remove support in lex for: * Both forms of comment (change to lex) * <- (change to lex) * "^<ctrl-char>" tokens (change to lexLitChar) * octal tokens (change to lexLitChar) * hex tokens (change to lexLitChar) * ASCII abbreviations for control characters (change to lexLitChar) Move isOctDigit and isHexDigit to the same module as functions like isAscii and isControl. Notes: This won't make lex or read any more efficient. The problems are: * lex is built using the same fully backtracking technology as read (it returns a list of alternative parses instead of a single token) but the resulting flexibility doesn't seem to be used anywhere. This imposes a significant overhead with little reward. * lex produces strings rather than tokens. This means that every token has to be identified twice: once in lex where the input list is split into strings representing tokens and once in read where these strings are matched against required input. The most effective way of fixing this is to introduce a new datatype representing each token. * Derived instances of readsPrec (which is called by read) usually have the following form (see Appendix E): readsPrec d r = readCon K1 k1 "K1" r ++ readCon K2 k2 "K2" r ++ ... readCon Km km "Km" r In order to successfully parse a string starting with "Km", readsPrec will call the lex function m times. If lex were separated from read by preprocessing the input into a token stream, lex would only have to be called once. * When parsing a type with infix constructors, the derived instance of readsPrec may relax and reparse the same expression many times. * Though a good start, this doesn't make lex much easier to understand. A more effective specification would be a grammar or regular expression. This would also simplify the task of feeding the specification into a Haskell lexer/parser generator or some similar (more appropriate) implementation technology. * Haskell 1.3 will support hex (and octal) specification of numbers. It isn't clear whether lex should be modified to support this. Undefined Add "undefined" to the prelude. Previous proposals used _ or _|_ as the name. undefined = error "undefined{Prelude}" It is expected that compilers will recognize this and insert error messages which are more appropriate to the context in which undefined appears. The Monad class The Haskell 1.3 IO proposal uses the symbols ">>", ">>=" and "return" for the IO monad operations and a separate proposal provides syntactic sugar to make monadic code even easier to write. The provision of constructor classes makes it possible to define these symbols as methods in a "monad class" allowing the symbols and sugar to be used for any monad. Proposed Changes: Add this class to PreludeCore: class Monad m where (>>=) :: m a -> (a -> m b) -> m b return :: a -> m a with instances for [], Maybe and IO. Notes: Most papers on monads also discuss "monads with zeroes" (examples are "[]" in the list monad, "Nothing" in the Maybe monad and parsing failure in the parsing monad). We have not yet decided whether to place this in the prelude or a standard library. A standard library will provide the usual range of monadic operations (maps, folds, filters, guards, etc). Minor Changes Improve error messages for maximum and minimum: maximum [] = error "maximum{PreludeList}: empty list" maximum l = foldl1 max l minimum [] = error "minimum{PreludeList}: empty list" minimum l = foldl1 min l The error messages for head, tail, init and last use "[]" in their error messages; this will be changes to "empty list" for consistency. Add these equations to end of definitions of take, drop and splitAt. take _ _ = error "take{PreludeList}: negative argument" drop _ _ = error "drop{PreludeList}: negative argument" splitAt _ _ = error "splitAt{PreludeList}: negative argument" Add backquote to list of symbols in definition of isSingle in definition of lex. Remove even/odd from Integral because we can't imagine why they're not just normal overloaded functions. Changes made possible by addition of records: Use records to define the Haskell 1.3 IO types CalenderTime and TimeDiff. The zipWith family of names is being renamed to map2, map3,... The old names will be left in for compatibility. Unfinished Business These issues are still being debated. Naming: are the following names acceptable: succ, diff, fromEnum (maybe ord?), and toEnum. Should there be more IEEE support for standard floats? Are there more monad functions that ought to be in the prelude? The Monad0 class probably should be in the prelude. Useful functions for the IO monad would be too. Perhaps: done = return () when p s = if s then p else done unless p s = when (not p) s guard p = if p then failWith "Guard failure" else done