Introducing Haskell 1.3

This new version of the Haskell Report adds many new features to the
Haskell language.  In the five years since Haskell has been available
to the functional programming community, Haskell programmers have
requested a number of new language features.  Most of these features
have been implemented and tested in the various Haskell systems and we
are confident that all of these additions to Haskell address a real
need on the part of the community.  This revision to the Haskell
report is much more substantial than previous ones: many significant
additions are being made.  We have also streamlined some aspects
of Haskell, eliminating features which have been little used and
complicate the language.

The final version of the Haskell 1.3 is expected to be complete in
October, 1995.  A preliminary version of the report will be available
soon.  All significant changes to the Haskell language, as well as
their motivation, are described here.  We are still open to comments
and suggestions; please send mail to [EMAIL PROTECTED]
regarding Haskell 1.3.  I will be happy to answer any questions or
forward mail to either the Haskell mailing list or the 1.3 committee,
as appropriate.  Information about the design of Haskell 1.3 and other
proposed extensions to Haskell is available on the web at 

http://www.cs.yale.edu/HTML/YALE/CS/haskell/haskell13.html

There will be some minor incompatibilities with Haskell 1.2.  These
should not be serious and implementors are encouraged to provide a
Haskell 1.2 compatibility mode.



Overview


Haskell 1.3 introduces the following major features:
 * Standardized libraries and a reduced prelude
 * Constructor classes (as in Gofer)     
 * Monadic I/O                                 
 * Strictness annotations in type definitions  
 * Simple records                              
 * A new type mechanism               
 * Special monad syntax (`do')                 
 * Qualified names                             
 * All names are now redefinable
 * The character set has been expanded to ISO-8559-1

Many other smaller changes to Haskell 1.2 have also been made.  A
complete description of new, changed, and eliminated features follows.


Prelude Changes

Haskell 1.3 will make a number of minor changes to the standard prelude.
Many prelude functions will be moved to libraries, reducing the size
of the Haskell core language.  These changes will be described separately.


Standard Libraries


As Haskell has grown, many informal libraries of useful functions have
been created.  In Haskell 1.3, we have decided to standardize a set of
libraries to accompany the core language.  Some of the functions
formerly in the prelude are now in libraries, decreasing the size of
the core language and giving the user more names in the default
namespace.  We are dividing the Haskell report into two separate
documents: a language report and a library report.  The prelude, now a
little smaller, will be described in the language report.  The library
report will continue to evolve after the 1.3 language report is complete.
We have moved much of the I/O, complex and rational arithmetic, many
lesser used list functions, and arrays to the libraries and also
developed a number of completely new libraries.  An initial Haskell
library report will be available at the same time as the 1.3 language
report.



Constructor Classes


We have observed that many programmers use Gofer instead of Haskell
to use Gofer's constructor classes.  Since constructor
classes are well understood, widely used, and easily implemented we
have added these to Haskell.  Briefly, constructor classes
remove the restriction that types be `first order'.  That is, `T
a' is a valid Haskell type, but `t a' is not since
`t' is a type variable.
Constructor classes increase the power of the class system.  For
example, this class definition uses constructor classes: 

  class Monad m where
    (>>=) :: m a -> (a -> m b) -> m b
    return :: a -> m a

Here, the type variable `m' must be instantiated to a polymorphic data
type, as in

  instance Monad [] where
    f >>= g = concat (map g f)
    return x = [x]

No changes to the expression language are necessary; constructor
classes are an extension of the type language only.

Constructor classes require an extra level of type information called
`kinds'.  Before type inference, the compiler must perform kind
inference to compute a kinding for each type constructor.  Kinds are
much simpler than types and are not ordinarily noticed by the programmer.

The changes to Haskell required to support constructor classes are:

 * The syntax of types includes type application.
 * Built-in types have names: [] for lists, (->)
     for arrow, and  (,) for tuples.  Using type application,
     the type `(,) a b' is identical to `(a,b)'.
 * Type constructors (but not type synonyms) can be partially applied.
 * Type variables in interface files may be annotated with a kind.
     This will not affect any type variables used by Haskell 1.2
     programs.
 * The basic monadic operations: >>, >>=,
     and return, will be placed in the Monad class.
     All monads will share the same basic operations.




Monadic I/O


Monadic I/O has already become the de-facto standard in the various
Haskell systems.  We have chosen a fairly conservative, but extensible
basic design (an IO monad with error handling), 
the details of which are documented at
  http://www.dcs.gla.ac.uk/~kh/Haskell1.3/IO.html
This hasn't changed recently.
The old I/O systems (continuation & stream) have been totally
scrapped.  Most of the I/O system has been moved out to 
libraries.

The addition of  monad syntax  should improve the improve the
comprehensibility of programs that make heavy use of I/O. 


Strict Constructors

To achieve reasonable efficiency, Haskell programmers often use
implementation-specific annotations to mark fields of data structures as
strict.  These strict components are evaluated when the structure is
created instead of delayed until demanded.  This avoids the extra
overhead involved with delays and can result in much more compact
structures.  To help improve efficiency, we now supply a
simple strictness annotation on the types in a data declaration (or
record declaration - see below).  Using `!' in front of a type in a
data declaration marks a structure component as strict.  (The `!'
symbol is not a keyword; it has no special meaning outside data
declarations).  Only top-level types may be marked as strict; using
`!' elsewhere in a type is not allowed.  Any type in a structure
component may be marked as strict.

  data Foo a = F Int !Bool a a

This declares F as a constructor containing four fields: a
(non-strict) Int, a strict Bool and two (non-strict)
polymorphic fields.


Fully polymorphic strictness implies that values of any type may be
evaluated when placed into a strict data structure.  There are
compelling arguments for preventing functions from being evaluated in
this manner.  Without going into detail about the problems associated
with forcing functions, we have chosen to prevent function
forcing by using a special class, Data, to explicitly mark type
variables which must not be instantiated as functions.  Every data
type, except ->, is a member of the Data class.
If we want to make a polymorphic component strict, we must add
a "Data" context to the declaration:

  data Data a => Foo a = F Int !Bool !a a


The prelude will use strictness annotations in the definition of
Ratio and Complex, making these numeric types much
more efficient.   To avoid unnecessary propagation of the Data class,
it is a superclass of Num.


The functions `strict' and `seq' have been added to
the prelude to give the user additional control over evaluation.
While no strictness annotations have been proposed for functions,
these primitives can be used to implement strict evaluation when needed.

seq :: Data a => a -> b -> b
strict :: Data a => (a -> b) -> (a -> b)
strict f = \x -> seq x (f x)

The seq function evaluates its first argument before
returning the second one.  strict turns an ordinary function
into a strict one.

Strict data constructors accumulate all arguments before evaluating
them:

data R = R !Int !Int

R x y = seq x (seq y (makeR x y)) -- just to show the semantics of R

y = R undefined  -- Not an error 

 
Haskell 1.3 will not provide strictness annotations for functions.
These can be constructed explicitly using seq if needed.  We
are not sure how strictness should interact with partial application.



Records


The individual components of data types in Haskell 1.2 are completely
anonymous: the type definition only names the type and the
constructors, not the fields of a constructor.  It is 
inconvenient to select from, construct, or modify values associated with
a constructor which has many components.  A number of more general
record structures have been proposed and implemented.  After much
discussion, we have decided to avoid the issues of inheritance or
object oriented programming for the moment and provide a simple syntax
which will allow the fields of a single-constructor data type to have
names.  Selection, construction, and update operations which use field
names are provided.  The layout rule applies to all uses of
{;} in the following syntax.

Records are declared using a new type declaration:

  topdecl -> data [context =>] simple = con with {sig1;sig2; ...}  [deriving classes]
  sig -> var1, ..., varn :: [!]type    n >= 1

Example:

  date Point = Point with pointX, pointY :: Float deriving Text

This names a type, Point, a constructor, also Point,
and two fields: pointX and pointY.  Field names are
in the same namespace as other Haskell values; a field may not share a
name with other field names or variables in scope.  At present,
records are restricted to single constructor data types.

The field labels in the record can be used as selection functions.  In
this example, the functions

  pointX, pointY :: Point -> Float

can be used to access the fields of a point.

Construction of a new record uses the constructor and a
`with' clause to initialize fields.  Any fields which have
been marked as strict must be mentioned during construction.
Non-strict fields are initialized to bottom if not mentioned.

  exp -> con with {fdef1,...,fdefn}    n >= 0
  fdef -> var := exp

Example:

  p = Point with 
           pointX := 1
           pointY := 2

The with construct can also be used to (non-destructively) update
record values:

  exp <- exp with {fdef1,...,fdefn}    n >= 1

Example:

  q = p with pointX := pointX p + 1


The `:=' operation can also be used by itself as an anonymous
update function:

  exp -> var := exp

Example:

  f :: Point -> Point
  f = pointX := 1


Record types cannot be referenced in the pattern language.

Aside from the special syntax needed to access record fields, records
are ordinary algebraic data types.

The design of a record system for Haskell has been difficult.  There
are many obvious and useful extensions to the basic notion of records
which have been considered.  In the end, instead of endorsing any more
advanced record features (inheritance, object oriented programming,
defaulting field values on construction, enhanced polymorphism,
pattern matching) we have decided to introduce the bare minimum of new
functionality to the language.


New Types

Haskell users have often needed to declare a new name for an existing
type.  There were two ways of doing this: declaring a synonym, as in

  type Foo = Int

or creating a type wrapper, as in

  data Foo = Foo Int

This first approach has no runtime overhead, but does not distinguish
the new type from the old.  More seriously, instances cannot be attached to
the new type, only the old one.  The other approach incurs an 
additional overhead in the representation.

An additional type declaration will provide a more efficient way of
creating a new type from an existing one:

  newtype context => simple = con type [deriving classes]

Examples:

  newtype Foo = Foo Int deriving Text
  newtype Bar a = Bar [a] deriving (Text,Eq)
  newtype LongName = L Int

This new type is used in the same manner as a wrapper type and,
similarly, the type name need not match the constructor name.  The
constructor is used for explicit coercion between the new and existing
types.  Derived instances simply re-use instances already attached to
the existing type (except for Text, which prints the
constructor as it would for a wrapper type).

At first glance, it looks as though the newtype "Foo" could have been
defined using normal data types and strictness annotations.

  data Foo = Foo !Int deriving Text

The difference between the two is rather subtle (but important):
 * Using strictness annotations, evaluation of the field occurs when a
   Foo is created.  That is:

  case Foo undefined of Foo _ -> True = undefined

 * Using newtype, evaluation of the field occurs when the field is
   used.  That is
  
  case Foo undefined of Foo _ -> True = True

 *It is perfectly legal to write:

  newtype ST s a = ST (s -> (s,a))

but not to write:

  data    ST s a = ST !(s -> (s,a))

since that would require a Data context for function spaces.



Monad Syntax

To make monadic programming more readable, `do' syntax has been added
to Haskell:

  Syntax:                            Expansion:
  exp  -> do {stmt}
  stmt -> exp                        exp
  stmt -> pat <- exp ; stmt         exp >>= \pat -> do stmt
  stmt -> exp ; stmt                 exp >> do stmt
  stmt -> let decls ; stmt           let decls in do stmt

The Haskell layout rules already allow you to write:

  let 
    x = 1
    y = 2
  in x + y

as an abbreviation for:

  let { x = 1; y = 2 } in x + y

By the same rules, it is possible to write:

  do 
    x <- getInt
    y <- getInt
    return (x + y)

as an abbreviation for:

  do {
       x <- getInt;
       y <- getInt;
       return (x + y)
     }

C programmers might prefer the latter.

As in a list comprehension, variables bound by patterns are scoped
over the following statements in the do.  Unlike list
comprehensions, pattern match failure results in a runtime error.  The
do has the same type as the last statement.

A let in a do is scoped over all remaining statements.  Note there is
no in; this distinguishes this from an ordinary let. 
 
Haskell 1.3 omits several features from earlier proposals and
the Gofer implementation.  In particular, there is no support for "guards".

  do 
    x <- foo
    if even x
    bar


The problem is that in some monads, guard failure should yield a zero
while in others it should cause an exception or even an error (with
an appropriate error message).  Fortunately, there's no need to wrestle
with these thorny semantic issues since it's trivial for programmers to
define and use their own "guard functions".


 -- raise an error if condition fails.
 assert :: Monad m => Bool -> String -> m ()
 assert p msg = if p then return () else error ("Assertion failed: " ++ msg)

 -- return a zero if condition fails.
 guard :: MonadZero m => Bool -> m ()
 guard p msg = if p then return () else zero

  do 
    x <- foo
    assert (even x) "foo didn't return an even result at line 32 of Bar.lhs"
    y <- bar x
    guard (x `elem` y)
    return y


Qualified Names

Current Haskell practice is for all outside names to be brought into
scope using `import'.  This has some disadvantages: renaming
is required to resolve name clashes between different modules and the
names themselves provide no hint of where they come from.  In the
worst case (infrequently used names which have been imported without
explicitly listing what is being imported), this forces programmers to
use grep (or similar technology) to locate the definition of the name.

Qualified names are built from a module name and a local name,
separated by a `.'. Thus, 
`Prelude.foldr' refers to the foldr exported from
Prelude.  (Note that qualified names are not the same as
`original names': an 
original name uses the defining module; here, the exporting module is
named).  Qualified
names have two advantages:

 * Using qualified names prevents or resolves name clashes between
   different modules.
 * Qualified names result in more readable code since import declarations
   need not be consulted to find the defining module for a name.


A qualified name can be used for any entity.  Short names, omitting
the module and separating `.' can be used only for locally defined
names, or for entities which are imported unqualified (i.e. ones
mentioned in an import declaration which does not use the "qualified"
keyword).

For example, the following import declaration brings a module name into scope,
but only for use with qualified names.  The module may be locally
renamed, allowing 
a shorter or more descriptive module name to be used in the importing module,
or allowing the
imported module to change with minimal effect on the program.

  topdecl -> import qualified con [as con]

Only modules brought into scope using a qualified import can appear in
qualified names. 

Qualified names are defined in the lexical syntax.  Thus,
`Foo.a' and `Foo . a' are quite different.  No whitespace is permitted in
a qualified name.  Symbols may also be qualified: `Prelude.+' is
an operator which can be used in exactly the same manner as `+'.

Note that the `.' operator presents a syntactic problem.  It causes
`Foo..', which is widely used in export lists and occasionally in
arithmetic sequences, to parse as a qualified name instead of a usage
of the `..' token.  Inserting a space before the
`..' will resolve this problem.


Redefinable Names


A minor complaint about Haskell has been that names defined in
PreludeCore cannot be redefined in any way.  Since many operators,
like +, -, >, and ==, are defined in PreludeCore this has made some
users deeply unhappy --  this restriction prevents any sort of
alternative numeric class structure which uses the standard symbols,
for example.  There is no real reason for stealing all these names
>from the user; in Haskell 1.3 PreludeCore is no longer special: all
ordinary names are redefinable.  Special syntax will remain attached
to Prelude names; thus `[x]' would always refer to lists as defined in
the Prelude. 

As before, the Prelude module is implicitly
imported (in unqualified form) unless an explicit import is found.
There is also an implicit (and unavoidable) qualified import of the
Prelude which is used to define the meaning of various pieces of
syntactic sugar.  This eliminates the need to make PreludeCore symbols
immutable.
Since qualified names can always be used for imported entities,
any Prelude entities that the programmer has chosen to shadow
can still be referred to using `Prelude.'.



Expanded Character Set

This change in character set supports both an expanded Char type at
execution time and a greatly increased vocabulary for constructing
program names.  Many new characters have been added to `small',
`large', and `symbol' in the lexical syntax.  The
revised report will have full details.


Other Changes to Haskell

C-T Rule relaxation

The C-T rule restricts the placement of instance declarations to the
module containing either the class or data type definition associated
with an instance.  This is unnecessarily restrictive and will be
relaxed in Haskell 1.3.  

The new restriction will be that no more than one instance for a given
class - datatype pair is allowed in a program.  (Sadly, there's little
hope of detecting violations of this rule before link time.)

The visibility of instance declarations presents a problem.  Unlike
classes or types, instances cannot be mentioned explicitly in export
lists.  Instead of changing the syntax of the export list, we have adopted a
simple rule: when a module exports a class or a type, any associated
instances in scope are also exported.  Thus, the module

  module I where
  import M(C,T)

  instance C T where ...

would not export the instance C T.  On the other hand, adding
either C or T to the export list of I would
cause the instance to be exported from I.  Any module
importing C or T from I would have this instance 
declaration in scope.  This will not break any existing applications
since instances must appear in the module defining C or
T and will thus be exported with them if they are used elsewhere.


Polymorphic Recursion

This is a simple extension to the type system which allows the user to
attach a more general type to a recursive function than would be
normally be inferred.  For example, in

  f x y = if f True False then x else y

the type of f would normally be inferred as `Bool -> Bool -> Bool'.
However, polymorphic recursion allows the user to add a type signature
to f such as `f :: a -> a -> a' to give it a more general type.  While
polymorphic recursion is not likely to be needed by the average
Haskell user, it occasionally will prevent the type checker from
complaining about a quite sensible looking program.


Interface Files

The current report requires that interface files be transitively
closed and that the original names of all entities be supplied.  Such
interfaces are appropriate for compilers to generate, but are not
necessarily useful to the programmer.  Although individual compilers
can still support the old style of interface, the report will be
changed to allow interfaces which use exactly the same imports found
in the implementation.  Only locally defined entities need appear in
these interfaces.

 
List Comprehension Let Bindings

The syntax of list comprehensions is being expanded to include

  qual -> let {decls}

(The absence of `in' distinguishes this from a "let"-expression in a
guard).  These decls are scoped over the remaining qualifiers and the
generated expression.  Binding is irrefutable: pattern match failure
is a program error.  The expansion of this is trivial: it expands
directly into a conventional let.


Standard Annotations

A number of pragmas, already widely used, will be standardized.
Currently, the proposal calls for

  {-# Inline f #-}
  {-# NotInline f #-}
  {-# Specialize f :: type #-}

Exact syntax has not yet been worked out.  

Minor Syntax

Minor syntax changes include:

 * There is a new syntax for hex and octal constants
 * The interaction of layout and explicit braces has been clarified
 * The types listed in a default declaration must always be in parentheses
 * Empty export lists are now allowed
 * Presymbols are gone; this allows "~" and "-" to be embedded in a symbol
 * Modules with an omitted module name export only `main'
 * Extra commas are allowed in import / export / hiding lists
 * Interfaces cannot use `deriving'
 * \begin{code} - \end{code} is now allowed in literate files 
      as an alternative to ">" ("Bird tracks")


Features Removed from Haskell

A number of features have been removed from Haskell.

Import declarations no longer have a renaming clause.  Qualified
names should be used to handle name clashes.  This also removes the
need for the `single name' rule (see Section 5.1.2 of the report).

n+k patterns are being removed from the language.  While this will no
doubt break some existing code, it removes a highly irregular case in
both the syntax and semantics.  (The convoluted decl syntax came as a
result something like `x+1 = 2', which could either be an
infix definition of + or a pattern binding of x to
1.  The parse for  `x+1 = 2' and `(x+1) = 2' is completely different
at the moment!)  


 Unfinished Business 
  
These issues are still being debated.

 * The syntax for `newtype' is not certain.  It would be possible to
   fold this into an ordinary data declaration:

current:   newtype N a = N (T a)
proposed:  data N a as N (T a)  --? not sure of this syntax

   Separate keywords would perhaps be less confusing.  On the other hand,
   there is no distinction between these sorts of declarations in the
   type language.
 * Records could be 
   allowed for any constructor of a multi-constructor type.  If
   multi-constructor types are used, pattern matching must be able to (at
   least) discriminate on the constructor:

  data R = R1 with f1 :: Int |
           R2 with f2 :: Bool

  f :: R -> Int
  f x = case 
         R1 -> = f1 x   -- No fields allowed in pattern match
         _ -> 0

  f :: R -> Int
  f (R1 with f1 <- f) = f  -- Allows pattern match against fields
  f _ = 0

 * Record updates could be either strict or lazy:

record Point = Point with pointX, pointY :: Int
a = pointX (undefined with pointX <- 1)

With a lazy implementation, a is 1, while with a
strict implementation a is undefined.   The lazy semantics
allow records to be created by adding slots to a totally uninitialized
value (undefined) while, to accomodate strict slots, the strict
semantics require an explicit record creation construct.
Mark Jones sums up the debate thusly:

 Lazy semantics                      Strict semantics
 --------------------                ----------------

 Point is a first class value,       Point with ... is a language
 the uninitialized record            construct for building new points

 expr with fields is a construct     expr with fields is a construct
 for updating a record, non-strict   for updating a record, strict in
 in the expr part.                   the expr part.

 Arguably more in-line with          Arguably, not an issue.
 non-strict semantics of Haskell.

 Performance hit in naive            Easier to obtain a relatively
 implementations.                    efficient implementation.

At the naive implementation level, lazy semantics require fresh thunks
for all slots not being updated.  With the strict semantics, thunks in
the old record can be recycled.  Implementors wonder whether the naive
implementation can be easily optimized.

 * The field name namespace issue is (perhaps) unresolved.  The
   other candidate is ML style localized namespaces for field names.
   This would allow for simpler field names (x instead of
   pointX) but would make type inference more difficult.  Type
   hints may be required to disambiguate field names.
 * A qualified name could be made available either through an
   explicit qualified import or via any mention of a module in an import.
   Consider:

module Foo where
import M(x)
y = M.z + x

   If explicit qualified imports are required, this would be an error
   since there is no import qualified M(z) or just import
   qualified M.
 * Instead of a separate qualified import, module could be imported
   with an empty list of entities:

import M()

   This would allow M to be used as a qualifier.
 * There is a proposal in the works to make export lists more
   explicit, as in

module E (
        module M(..),
        class C,
        type T(..),
        instance M.C E.T,
        N.v,
        x
        ) where ...

   This would avoid the syntax problem associated with M.. at
   the moment.  It would also allow the class and type namespaces to
   overlap. 
 * Pattern match failure inside do could be
   a "zero" in a monad with zero instead of an error.  Unfortunately, the
   IO monad does not have a true zero ( failWith "error" doesn't
   quite work).  It would be possible to allow do syntax to be used
   with both Monad and Monad0 by eliminating or
   disallowing refutable patterns when the monad is not in
   Monad0.  Thus,  

do (x:xs) <- readLine
   return xs

   would result in a type error since the pattern (x:xs) is refutable.
   Using ~(x:xs) would work.



Reply via email to