This message makes some proposals about the module system
for Haskell 1.3. I'm circulating it to the full Haskell list
in the hope that we may get some new ideas that way.
One thing I would really like to know from the Great Haskell Public
is how pressing a problem the module system is. How bad would it
be to do nothing?
Simon
Background
~~~~~~~~~~
There has been quite a bit of debate in the Haskell 1.3 design
about the Haskell module system. The biggest complaints are these:
* Since interface files have to contain complete details about
everything they export, they become very bulky, and hence they
become almost impossible to write by hand.
* They have evolved into a way for the compiler to convey information
across module boundaries. An interface initially written by a
programmer is soon overwritten by the compiler!
In short, Haskell has no decent way in which a programmer can express
exactly what a module exports. It should be possible to do so, and
such a thing should be considered inviolate source code (like a type signature).
Yale have proposed some changes in 1.3 to address this situation,
which you can find in the 1.3 draft. I don't fully understand exactly
what is proposed, and if we are to make any changes at all I would
like them to move us to a stable position.
A proposal
~~~~~~~~~~
So, I tried to think about what a human-writable interface should be
like. Below I give an extended example of what I think it should be
like (borrowing heavily from Modula and Ada which, of course, have
thought through all this ages ago). This proposal has the following
characteristics:
* Interfaces are 100% human-written. If the compiler wants to have
extra info for separate compilation purposes then that's entirely a
separate matter.
* All information appears in exactly one place. If a module imports
something and re-exports it, then (as John P suggests) all that appears
in the interface is an import statement for that thing.
* Nor is there duplication of material between interface
and implementation.
For example, if an algebraic data type is exported with its
constructors, then the data decl occurs in the interface and *not* in
the impl. This is very important if interfaces are to be human written.
My data type decls get big and heavily commented. I don't want to
duplicate them.
* The language definition should not say whether an interface lives in the
same file as its implementation, but compilers should probably allow it
to do so.
* As usual, instances are a problem. The same unsatisfactory rules as now
will probably do.
The bottom line
~~~~~~~~~~~~~~~
Here are the alternatives that seem viable to me:
a) Do nothing at all. Change is expensive, both for users and
implementors.
b) Do something like what I propose here (i.e. rather more upheaval
than Yale is proposing, but with a more consistent result).
c) Do higher-order modules with records a la Mark Jones.
(c) is a long-term matter, so it's really (a) or (b).
============================ A commented example =====================
interface A where
import B(T1, b1, b2)
import C(T2(..))
-- As John proposes, these imports simply identify modules that export
-- the things mentioned... they might be re-exporting them, though, rather
-- than being the definition sites.
-- There is no further definition of T1, T2, b1, b2 in this interface,
-- (unlike Haskell 1.2 interfaces).
-- All these imports are implicitly exported from A
-- (if you don't want that, import them in the "module" part).
-- IMPORTANT: The imports must close the interface, all by themselves.
-- This constraint is necessary in order to allow us to figure out
-- what (say) "T2" is.
--
-- This closure constraint is simple (same as when compiling a module);
-- it is not onerous (because you don't need to write out T2's data decl);
-- and, since the interface imports are automatically in scope in the
-- implementation part, it only amounts to moving imports from the
-- implementation part to the interface part.
--
-- NB: the closure constraint refers only to things mentioned *explicitly*
-- in this interface. For example, if b1 (imported from B) has
-- type X->X, there is no requirement
-- that X also be imported. That's another important way in which the
-- closure constraint is less onerous than the present system.
data S1 a = C1 a | C2 -- A data type defined by A, and exported
-- with its constructors
data S2 :: Type -> Type -- Two data or newtype types defined by A, and
data S3 :: Type -- exported *without* their representations
type K1 a = a -> a -- A type synonym
class C a where -- A class defined by A
op1 :: a -> a
op2 :: Int -> a
instance C T1 -- An instance decl made in A
f :: S1 a -> T1 -- A function defined by A
==============================
module A where -- An implementation of A
-- All the imports and declarations of A's interface are
-- automatically in scope here; when compiling module A,
-- A's interface is (more or less) concatenated onto the
-- front. So the imports in the module need only
-- import stuff which isn't already imported by the interface.
-- Furthermore, we don't re-declare data type S1, type synonym K1
-- or class C, which are already fully defined in A's interface
import B(T1(..), b3) -- T1 was only visible without its constructors in
-- the interface; here we get its constructors too.
data S2 a = D1 a | D2 -- "Fills in" the data decl for S2
newtype S3 = S3 Int -- Ditto for S3
data S4 = S4 Int -- S4 doesn't appear in the interface, so is
-- completely local
f x = b3 x x -- Decl for f, which is exported by A
f' p q r = ... -- A completely local decl
instance C T1 where -- Fills in the instance decl
op1 x = ...
op2 y = ...
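==============================
For comparison, here is a rough sketch of how the same declarations look
when interface and implementation are merged into one ordinary Haskell
source file, with an export list doing the job of the interface. It is an
illustration only, not part of the proposal. Assumptions: T1 and b3 are
hypothetical stand-ins for what module B would supply (the example above
never defines them), the bodies of op1 and op2 are invented because the
example elides them, f is given the type S1 a -> T1 so that it kind-checks,
and b1, b2, T2 and f' are left out. The module header is shown only as a
comment so that the file loads on its own.

```haskell
-- In a real program this file would begin with the header
--   module A (T1, S1(..), S2, S3, K1, C(..), f) where
-- and T1, b3 would arrive via:  import B (T1(..), b3)
-- The export list is what plays the role of the interface.

-- Hypothetical stand-ins for module B (not defined in the original).
newtype T1 = T1 Int deriving (Eq, Show)

b3 :: a -> a -> T1
b3 _ _ = T1 0

-- Exported with its constructors: S1(..) in the export list.
data S1 a = C1 a | C2

-- Exported abstractly: only the names S2 and S3 are in the export
-- list, so clients cannot see D1, D2 or the constructor S3.
data S2 a = D1 a | D2
newtype S3 = S3 Int

-- A type synonym.
type K1 a = a -> a

-- A class defined here and exported with its methods: C(..).
class C a where
  op1 :: a -> a
  op2 :: Int -> a

-- Instances cannot be named in an export list; they are always exported.
instance C T1 where
  op1 (T1 n) = T1 (n + 1)  -- invented body; the original elides it
  op2 n      = T1 n        -- invented body; the original elides it

-- Not in the export list, so completely local.
data S4 = S4 Int

-- Exported function.
f :: S1 a -> T1
f x = b3 x x
```

Loaded into an interpreter, f C2 evaluates to T1 0 under these stand-in
definitions. Note what the sketch cannot do: a reader of the export list
learns only the names S1, S2, K1 and so on, and must read the module body to
find their declarations. The proposed interface would put those declarations
in the one human-written place.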