Haskell-98 specifies that module import cycles work automatically with cross-module type inference.

It has some weird interactions with defaulting and the monomorphism restriction. In Haskell-prime we're planning on removing artificial monomorphism, but defaulting will still be necessary (and can still be set differently per module).
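
To make the interaction concrete, here is a minimal single-module sketch (the names `limit`, `doubled` and `shown` are made up for illustration). It only shows how the monomorphism restriction hands an unsignatured binding over to the module's own default declaration; it does not try to reproduce the cross-module case.

```haskell
-- Per-module defaulting: ambiguous numeric types default to Double here.
default (Double)

-- No type signature on a simple pattern binding, so the monomorphism
-- restriction applies: `limit` gets a single monomorphic type. Nothing in
-- this module pins that type down, so the default declaration decides it.
limit = 10

-- Shares the same monomorphic type as `limit`.
doubled = limit * 2

shown :: String
shown = show doubled
```

Loaded into GHCi, `shown` should come out as "20.0"; with the `default (Double)` line removed, the standard default picks Integer and it becomes "20". Giving `limit` an explicit signature removes the dependence on defaulting altogether.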

Only JHC fully implements the recursive module imports of Haskell-98. GHC and NYhc each have their own proprietary "boot-files" with slightly odd semantics to allow this to work (though the syntax is simple enough). Hugs doesn't support it at all.

I propose we simplify things and lay down some rules, without having to invent explicit module-interface signatures. Then I wouldn't complain (:-)) that GHC doesn't have reasonable support for cyclic modules [1][2]. (Compiler writers will have to give feedback on how plausible this is :-) -- I think GHC and NYhc "should" be able to adapt their boot-interface-file mechanisms to the scheme I'm proposing.)

(This is really more of a sketch than a complete proposal at this stage.)

In particular, I propose an amount of annotation in a module that *shall* make it compile. Compilers are free to accept code for other reasons (e.g. .hs-boot files, or some official module interfaces). These first proposals are clean-ups that reflect how ridiculous people think the current standard's module interface semantics are compared to most languages. Also they make cross-module type inference unnecessary, eliminating the defaulting problem.

namespace level: Haskell98 says that what a module exports is determined by the smallest fix-point of what is possible. I can't see a practical use for this behavior, which is easily confusing. I think that exports that depend on the result of a fix-point should be rejected. It can be useful in module A to import a few types/functions explicitly from a module B that then goes on to export the whole of module A though.

type level: Inside any given SCC (loop) of modules, any function imported from another member of the SCC normally shall have an explicit type signature in the module that exports it. (This doesn't seem a great burden, since type signatures for top-level functions/values are considered good practice anyway. Can anyone think of a use-case where cross-module type inference would be particularly useful?)
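
A minimal sketch of a two-module SCC that satisfies this rule (the modules `A` and `B` and the functions `isEven`/`isOdd` are invented for illustration, and the two modules live in separate files). Each function that crosses the cycle carries a signature in its defining module, so neither module needs inference results from the other. As far as I know, GHC today still rejects this pair without a .hs-boot file; the point is only what the proposed rule would require.

```haskell
-- File A.hs
module A (isEven) where

import B (isOdd)

-- Signature required: isEven is imported by B, a member of the same SCC.
-- Only meaningful for non-negative arguments.
isEven :: Int -> Bool
isEven 0 = True
isEven n = isOdd (n - 1)

-- File B.hs
module B (isOdd) where

import A (isEven)

-- Signature required: isOdd is imported by A, a member of the same SCC.
isOdd :: Int -> Bool
isOdd 0 = False
isOdd n = isEven (n - 1)
```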

Exception: imports may be given the {-# SOURCE #-} pragma. This fulfills two purposes: (1) It is a hint to a compiler that compiles modules separately that the current module should be compiled before the module being imported with {-# SOURCE #-}. Obviously, this can make optimization worse, since it's likely that SOURCE-imported functions won't be strictness-analyzed or inlined or anything; but that's the .hs-boot situation already. (And in principle even a compiler that likes separate compilation could break individual functions down into dependency order to compile them, adding another tradeoff point...) (2) If SOURCE pragmas "break the loop", then only functions that are actually imported with SOURCE must be given type signatures, even if module B then goes on to import module A wholesale. Example:
module A where {import {-#SOURCE#-} B (bf); ...}
module B (module A, module B) where {import A; bf :: ...; ...}

Since defining data types in logical places is an important use of cyclic imports, I propose not to require any extra annotation for them; the compiler will have to chase them down and understand them in loops (how else to do it?). However, there are some particular things to keep in mind regarding potential recompilation:
(with a bit of a GHC bias)
Changing any orphan instances in an SCC will force the whole thing to recompile (but what pluckiness, putting orphan instances *there*!) If a data type or newtype is imported without its constructors, then the RHS changing doesn't really force a recompile. I imagine this could work in GHC by, for each SOURCE import, storing the MD5 of the imported interface. Then when checking if you seriously have to recompile module A, you don't have to if none of those MD5s have changed and none of the non-SOURCE-imported modules' interface MD5s have either. In module cycles that aren't explicitly broken by SOURCEs, GHC (or any compiler) should just insert an implicit SOURCE for *all* cyclic imports (and possibly emit a warning) (unless the compiler wants to guess which SOURCES are better for optimization?). Presumably compilers that can do separate as well as non-separate compilation could take an optimization flag that tells them to compile cycles together as one piece rather than obeying the SOURCES for recompilation efficiency.
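
The recompilation check described above could be sketched roughly as follows. Everything here is hypothetical (the `Recorded` type, `needsRecompile`, and the use of plain strings as stand-ins for MD5 digests); it is not how GHC's recompilation checker is actually structured, just the decision rule written down.

```haskell
type ModuleName = String
type Hash = String  -- stand-in for an MD5 digest

-- Hashes recorded for one module when it was last compiled.
data Recorded = Recorded
  { sourceImports :: [(ModuleName, Hash)]  -- per-import hash of what was SOURCE-imported
  , normalImports :: [(ModuleName, Hash)]  -- whole-interface hash of ordinary imports
  }

-- A module must be recompiled if any recorded hash no longer matches the
-- current one (a module whose hash can no longer be found counts as changed).
needsRecompile
  :: (ModuleName -> Maybe Hash)  -- current hash of the SOURCE-imported slice
  -> (ModuleName -> Maybe Hash)  -- current whole-interface hash
  -> Recorded
  -> Bool
needsRecompile currentSlice currentIface old =
  any (changed currentSlice) (sourceImports old)
    || any (changed currentIface) (normalImports old)
  where
    changed current (m, h) = current m /= Just h
```

The first lookup is specific to the importing module, since under this scheme the hash covers only what that module actually imports with SOURCE.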

So what does the compiler have to look at in a SOURCE-imported module?

In the case of the proposed SOURCE imports without hs-boot files, GHC would move from calculating one interface (MD5) per module (or two interfaces in the case of .hs-boots), to one-per-import. I think this is, in principle, an advantage, although it does require more re-scanning when files are changed (only lexer/parser/renamer/module-chaser work). For example, I've found myself adding to .hs-boot files for the purpose of one module that SOURCE-imports the .hs-boot, which forces the recompile of another module that happens to depend on the .hs-boot too. To replicate the current GHC .hs-boot behavior (in which the hash-recalculation is shared among SOURCE-importers), one could replace an X.hs-boot file with an X_boot.hs file that contains:
        module X_boot (module X) where
        import {-# SOURCE #-} X (list of things exported
                               by the old .hs-boot file)
, and in other modules, replace
        import {-# SOURCE #-} X (....)
with
        import X_boot (....)

Taking .hs-boot docs as a guide [2], the compiler must look in SOURCE-imported modules for:

- if an import list is given explicitly, `B (....)` not `B hiding (....)` or `B`, the export list only needs to be *checked* to make sure it exports the requested things, not remembered. Exception: data or class imported with `Name(..)` must remember exactly which constructors/members were exported. It's recommended to specify exactly what you're importing.
- function type signatures
- imports of functions, types, etc. If it's imported from outside the SCC, it doesn't need a type signature/whatever. If it's defined somewhere within the SCC, it generally does need a type signature.
- fixity declarations, which only have to be imported in conjunction with the corresponding functions/constructors/whatever
- data type / newtype declarations. When no constructor is imported, only the *kind* of the data type needs to be recorded, which might have to involve inference on the RHS (possibly involving more import chasing) if there aren't explicit kind annotations for *every* type parameter.
- type synonym declarations. The whole thing has to be imported, including RHS.
- classes. Including superclasses, class-method signatures, and default methods? Is there some way that GHC manages to allow not declaring all of these in .hs-boots?
- instances, whether generated by 'deriving', 'deriving instance', or ordinary 'instance'; everything before the "where" clause of 'instance's is relevant. But an instance is only relevant if it's orphan, or if it goes with a data or class that's also being imported.
- the compiler-specific RULES pragmas probably follow similar mandates as above for instances and for the functions referenced in the RULE.
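
To illustrate the point about kinds in the data type item above, here is a small sketch (the types `Wrap` and `Wrap'` are invented, and it uses GHC's KindSignatures extension and `Data.Kind`, which are assumptions about the compiler rather than part of the proposal). With every parameter annotated, the kind can be read off the declaration head; without annotations, it has to be inferred from the right-hand side.

```haskell
{-# LANGUAGE KindSignatures #-}

import Data.Kind (Type)

-- Every parameter annotated: the kind (Type -> Type) -> Type -> Type is
-- visible from the head alone, without looking at the constructors.
data Wrap (f :: Type -> Type) (a :: Type) = Wrap (f a)

-- No annotations: the same kind must be inferred from the use `f a`
-- on the right-hand side.
data Wrap' f a = Wrap' (f a)

annotated :: Wrap Maybe Int
annotated = Wrap (Just 1)

inferred :: Wrap' Maybe Int
inferred = Wrap' (Just 2)
```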

[1] my official "complaint": http://hackage.haskell.org/trac/ghc/ticket/1409
[2] the GHC .hs-boot docs: http://www.haskell.org/ghc/docs/latest/html/users_guide/separate-compilation.html#mutual-recursion