Mutually-recursive/cyclic module imports

Isaac Dupree Fri, 15 Aug 2008 06:27:15 -0700

Haskell-98 specifies that module import cycles work automatically with cross-module type inference.

It has some weird interactions with defaulting and the monomorphism restriction. In Haskell-prime we're planning on removing artificial monomorphism, but defaulting will still be necessary (and can still be set differently per module).

Only JHC fully implements the recursive module imports of Haskell-98.

GHC and NYhc each have their own proprietary "boot-files" with slightly odd semantics to allow this to work (albeit the syntax is simple enough).

Hugs doesn't support it at all.

I propose we simplify things and lay down some rules, without having to invent explicit module-interface signatures. Then I wouldn't complain (:-)) that GHC doesn't have reasonable support for cyclic modules [1][2]. (Compiler writers will have to give feedback how plausible this is :-) -- I think GHC and NYhc "should" be able to adapt their boot-interface-file mechanisms to the scheme I'm proposing.)

(This is really more of a sketch than a complete proposal at this stage.)

In particular, I propose an amount of annotation in a module that *shall* make it compile. Compilers are free to accept code for other reasons (e.g. .hs-boot files, or some official module interfaces). These first proposals are clean-ups that reflect how ridiculous people think the current standard's module interface semantics are compared to most languages. Also they make cross-module type inference unnecessary, eliminating the defaulting problem.

namespace level: Haskell98 says that what a module exports is determined by the smallest fix-point of what is possible. I can't see a practical use for this behavior, which is easily confusing. I think that exports that depend on the result of a fix-point should be rejected. It can be useful in module A to import a few types/functions explicitly from a module B that then goes on to export the whole of module A though.

type level: Inside any given SCC (loop) of modules, any function imported from another member of the SCC normally shall have an explicit type signature in the module that exports it. (This doesn't seem a great burden, since type signatures for top-level functions/values are considered good practice anyway. Can anyone think of a use-case where cross-module type inference would be particularly useful?)
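To make the "SCC (loop) of modules" rule concrete, here is a minimal sketch that finds the import loops in a module graph and lists the imports that stay inside a loop, i.e. the ones whose functions would need explicit type signatures under this proposal. The module names and the `importGraph` table are made up for illustration; `stronglyConnComp` comes from `Data.Graph` in the containers package. This is not how any particular compiler does it.

```haskell
import Data.Graph (SCC (..), stronglyConnComp)

type Module = String

-- Hypothetical import graph: each module paired with the modules it imports.
importGraph :: [(Module, [Module])]
importGraph =
  [ ("A", ["B", "Util"])
  , ("B", ["A"])
  , ("Util", [])
  , ("Main", ["A"])
  ]

-- Group modules into strongly connected components.
-- A CyclicSCC is an import loop; an AcyclicSCC is a module outside any loop.
moduleSCCs :: [(Module, [Module])] -> [SCC Module]
moduleSCCs g = stronglyConnComp [(m, m, deps) | (m, deps) <- g]

-- Imports (importer, imported) where both ends lie in the same loop.
-- Under the proposal, functions crossing these edges need type signatures
-- in the exporting module (unless the loop is broken with SOURCE imports).
intraCycleImports :: [(Module, [Module])] -> [(Module, Module)]
intraCycleImports g =
  [ (m, d)
  | CyclicSCC ms <- moduleSCCs g
  , m <- ms
  , Just deps <- [lookup m g]
  , d <- deps
  , d `elem` ms
  ]

main :: IO ()
main = mapM_ print (intraCycleImports importGraph)
```

In this made-up graph only the A/B edges are reported; the imports of `Util` and the import from `Main` lie outside the loop and carry no extra annotation burden.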

Exception: imports may be given the {-# SOURCE #-} pragma. This fulfills two purposes:

(1) It is a hint to a compiler that compiles modules separately that the current module should be compiled before the module being imported with {-# SOURCE #-}. Obviously, this can make optimization worse, since it's likely that SOURCE-imported functions won't be strictness-analyzed or inlined or anything; but that's the .hs-boot situation already. (And in principle even a compiler that likes separate compilation could break individual functions down into dependency order to compile them, adding another tradeoff point...)

(2) If SOURCE pragmas "break the loop", then only functions that are actually imported with SOURCE must be given type signatures, even if module B then goes on to import module A wholesale. Example:

module A where {import {-#SOURCE#-} B (bf); ...}
module B (module A, module B) where {import A; bf :: ...; ...}

Since defining data types in logical places is an important use of cyclic imports, I propose not to require any extra annotation for them; the compiler will have to chase them down and understand them in loops (how else to do it?). However, there are some particular things to keep in mind regarding potential recompilation:

(with a bit of a GHC bias)

Changing any orphan instances in an SCC will force the whole thing to recompile (but what pluckiness, putting orphan instances *there*!)

If a data type or newtype is imported without its constructors, then the RHS changing doesn't really force a recompile.

I imagine this could work in GHC by, for each SOURCE import, storing the MD5 of the imported interface. Then when checking if you seriously have to recompile module A, you don't have to if none of those MD5s have changed and none of the non-SOURCE-imported modules' interface MD5s have either.

In module cycles that aren't explicitly broken by SOURCEs, GHC (or any compiler) should just insert an implicit SOURCE for *all* cyclic imports (and possibly emit a warning) (unless the compiler wants to guess which SOURCES are better for optimization?). Presumably compilers that can do separate as well as non-separate compilation could take an optimization flag that tells them to compile cycles together as one piece rather than obeying the SOURCES for recompilation efficiency.
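The MD5-based recompilation check described above might look roughly like the following sketch. Everything here is a stand-in: `Hash` is just a string pretending to be an MD5 digest, the module names are invented, and how the per-import hashes are computed or stored is left open; none of this reflects GHC's real interface-file machinery.

```haskell
import qualified Data.Map.Strict as Map

type Module = String
type Hash   = String  -- stand-in for an MD5 digest of an imported interface

-- Hashes recorded when a module was last compiled, keyed by imported module.
-- Assumption: for a SOURCE import the hash covers only what that import
-- needs; for an ordinary import it covers the whole interface.
type ImportHashes = Map.Map Module Hash

-- Recompile if any recorded import hash is now different or missing.
needsRecompile :: ImportHashes -> ImportHashes -> Bool
needsRecompile recorded current = any changed (Map.toList recorded)
  where
    changed (m, h) = Map.lookup m current /= Just h

main :: IO ()
main = do
  let recorded  = Map.fromList [("B", "aa11"), ("Util", "bb22")]
      unchanged = recorded
      bChanged  = Map.insert "B" "cc33" recorded
  print (needsRecompile recorded unchanged)  -- False
  print (needsRecompile recorded bChanged)   -- True
```

Because each importer keeps its own table, a change that affects only what one SOURCE-importer uses would not by itself invalidate the other importers of the same module.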

So what does the compiler have to look at in a SOURCE-imported module?

In the case of the proposed SOURCE imports without hs-boot files, GHC would move from calculating one interface (MD5) per module (or two interfaces in the case of .hs-boots), to one-per-import. I think this is, in principle, an advantage, although it does require more re-scanning when files are changed (only lexer/parser/renamer/module-chaser work). For example, I've found myself adding to .hs-boot files for the purpose of one module that SOURCE-imports the .hs-boot, which forces the recompile of another module that happens to depend on the .hs-boot too. To replicate the current GHC .hs-boot behavior (in which the hash-recalculation is shared among SOURCE-importers), one could replace an X.hs-boot file with an X_boot.hs file that contains:

        module X_boot (module X) where
        import {-# SOURCE #-} X (list of things exported
                               by the old .hs-boot file)
, and in other modules, replace
        import {-# SOURCE #-} X (....)
with
        import X_boot (....)

Taking .hs-boot docs as a guide [2], the compiler must look in SOURCE-imported modules for:

- if an import list is given explicitly, `B (....)` not `B hiding (....)` or `B`, the export list only needs to be *checked* to make sure it exports the requested things, not remembered. Exception: data or class imported with `Name(..)` must remember exactly which constructors/members were exported. It's recommended to specify exactly what you're importing.

- function type signatures

- imports of functions, types, etc. If it's imported from outside the SCC, it doesn't need a type signature/whatever. If it's defined somewhere within the SCC, it generally does need a type signature.

- fixity declarations, which only have to be imported in conjunction with the corresponding functions/constructors/whatever

- data type / newtype declarations. When no constructor is imported, only the *kind* of the data type needs to be recorded, which might have to involve inference on the RHS (possibly involving more import chasing) if there aren't explicit kind annotations for *every* type parameter.

- type synonym declarations. The whole thing has to be imported, including RHS.

- classes. Including superclasses, class-method signatures, and default methods? Is there some way that GHC manages to allow not declaring all of these in .hs-boots?

- instances, whether generated by 'deriving', 'deriving instance', or ordinary 'instance'; everything before the "where" clause of 'instance's is relevant. But an instance is only relevant if it's orphan, or if it goes with a data or class that's also being imported.

- the compiler-specific RULES pragmas probably follow similar mandates as above for instances and for the functions referenced in the RULE.

[1] my official "complaint": http://hackage.haskell.org/trac/ghc/ticket/1409
[2] the GHC .hs-boot docs: http://www.haskell.org/ghc/docs/latest/html/users_guide/separate-compilation.html#mutual-recursion
