No more graph errors?

Alastair Reid Fri, 7 Nov 1997 09:08:13 +0100 (MET)

Fellow bug chasers,

For as long as I can remember Hugs users have lived in fear of the dreaded

  INTERNAL ERROR: Error in graph

error message and have lain awake at nights wondering whether to trust
 Hugs with their programs.

Almost on the eve of releasing Hugs 1.4, "Carl R. Witty"
 <[EMAIL PROTECTED]> spotted a completely bogus piece of code and
 provided a bugfix which seems to fix this problem.  Hurrah for Carl!

This fix will be enabled in the next release - if anyone sees an internal
error while running any future release, we'd really like to hear about it!

Alastair


The rest of this message consists of:

o instructions for reproducing the internal error on an unpatched copy
  of Hugs; 
o a description of exactly what is going wrong in this example; and
o an explanation of how Carl's patch fixes the problem.

This is primarily aimed at Mark Jones and Carl - but the more people
 that understand Hugs innards, the better...

1) Reproducing the error:

   David Elworthy <[EMAIL PROTECTED]> provided this description of what
   caused the bug for him:

   - start hugs
   - lots of imported files get read
   - one of them has an error in it
   - fix the error
   - use :r to reload and complete loading
   - run the program => error

   After a bit of fooling around, I arrived at the following program:

     f :: Int -> String
     f x = show (x,x,x)
     g = 1.0 :: Int

   and this sequence of actions (assuming you put this program in M1.hs)

     1) ./hugs M1
        Hugs will correctly report 
        ERROR "M1.hs" (line 3): Int is not an instance of class "Fractional"
     2) comment g1 out 
     3) :reload
     4) f 1
        Hugs will die with:
        "
        Program error: {Show_Char_showList_v825 
        INTERNAL ERROR: Error in graph

    From this point on, you're hosed.  The only way to get f to work correctly
    is to quit Hugs and restart.

2) To understand what is going wrong, you have to understand a bit
   about how Hugs builds dictionaries.

   As in all other Haskell compilers, the Hugs typechecker inserts extra
   "dictionary arguments" into overloaded code.  For example, the definition
   of f becomes.
  
      f x = show dict_Show_(Int,Int,Int) (x,x,x)

   In Hugs, the dictionary is constructed by the typechecker and the 
   compiled code will contain pointers to all the dictionaries it uses.

   Since dictionaries are heap-allocated, we have to maintain a list
   of all the dictionaries required by the compiled code --- it would
   be disastrous to garbage collect a dictionary that was still required
   by the compiled code.  
   [Note: the bug is _NOT_ caused by this sort of error.]

   With typical economy, this table of dictionary pointers serves a
   second purpose in Hugs: whenever Hugs is considering creating a
   dictionary for a given Class-Type pair, it searches the dictionary
   table to see if it already has a suitable dictionary.  If so it uses it.

   All is beautiful until Hugs has to unload some compiled code.  Now we
   have to search through the dictionary table deleting:

   1) Any dictionaries which depend on some of the code we're unloading.
   2) Any dictionaries which are required by the code we're unloading.
   3) Any dictionaries which depend on a dictionary we're deleting.

   The bug Carl fixed was a bug which prevented rule 2 from being applied -
   we always keep dictionaries even if we don't need them.

   Huh?  What's the harm in that?  It's a little inefficient but it doesn't
   seem like it should cause Hugs to crash.

   The problem comes if Hugs doesn't finish compilation because it
   hits a type error.  In this case, Hugs has allocated a 
   dictionary, added it to the dictionary table but hasn't
   initialised it yet.  When the typechecker falls over,
   we try to clean up a bit - but were missing the uninitialised
   dictionaries!  After fixing the type error, we reload the
   module - finding that there's no need to build a new
   Show_(Int,Int,Int) dictionary because there seem to be one in
   the table already.  Then at runtime, we discover the hard way
   that the Show_(Int,Int,Int) dictionary hasn't been initialised yet.

3) Carl's patch fixes the problem by making sure that we really do delete
   dictionaries which were created during the compilation of this module.
   His fix is to number dictionaries sequentially and to remember which 
   range of dictionary numbers correspond to each module.  When a module
   is unloaded, you delete all dictionaries associated with that range.
   This approach is like the one already used to manage the symbol table,
   the type table, the class table, etc.

   An obvious variation of this scheme would be to tag each dictionary
   with the lowest module number it is required for.  This would avoid
   introducing the dictHw variable.

   An alternative approach would be to mark each dictionary according to
   whether or not it has been initialised.  When unloading a module,
   we just scan the dictionary table deleting uninitialised dictionaries.
   I've implemented this and it seems to work just as well.
   There is a slight risk that you might hang onto a lot of dictionaries
   which have been initialised and are perfectly valid but aren't needed 
   by any of the code currently loaded into Hugs.  A slight inefficiency.
   In fact, there's no need to mark dictionaries as initialised or
   uninitialised - Hugs keeps a list of uninitialised dictionaries
   (called pendingDicts) - so we just have to delete dictionaries which 
   are still on the pending list.  (I haven't tested this optimisation.)

   Yet another approach would be to (re)initialise every dictionary before
   running any code.  Then it wouldn't matter whether we deleted 
   unneeded dictionaries from the table or not - except for the same
   efficiency reason.  I haven't tested this approach either - but I'm
   pretty sure it would work too.

   Carl's patch is probably the most straightforward and avoids a small
   space leak incurred the other solutions - so I'll stick with it.
No more graph errors?

Reply via email to