Asked how she had come to choose GHC as the topic for her
 award-nominated PhD dissertation, freshly graduated doctor of
 software archeology Simone Tolduso revealed:

 "At first, there were a few small curiosities that triggered my
 interest, like why were darcs patches sent to the cvs-ghc
 mailing list, or why did GHC releases traditionally bundle
 the predecessor of the current Cabal version when the missing
libraries depended on its successor?
 But then I looked into the repository, with its layers on layers of
 build systems, source formats, deprecation warnings, directory
 structure fragments, todo logs, broken builds resulting either from
 OS-tools advancing and playing havoc with the built-in assumptions
 of fragile build configurations or from multiple, partially
 completed, mutually incompatible heart-liver-and-lung transplants
 supporting the newest language extensions (which of course were all
 needed to build the compiler branch supporting said features, and
 whose documentation tended to be spread over user manual, API
 comments, mailing list threads, research papers, plus half a dozen
 different Wikis and ticket trackers), supported by often outdated
 documentation in a never-ending variety of formats, and I knew I had
 stumbled onto a goldmine. Not to mention remains of earlier
 projects (what were fptools, or
 libraries?), a variety of test and compilation languages (including
 Haskell, C, Perl, Python, alongside the usual scripting suspects),
 or the proliferation of sediment layers into user space by the
 simple, but ingenious, means of binary incompatibility. In spite of
 its comparatively small size, the project was beginning to rival the
 complexities of other Microsoft products of the same period.

 In what seems to have been an attempt to push open source ideas to
 their logical conclusion, you actually had to guess at the right
 combination of versions for a number of independently evolving
 toolchains, libraries, and OSes, and use those to bootstrap from a
 consistent snapshot of the compiler, library, and sometimes even
 tool sources, or nothing would work - a situation which was later
 increasingly exacerbated by the dispersion of the Haskell Cabal
 replacing coordinated releases. Preliminary mining of the relevant
 mailing list and bug tracker archives suggests that binary releases
 were mainly public data points used to indicate intermediate states
 of GHC _not_ suitable for specific applications (apart from the
 obligatory Cabal pre-version lacking the new features needed for
 installing the extra libraries, other examples include versions of
 Data.ByteString _not_ based on the famous paper, _not_ supporting
 essential optimisations, or _not_ supporting API safety fixes). So
 there seemed to be no way to avoid direct access to the source
 repositories with their associated build processes and toolchains.

 And let us not forget that, unlike the programmers at the time, we
 are in the fortunate situation of already having complete
 repositories for the pieces and dependencies involved.  Finding
 matching versions is a non-trivial, but essentially combinatorial
 exercise, while for them, the process of building GHC would often
 have involved developing and submitting the patches that make up our
 repositories of all the pieces of software GHC builds depended on.
 We still haven't found the key that enabled the ancients to navigate
 this labyrinth and to keep their toolchains up to date while still
 making any progress in their daily work, not to mention recording
 such progress via darcs (itself written in Haskell, and not free
 of troubles). Agent-based simulations of developer communities at
 arbitrary slices through the repositories show the majority of
 agents getting stuck in a recursive cycle of installing, debugging,
 and updating dependency chains without ever reaching a productive
 state, so we do know that we are missing some crucial information.

 Several of my correspondents have come to favour the somewhat
 controversial theory that the general programmer in those days
 must have been substantially more intelligent than people are
 today. And it does make sense, in a way - I mean, if anyone had
 been the slightest bit bothered by all this complexity, surely
 someone would have tried to simplify things?

 Of course, my work has not all been happy progress: for instance,
 while there really was an 'evil mangler', the equally persistent
 rumour that GHC was named after some Scottish town has turned out
 to be a wild goose chase (cf. Appendix GC); my colleagues in dirt
 archeology assure me there was no town called 'glorious'. The
 'real' archeologists, as they call themselves, had a field day
 laughing about my gullibility there.  Still, there are so many
 buried treasures in this area - just waiting to be investigated."

 Dr Tolduso is currently working on a follow-on project, "Haskell
 by committee - design and syntax through the ages".

 Dept. of Software Archeology, University of New Atlantis
 (for immediate release)



_______________________________________________
Haskell mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/haskell
