Hi Tim,
thanks for the patch. > From bb29ee8ccc656b86039127b31fd8b79533927053 Mon Sep 17 00:00:00 2001 > From: Timothy Sample <[email protected]> > Date: Wed, 2 Jan 2019 16:40:48 -0500 > Subject: [PATCH] gnu: ghc: Sort packages before writing binary cache. > > This improves the reproducibility of packages built with the Haskell > build system. > > * gnu/packages/haskell.scm (ghc)[arguments]: Add a phase that patches > 'ghc-pkg' so that it sorts packages before generating a binary cache. Okay. > + ;; This phase patches the 'ghc-pkg' command so that it sorts > + ;; the list of packages in the binary cache it generates. > + (add-after 'unpack 'patch-ghc-pkg > + (lambda _ > + (substitute* "utils/ghc-pkg/Main.hs" > + (("import Data.List") > + (string-append "import Data.List\n" > + "import Data.Ord (comparing)")) > + (("pkgsCabalFormat = packages db") > + (string-append "pkgsCabalFormat = sortBy" > + " (comparing (display . installedUnitId))" > + " (packages db)"))))) This sorts the list “pkgsCabalFormat” in “updateDBCache” by the display value of the “installedUnitId” field of each package. According to the documentation at [1], the UnitId type has an Ord instance, so you probably don’t need “display”; you don’t need to sort strings but can sort the UnitId values directly. [1]: https://www.haskell.org/cabal/release/latest/doc/API/Cabal/Distribution-Types-UnitId.html#t:UnitId I’m not sure about using installedUnitId here. Is this field unique? “sourcePackageId” is the combination of package name and version. I don’t understand the UnitId documentation, so I can’t say if that value is any better. I wonder if it would be better to sort the result of “getDirectoryContents” instead. As far as I understand, this is the cause of non-determinism here. The function “readParseDatabase” (which contains the “getDirectoryContents” call) is used in multiple places throughout “ghc-pkg/Main.hs”. The most appropriate line to modify would then be this: confs = map (path </>) $ filter (".conf" `isSuffixOf`) fs where “fs” is the list of FilePath values (strings). I think you can just do this: confs = map (path </>) $ filter (".conf" `isSuffixOf`) (sort fs) because “fs” is of type [FilePath], which is [String], which is sortable via “sort” as String has an Ord instance. What do you think? -- Ricardo
