Hi,

I’m using the GHC API to typecheck 35,000 modules that form a complicated dependency graph (with multiple top-level modules, i.e. there’s no single “god module” that would transitively depend on everything else). I noticed that peak memory usage is wildly different when everything is typechecked from scratch vs. when everything is loaded from files containing ModIfaces: 17G vs. 8G. This ratio replicates for smaller samples as well, e.g. 80M vs. 33M for 407 modules.

I’m aware of https://gitlab.haskell.org/ghc/ghc/-/issues/13586, so when I finish typechecking a module, I take the resulting ModIface and create from it the ModDetails that ends up in the HomeUnitGraph. My understanding of Matt’s original GHC fix in https://gitlab.haskell.org/ghc/ghc/-/merge_requests/5478 is that it does the same, i.e. it makes a fresh ModDetails only once per module, after the ModIface is ready.

But of course that still means that a ModDetails can only keep growing as more and more parts of it are used for typechecking more and more dependants. Could that be the cause?

I tried a crude experiment of “putting the toothpaste back in the tube” by replacing every ModDetails in the HUG with a fresh one after each module finishes typechecking, but that’s a complete disaster for memory usage: even for the small 407-module example, memory usage shoots up to 1.5G. I suspect that’s because imported Ids are no longer shared between different importing modules.

Any ideas on how I could improve memory usage in the from-scratch case, so that it’s more similar to the from-ModIface case?

Thanks,
Gergo