On 7 June 2018 at 22:25, Evan Laforge <qdun...@gmail.com> wrote:

> On Thu, Jun 7, 2018 at 1:47 PM, Simon Marlow <marlo...@gmail.com> wrote:
> > For loading large amounts of code into GHCi, you want to add -j<n> +RTS -A128m, where <n> is the number of cores on your machine. We've found that parallel compilation works really well in GHCi provided you use a nice large allocation area for the GC. This dramatically speeds up working with large numbers of modules in GHCi. (500 is small!)
>
> This is a bit of a thread hijack (feel free to change the subject), but I also have a workflow that involves loading a lot of modules into ghci (500-700). As long as I can coax ghci to load them, things are fast and work well, but my impression is that this isn't a common workflow, and specifically that ghc developers don't do this, because just about every ghc release breaks it in one way or another (e.g. by putting more flags into the recompile check hash), and no one seems to understand what I'm talking about when I suggest features to improve it (e.g. the recent message about modtimes and recompilation avoidance).
>
> Given the uphill battle, I've been thinking that linking most of those modules into a package and loading far fewer will be a better supported workflow. It's actually less convenient, because the code is now divided between the package level (which requires a restart and relink when it changes) and the ghci level (which doesn't), but it is perhaps less likely to be broken by ghc changes. Also, all those loaded modules consume a huge amount of memory, which I haven't tracked down yet; maybe packages will load more efficiently.
>
> But ideally I would prefer to keep avoiding packages, and in fact to use per-module loading more aggressively for larger codebases, because the need to restart ghci (or the ghc-API-using program) and do a lengthy relink every time a module in the "wrong place" changes seems like it could get annoying (in fact it already is, for a cabal-oriented workflow).
>
> Does the workflow at Facebook involve loading tons of individual modules, as I do?
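(For concreteness, the invocation I suggested above looks something like the following; the job count and module name are illustrative, not a recommendation:)

    # e.g. on an 8-core machine: parallel compilation with 8 jobs,
    # and a 128MB allocation area for the GC
    ghci -j8 +RTS -A128m -RTS Main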
Yes, our workflow involves loading a large number of modules into GHCi. However, we have run into memory issues, which was the reason for the recent work on fixing this space leak: https://phabricator.haskell.org/D4659

As it is, this workflow is OK thanks to Bartosz's work on speedups for large numbers of modules, tweaking the RTS flags as I mentioned, and some other fixes we've made in GHCi to avoid performance issues. (All of this is upstream, incidentally.) There is probably low-hanging fruit to be had in reducing the memory usage of GHCi; nobody has really attacked it with the heap profiler for a while.

However, I imagine that at some point loading everything into GHCi will become unsustainable and we'll have to explore other strategies. There are a couple of options here:

- pre-compile modules, so that GHCi loads the .o files instead of interpreted code (a sketch follows at the end of this message)
- move some of the code into pre-compiled packages, as you mentioned

Cheers
Simon

> Or do they get packed into packages? If it's the many modules, do you have recommendations for making that work well and keeping it working? If packages are the way you're "supposed" to do things, then is there any idea of how hard it would be to reload packages at runtime? If both modules and packages can be reloaded, is there an intended conceptual difference between a package and an unpackaged collection of modules? To illustrate, I would treat packages purely as a way to organize builds and distribution, with no meaning at the compiler level, which is how I gather C compilers traditionally work (e.g. 'cc a.o b.o c.o' is the same as 'ar rcs abc.a a.o b.o c.o; cc abc.a'). But that's clearly not how ghc sees it!
>
> thanks!
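(To make the first option above concrete, here is a minimal sketch; the module name is illustrative, and it assumes the object files were built with flags that match GHCi's, otherwise they will be recompiled:)

    # build object code up front; GHCi then links the up-to-date .o
    # files instead of interpreting the source
    ghc --make Main.hs
    ghci Main
    # alternatively, :set -fobject-code inside GHCi makes it compile
    # modules to object code itself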