On Mon, Jul 2, 2012 at 3:14 PM, Duncan Coutts <duncan.cou...@googlemail.com> wrote:
> Something to keep in mind is memory usage. I know Jeremy is looking at
> this from the infrastructure side, but I think from the app side there's
> also some likely culprits. Cabal's GenericPackageDescription type is
> very large in memory. Having 10's of 1000's of these means lots of
> memory. One hopefully easy way to save memory here without going to the
> hassle of redoing Cabal's type definitions is simply to increase
> sharing. There's a huge amount of repeated information. Start by sharing
> all the package names and versions. Then there's other meta-data that
> rarely changes between versions of the same package. This kind of thing
> should be easy to evaluate, just write a test prog that reads the index
> file and look at peak memory use. Then try sharing stuff and see how
> much it drops. This sharing optimisation would still be useful even if
> later we go and redo GenericPackageDescription to be more compact.
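
To make the sharing idea above concrete, here is a rough sketch of interning package names through a table of canonical copies. It is only an illustration under stated assumptions: PkgId, intern and shareNames are hypothetical names invented for this example, not Cabal's or hackage-server's actual types or functions, and a real test would read the index file and compare peak memory use before and after.

```haskell
import qualified Data.Map.Strict as Map
import Data.List (mapAccumL)

-- Hypothetical stand-in for the name/version part of a package entry.
-- This is NOT Cabal's real type; it only illustrates the idea.
data PkgId = PkgId { pkgName :: String, pkgVersion :: [Int] }
  deriving (Eq, Show)

-- An intern table maps each value to the single canonical copy of it.
type Intern a = Map.Map a a

-- Return the canonical copy of x, adding x to the table if it is new.
intern :: Ord a => Intern a -> a -> (Intern a, a)
intern table x = case Map.lookup x table of
  Just shared -> (table, shared)
  Nothing     -> (Map.insert x x table, x)

-- Replace every package name with its shared copy, so that equal names
-- point at one heap object instead of one copy per package version.
shareNames :: [PkgId] -> [PkgId]
shareNames = snd . mapAccumL step Map.empty
  where
    -- The case forces the lookup, so the record holds the shared
    -- string itself rather than a thunk that retains the whole table.
    step table p = case intern table (pkgName p) of
      (table', n) -> (table', p { pkgName = n })

main :: IO ()
main = do
  let pkgs = [ PkgId "bytestring" [0,9,2,1]
             , PkgId "bytestring" [0,10,0,0]
             , PkgId "text" [0,11,2,0] ]
  -- Sharing changes the heap representation only, never the values.
  print (shareNames pkgs == pkgs)
```

The same table approach should carry over to versions and to other metadata that rarely changes between releases of a package. Whether it pays off is an empirical question, to be settled by measuring peak memory as suggested above.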

This should not hold up the launch of Hackage 2 (which is very
important), but I think it's an important issue that we need to address:
we don't want to store what is perhaps the most important data the
Haskell community has in an experimental data store! Creating a correct
data store (i.e. ACID) that also handles a moderate amount of load is
quite a difficult undertaking, and it shouldn't be taken lightly.

Let's stick the data in some SQL database and spend our energy on other
things. :)

Cheers,
Johan