Hi Ian,

We used acid-state (actually happstack-state) at Silk for our session store. We had the same problems you describe: slow shutdown/startup, high memory usage, and no way to inspect the data. We recently switched to an SQL database. Just another data point.
Erik

On Thu, Sep 6, 2012 at 8:49 PM, Ian Lynagh <i...@well-typed.com> wrote:
>
> Hi all,
>
> I've had a bit of experience with Hackage 2 and acid-state now, and I'm
> not convinced that it's the best fit for us:
>
> * It's slow. It takes about 5 minutes for me to stop and then start the
>   server. It's actually surprising just how slow it is, so it might be
>   possible/easy to get this down to seconds, but it still won't be
>   instantaneous.
>
> * Memory usage is high. It's currently in the 700M-1G range, and to get
>   it that low I had to stop the parsed .cabal files from being held in
>   memory (which presumably has an impact on performance, although I
>   don't know how significant that is), and disable the reverse
>   dependencies feature. It will grow at least linearly with the number
>   of package/versions in Hackage.
>
> * Only a single process can use the database at once. For example, if
>   the admins want a tool that will make it easier for them to approve
>   user requests, then that tool needs to be integrated into the Hackage
>   server (or talk to it over HTTP), rather than being standalone.
>
> * The database is relatively opaque. While in principle tools could be
>   written for browsing, modifying or querying it, currently none exist
>   (as far as I know).
>
> * The above 2 points mean that, for example, there was no easy way for
>   me to find out how many packages use each top-level module hierarchy
>   (Data, Control, etc). This would have been a simple SQL query if the
>   data had been in a traditional database, but as it was I had to write
>   a Haskell program to process all the package .tar.gz's and parse the
>   .cabal files manually.
>
> * acid-state forces us to use a server-process model, rather than having
>   processes for individual requests run by apache. I don't know if we
>   would have made this choice anyway, so this may or may not be an
>   issue. But the current model does mean that adding a feature or fixing
>   a bug means restarting the process, rather than just installing the
>   new program in-place.
>
> Someone pointed out that one disadvantage of traditional databases is
> that they discourage you from writing as if everything was Haskell
> datastructures in memory. For example, if you have things of type
>     data Foo = Foo {
>                    str :: String,
>                    bool :: Bool,
>                    ints :: [Int]
>                }
> stored in a database then you could write either:
>     foo <- getFoo 23
>     print $ bool foo
> or
>     b <- getFooBool 23
>     print b
>
> The former is what you would more naturally write, but would require
> constructing the whole Foo from the database (including reading an
> arbitrary number of Ints). The latter is thus more efficient with the
> database backend, but emphasises that you aren't working with regular
> Haskell datastructures.
>
> This is even more notable with the Cabal types (like PackageDescription)
> as the types and various utility functions already exist - although it's
> currently somewhat moot as the current acid-state backend doesn't keep
> the Cabal datastructures in memory anyway.
>
>
> The other issue raised is performance. I'd want to see (full-size)
> benchmarks before commenting on that.
>
>
> Has anyone else got any thoughts?
>
>
>
> On a related note, I think it would be a little nicer to store blobs as
> e.g.
>     54/54fb24083b14b5916df11f1ffcd03b26/foo-1.0.tar.gz
> rather than
>     54/54fb24083b14b5916df11f1ffcd03b26
>
> I don't think that this breaks anything, so it should be noncontentious.
>
>
> Thanks
> Ian
>
>
> _______________________________________________
> cabal-devel mailing list
> cabal-devel@haskell.org
> http://www.haskell.org/mailman/listinfo/cabal-devel
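
To make the getFoo / getFooBool trade-off from the quoted message concrete, here is a minimal sketch. It assumes an in-memory association list in place of a real database, so it only illustrates the shape of the two access styles, not their actual cost. The names fooTable, FooId and demo are hypothetical, and the Maybe results are an addition to cover missing ids; the quoted snippet does not say how a failed lookup behaves.

```haskell
-- Sketch only: an association list stands in for a database table.
-- fooTable, FooId, getFoo, getFooBool and demo are illustrative names,
-- not part of acid-state, Hackage or any real library.

data Foo = Foo
  { str  :: String
  , bool :: Bool
  , ints :: [Int]
  } deriving (Show, Eq)

type FooId = Int

-- Stand-in for the stored rows, keyed by id.
fooTable :: [(FooId, Foo)]
fooTable =
  [ (23, Foo { str = "example", bool = True, ints = [1 .. 1000] })
  ]

-- Whole-record access: with a real SQL backend this would have to read
-- every column of the row, including the arbitrarily long list of Ints.
getFoo :: FooId -> IO (Maybe Foo)
getFoo key = return (lookup key fooTable)

-- Field-level access: with a real SQL backend this would correspond to
-- selecting a single column, so the Ints would never be read.
getFooBool :: FooId -> IO (Maybe Bool)
getFooBool key = return (fmap bool (lookup key fooTable))

-- The two call styles from the quoted message, side by side.
demo :: IO ()
demo = do
  foo <- getFoo 23
  print (fmap bool foo)
  b <- getFooBool 23
  print b
```

Both calls in demo print the same value. The difference only shows up once the lookup is backed by a real store, where the first style pays for building the whole Foo and the second does not.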