The package metadata stored in the database is now roughly 3 times bigger. The local cache in my home directory is now 2.3GB:

maciej@login [login]:~/src/opencsw-gar/v2 > ls -lh *.db
-rw-r--r-- 1 maciej csw  62K Jan 14 10:05 pkg.db
-rw-r--r-- 1 maciej csw 3.0M Jan 18 02:03 pkgstats-deps.db
-rw-r--r-- 1 maciej csw 2.3G Jan 18 02:03 pkgstats.db

We're seeing other problems too, such as timeouts on the restful interface, out-of-memory errors and quota hits. We have to decide whether this is acceptable performance on our buildfarm.

There are a number of things we can do to make things better, but since they all involve infrastructural changes, they have to be done carefully. I'm on vacation next week, so I won't make any changes now or next week. The first realistic date for me to do anything more involved is around the 9th of February, which is roughly 3 weeks from now.

For now, let's talk about what options we have.

- change from pickles to json as the mysql-side storage format (should speed up the restful interface: no more unpickling+jsonizing)

- storing compressed json is also an option, but it will require constant decoding, which can be slow on sparc. We'd have to do some measurements to see what's faster, i.e. whether we are I/O bound or processor bound. We could use a lightweight compression library such as snappy. http://code.google.com/p/snappy/

- split out the single blob with everything (per package) into more blobs per package (quite a bit of coding will be involved; requires changes to the rest interface); smaller blobs will be easier to handle and we'll avoid retrieving data we don't want; the price we'll pay is more complexity in the code. So far I've been avoiding complexity as much as possible (contrary to what it might look like :-P ), and having one memory structure for all data (per package) allowed us to keep the code simple in a lot of places.

- change the underlying storage, although this would be quite involved and the benefit is not certain; I read that using MySQL as a key-value store can work just fine.

Maciej
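
A rough sketch of the kind of measurement mentioned in the options above: encode one per-package structure as pickle, json and compressed json, then compare blob sizes and decode times. Everything in it is an assumption made for illustration. The structure built by make_fake_pkgstats() is made up and is not the real pkgstats schema, and zlib from the standard library stands in for snappy. Numbers from a workstation say little about sparc, so it would have to be run on the buildfarm against real blobs.

```python
import json
import pickle
import time
import zlib


def make_fake_pkgstats(n_files=2000):
    """Build a made-up per-package structure (hypothetical, not the real schema)."""
    return {
        "basic_stats": {"pkgname": "CSWexample", "catalogname": "example"},
        "files": [
            {"path": "/opt/csw/lib/libexample.so.%d" % i, "mode": "0755", "size": i}
            for i in range(n_files)
        ],
    }


def time_decode(decode, blob, repeat=20):
    """Return the best wall-clock time in seconds of decode(blob) over `repeat` runs."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        decode(blob)
        best = min(best, time.perf_counter() - start)
    return best


def compare_formats(data):
    """Encode `data` three ways; return {name: (size_in_bytes, decode_seconds)}."""
    json_bytes = json.dumps(data).encode("utf-8")
    candidates = {
        "pickle": (pickle.dumps(data), pickle.loads),
        "json": (json_bytes, lambda b: json.loads(b.decode("utf-8"))),
        # zlib is a stand-in for snappy, which is not in the standard library.
        "json+zlib": (
            zlib.compress(json_bytes),
            lambda b: json.loads(zlib.decompress(b).decode("utf-8")),
        ),
    }
    return {
        name: (len(blob), time_decode(decode, blob))
        for name, (blob, decode) in candidates.items()
    }


if __name__ == "__main__":
    for name, (size, seconds) in compare_formats(make_fake_pkgstats()).items():
        print("%-10s %8d bytes  %.4f s to decode" % (name, size, seconds))
```

This only covers the processor side (decode cost) and the blob size. Whether we are I/O bound would still need timing of the actual database fetches.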
