Hi! Thanks a lot for the detailed survey. I'll comment inline.

Nathan Gray <[email protected]> writes:
> On the (rather large) cap repository, darcs22pre2 used about a third
> more memory and took more than three times as long to get.

Could you please try running

    time darcs22pre2 get cap2 cap2-prime

and report the user and system times? Also, it would be good to know how
many files you have under the "patches" and "pristine.hashed"
subdirectories of _darcs in the hashed repository.

> For some reason the check and repair benchmarks failed for darcs109,
> but darcs22pre2 took more time and used more memory for these on the
> version 2 repository than on the version 1 repository. Likewise for
> repair.

That's an interesting observation as well. I don't have an explanation,
since the code doing repair and check is identical for both repository
types as far as I can tell. Oh, wait. I probably know. This is closely
related to the above question about system and user times. Let me
elaborate.

Darcs hashed repositories currently store everything directly under a
single directory as a flat list, i.e.:

    _darcs/pristine.hashed/hash1
    _darcs/pristine.hashed/hash2
    _darcs/pristine.hashed/hash3

etc. However, most operating systems and filesystems handle large
directories extremely inefficiently. Although one would expect this to
make no difference (the theoretical bound doesn't change in the
slightest), in practice the performance of large directories is orders of
magnitude worse. Coupled with the darcs global cache, this can make
things really, really bad.

Unfortunately for you, that means that we can't improve the performance
for you until a bucketed-hashed repository format is implemented. I will
try to get it rolling for 2.3, but can't make any promises.

> Pulls are somewhat similar when using a version 1 repository, but use
> more than six times as much memory and take about twice as long on the
> version 2 repository.

This is another thing where high-level optimisation could help.
In fact, the same approach that I have used with check and repair would
help a lot. You can see that applying 100 patches takes almost a third of
the time that darcs22pre2 needs to apply all of the patches in the
repository during check/repair.

[snip numbers]

> On the (somewhat smaller) systems repository, getting used less than a
> tenth as much memory on the version 2 repository, but took five times
> as long.

Again, the big directory issue strikes here. If you have the global cache
enabled (likely with darcs22pre2), this is even worse, as you pay an even
bigger penalty for every cache access than you pay for accessing
repository-local data.

> It took less time to check and to repair using darcs22pre2, but used
> two to three times as much memory as darcs109.
>
> Pulling fewer patches on darcs109 was faster, but pulling more patches
> was faster using darcs22pre2. Memory usage was similar for all of
> the 'pull' benchmarks.

[snip more numbers]

> I am encouraged that darcs22pre2 on a version 2 repository is performing so
> much better than earlier versions of darcs2. I am still concerned that it
> uses so much more memory for check and repair, and sometimes pull, and that
> it is so much slower for pulls and gets.

One part of the high memory usage is that we now have a limit on how much
of the changed file contents is retained in memory. This is currently
hard-coded as 100M. It still doesn't explain the 200+ megabytes we are
seeing there, though. It wouldn't be hard to reduce memory usage by, say,
50M at the expense of slower check/repair. This is possibly something to
be fine-tuned.

Once some high-level optimisations are applied (hopefully in time for
darcs 2.3), (local) get and pull performance should improve
significantly.

Yours,
   Petr.

-- 
Peter Rockai | me()mornfall!net | prockai()redhat!com
http://blog.mornfall.net | http://web.mornfall.net

"In My Egotistical Opinion, most people's C programs should be indented six
feet downward and covered with dirt." -- Blair P.
Houghton on the subject of C program indentation
_______________________________________________
darcs-users mailing list
[email protected]
http://lists.osuosl.org/mailman/listinfo/darcs-users
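
A note on the flat versus bucketed layout discussed in this message, as a
minimal sketch only: one common way to avoid a single huge directory is to
use the first few characters of each hash as a subdirectory name. The
function names, the two-character prefix, and the example hash below are
assumptions made for illustration; the message does not say what the
eventual darcs format will look like.

```python
from pathlib import PurePosixPath


def flat_path(root: str, file_hash: str) -> PurePosixPath:
    """Flat layout as described in the message: every hash sits directly under root."""
    return PurePosixPath(root) / file_hash


def bucketed_path(root: str, file_hash: str, prefix_len: int = 2) -> PurePosixPath:
    """Hypothetical bucketed layout: root/<hash prefix>/<full hash>.

    prefix_len is an illustrative choice, not something darcs defines.
    """
    if len(file_hash) <= prefix_len:
        raise ValueError("hash is too short to bucket")
    return PurePosixPath(root) / file_hash[:prefix_len] / file_hash


if __name__ == "__main__":
    root = "_darcs/pristine.hashed"
    example_hash = "ab12cd34"  # made-up value, stands in for a real content hash
    print(flat_path(root, example_hash))
    print(bucketed_path(root, example_hash))
```

Assuming hexadecimal hashes, a two-character prefix spreads the entries
over at most 256 subdirectories, so each directory stays much smaller than
the single flat one.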
