On Sat, Jul 25, 2020 at 12:27:42 +0200, Antonio Muci via Mercurial-devel wrote:
> That's sad.

Yeah.

This motivated me enough to clone the repos (hg and git) and collect some
data.  Maybe people here will find it useful.

First off, the clone itself.  I cloned from the official upstream repos.
My internet connection is 150 Mbit/s, and the storage is a 3-way ZFS
mirror.  I used hg 4.9.1 (py27) and git 2.21.0.  (I know, I need to update
both.  This is on a box that has a solid network connection but is harder
to update.  If there is interest, I can spend the effort to update them and
re-run this with newer versions.)

$ hg clone https://hg.openjdk.java.net/jdk/jdk
destination directory: jdk
requesting all changes
adding changesets
adding manifests
adding file changes
added 60318 changesets with 516970 changes to 187542 files
new changesets fd16c54261b3:227cd01f15fa
updating to branch default
65415 files updated, 0 files merged, 0 files removed, 0 files unresolved

This took a total of ~16.3 mins (978 seconds), of which:

1) ~30 seconds were used by "adding changesets"
2) ~8 mins were used by "adding manifests"
3) ~7 mins were used by "adding file changes"

While adding manifests and file changes, the box was receiving ~1.0-1.2
MB/s (bytes received on the NIC, *not* actual payload inside TCP and
hg-specific framing).  My box still had plenty of CPU, RAM, and I/O left,
so I don't know if the 1.0 MB/s was a result of hg being sub-optimal or if
the hg server or the network connection was the bottleneck.

To rule out internet slowness, I ran 'hg serve' on the clone and did a
clone on my laptop (5.5rc0+25-fbc53c5853b0, py3) on the same subnet (wifi
connected).  It took 495 seconds (2x faster), and I saw slightly higher
network utilization (~1.7 MB/s) and the laptop CPU pegged at 100% for
pretty much the entire duration of the "adding file changes" portion.
(The laptop has an SSD, so that probably helped eliminate some of the
slowness - it is a bit of an apples and oranges comparison, but interesting
nonetheless.)

Cloning directly from java.net on my laptop took 1400 seconds - so, about
43% slower than the 978 seconds measured on the first box.
This could be because of the wifi, py3 vs. py27, the hg version difference,
etc.

$ git clone https://github.com/openjdk/jdk.git jdk-git
Cloning into 'jdk-git'...
remote: Enumerating objects: 819, done.
remote: Counting objects: 100% (819/819), done.
remote: Compressing objects: 100% (577/577), done.
remote: Total 1072595 (delta 356), reused 423 (delta 199), pack-reused 1071776
Receiving objects: 100% (1072595/1072595), 414.42 MiB | 6.17 MiB/s, done.
Resolving deltas: 100% (800673/800673), done.
Checking out files: 100% (65415/65415), done.

This took a total of 1 min 49 secs (109 seconds), of which:

1) 1 min 8 secs were used by "receiving objects"
2) 25 seconds were used by "resolving deltas"

The receiving of objects was pulling in 6.8 MB/s.

Cloning directly on my laptop took 99 seconds with git version 2.26.2.

...

> About .hg size (1a): is it really true that .hg is 1.2GB and the
> corresponding .git version is 300 MB? Verifying it should not be too
> difficult. If it's true (I doubt it), something has to be done.

$ du -shA jdk-*/.{hg,git}
1.10G   jdk-hg/.hg
 452M   jdk-git/.git

So, both numbers seem to be tweaked to justify migration - at least on a
fresh clone - but I'd say hg is still worse by roughly 2.5x.

The whole checkout, in case anyone cares:

$ du -shA *
1014M   jdk-git
1.65G   jdk-hg

Now, hg specifics.  It looks like the manifest is huge.  This matches how
long "adding manifests" took during the clone.

-rw-r--r--   1 jeffpc   jeffpc     25.2M Jul 25 12:16 00changelog.d
-rw-r--r--   1 jeffpc   jeffpc     3.68M Jul 25 12:01 00changelog.i
-rw-r--r--   1 jeffpc   jeffpc      434M Jul 25 12:09 00manifest.d
-rw-r--r--   1 jeffpc   jeffpc     3.67M Jul 25 12:09 00manifest.i

Not a complete surprise, given that there are a lot of files (~65k) tracked
and many use super-long file paths (e.g.,
test/hotspot/jtreg/runtime/exceptionMsgs/AbstractMethodError/AbstractMethodErrorTest.java).
That adds up.  Just the paths in the manifest itself add up to almost 4.7MB.

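To put the path overhead in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes the counts reported by `hg manifest | wc` in this message (65415 lines, 4694467 bytes, one path per line with a trailing newline); the variable names are made up for illustration.

```python
# Counts taken from the `hg manifest | wc` output quoted in this message.
tracked_files = 65415   # lines, i.e. one path per tracked file
path_bytes = 4694467    # bytes, including one newline per path

# Average bytes per manifest line (the path plus its newline).
avg_line_bytes = path_bytes / tracked_files

# The same total expressed in decimal megabytes.
total_mb = path_bytes / 1e6

print(f"average path line: {avg_line_bytes:.1f} bytes")
print(f"all paths together: {total_mb:.2f} MB")
```

That comes to roughly 72 bytes per path line.
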
$ hg manifest | wc
   65415   65415 4694467

I'm guessing that they would have benefited from treemanifest.

I also tried to clone locally to see what sort of thing a user would see.

$ hg clone jdk-hg test
$ git clone jdk-git test-git

hg took 60 seconds (with hot cache, ~120 secs cold cache), git took 13
seconds.  Git hardlinked the one big pack file and its index, while hg
hardlinked each of the files in .hg/store.  Obviously, hardlinking 2 files
is much faster than hardlinking ~180k.  (treemanifest would have made this
even worse for hg.)

I just kicked off a conversion to treemanifest.  It'll take a while.

Jeff.

-- 
Intellectuals solve problems; geniuses prevent them
                - Albert Einstein