Interesting. I tried "git gc --aggressive" on Mark's converted repository,
and the result is:

netbeans-import/.git$ du -hs .
792M	.

The original was:

netbeans-import.git $ du -hs .
3,5G	.

(IIRC Mark was converting http://hg.netbeans.org/main, not releases, so the
repository is a little bit smaller than the releases one.)

I tried:

$ git log -p | sha1sum

on both repositories, and the hashes appear to be the same. I also tried to
clone the gc-ed repository using git clone --bare --no-local, and the
resulting repository is still about the same size.

So, this seems good to me, unless there is some downside I don't know about.

Jan

On Wed, Nov 23, 2016 at 8:26 PM, Emilian Bold <emilian.b...@gmail.com> wrote:

> Actually I don't believe the data loss is that large. (There may also be
> mercurial commits that are intentionally ignored by the conversion script,
> like commits that only add tags?)
>
> hg log | grep '^changeset:' | wc -l
> 313209
>
> git log | grep '^commit ' | wc -l
> 301478
>
> So there is a difference of 11731 commits (about 4%) but those couldn't
> have such a large impact on repository size.
>
> I hope somebody else is willing to work with me on this so we document
> everything and do a reproducible repository conversion.
>
>
>
> --emi
>
> On Wed, Nov 23, 2016 at 9:10 PM, Emilian Bold <emilian.b...@gmail.com>
> wrote:
>
> > Well, I dunno what black magic `gc --aggressive` does but the repository
> > is 0.85GB now!
> >
> > I also ran `git reflog expire` first but it didn't change the size at all.
> >
> > One thing to keep in mind is that I used --force although I had 6 commits
> > with the warning "repository has at least one unnamed head". Which were
> > probably all close branch commits (hg commit --close-branch).
> >
> > So I might have data loss(!) since I believe I read hg-fast-export.sh
> > picks only one unnamed head as the migration winner. I wonder if the gc
> > command didn't just purge a lot of valid commits from such an unnamed head
> > and that's why the repository became so small.
> >
> > Could somebody else try a test repository conversion and validate my
> > numbers?
> >
> > git gc --aggressive --prune=now
> > Counting objects: 4085031, done.
> > Delta compression using up to 8 threads.
> > Compressing objects: 100% (2909203/2909203), done.
> > Writing objects: 100% (4085031/4085031), done.
> > Total 4085031 (delta 2150468), reused 1585934 (delta 0)
> > Checking connectivity: 4085031, done.
> >
> >
> >
> > --emi
> >
> > On Wed, Nov 23, 2016 at 7:59 PM, Paul Merlin <paulmer...@apache.org>
> > wrote:
> >
> >> Hi Emilian,
> >>
> >> > I see hg-fast-export.sh finished at some point.
> >> >
> >> > As expected though, git does not have any of the disk space gains. The
> >> > converted git releases/ repository is 3.6GB.
> >>
> >> Just a thought.
> >> Did you try some git cleanups after the conversion?
> >>
> >> git reflog expire --expire=now --all
> >> git gc --aggressive --prune=now
> >>
> >> Cheers
> >>
> >>
> >> > In case these statistics mean something:
> >> >
> >> > git-fast-import statistics:
> >> > ---------------------------------------------------------------------
> >> > Alloc'd objects: 4090000
> >> > Total objects: 4085509 ( 40220100 duplicates )
> >> >   blobs : 1036365 ( 28386238 duplicates 858087 deltas of 969684 attempts)
> >> >   trees : 2735935 ( 11833862 duplicates 1370606 deltas of 2613480 attempts)
> >> >   commits: 313209 ( 0 duplicates 0 deltas of 0 attempts)
> >> >   tags : 0 ( 0 duplicates 0 deltas of 0 attempts)
> >> > Total branches: 1283 ( 346 loads )
> >> >   marks: 1048576 ( 313209 unique )
> >> >   atoms: 124011
> >> > Memory total: 218429 KiB
> >> >   pools: 26711 KiB
> >> >   objects: 191718 KiB
> >> > ---------------------------------------------------------------------
> >> > pack_report: getpagesize() = 4096
> >> > pack_report: core.packedGitWindowSize = 1073741824
> >> > pack_report: core.packedGitLimit = 8589934592
> >> > pack_report: pack_used_ctr = 39000045
> >> > pack_report: pack_mmap_calls = 733040
> >> > pack_report: pack_open_windows = 4 / 7
> >> > pack_report: pack_mapped = 4280730006 / 6950823920
> >> > ---------------------------------------------------------------------
> >> >
> >> >
> >> > --emi
> >> >
> >> > On Fri, Nov 18, 2016 at 1:32 PM, Emilian Bold <emilian.b...@gmail.com>
> >> > wrote:
> >> >
> >> >> A releases/ clone which on my system takes 3.8GB is reduced to 1.6GB with
> >> >> the generaldelta and aggressivemergedeltas flags (took about 14 hours).
> >> >>
> >> >> Pretty impressive!
> >> >>
> >> >> Converting to git with hg-fast-export.sh complains that "repository has at
> >> >> least one unnamed head" for about 6 revisions. With --force I'm able to
> >> >> start the conversion but it hasn't finished yet.
> >> >>
> >> >> The git conversion is about 35% done and already using 1.3GB.
> >> >>
> >> >> So... I assume it's going to need just like the original repository about
> >> >> 3.8GB.
> >> >>
> >> >> I wonder if git has similar space-saving tricks?
> >> >>
> >> >>
> >> >>
> >> >> --emi
> >> >>
> >> >> On Thu, Nov 17, 2016 at 8:46 AM, Emilian Bold <emilian.b...@gmail.com>
> >> >> wrote:
> >> >>
> >> >>> Forgot about this. I've just started the Mercurial repository conversion
> >> >>> which will take a few hours.
> >> >>>
> >> >>> Will report tomorrow or when it's done.
> >> >>>
> >> >>>
> >> >>> --emi
> >> >>>
> >> >>> On Wed, Nov 16, 2016 at 11:18 PM, cowwoc <cow...@bbs.darktech.org> wrote:
> >> >>>
> >> >>>> Hi Emilian,
> >> >>>>
> >> >>>> Any update on this?
> >> >>>>
> >> >>>> Thanks,
> >> >>>> Gili
> >> >>>>
> >> >>>>
> >> >>>> On 2016-11-11 01:33 (-0500), Emilian Bold <e...@gmail.com> wrote:
> >> >>>>> Thank you for following through with this after we talked on IRC.
> >> >>>>>
> >> >>>>> I will check later the size reduction for the releases/ repo.
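
For anyone repeating the checks discussed in this thread, here is a minimal shell sketch. The helper names (history_hash, commit_count, commit_gap) are made up for illustration and are not part of git or hg-fast-export; the commands inside them are the ones quoted above (git log -p | sha1sum, and counting '^commit ' lines). The two numbers passed to commit_gap are the changeset and commit counts reported in the thread.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of git or hg-fast-export.

# Hash the full patch history reachable from HEAD in the repository at $1.
# Equal hashes before and after "git gc" suggest that history was not altered.
history_hash() {
    git -C "$1" log -p | sha1sum | cut -d' ' -f1
}

# Count commits reachable from HEAD in the repository at $1.
commit_count() {
    git -C "$1" log | grep -c '^commit '
}

# Print how many commits are missing on the git side and their share of the
# Mercurial total, truncated to one decimal place (integer arithmetic only).
commit_gap() {
    hg_count=$1
    git_count=$2
    missing=$((hg_count - git_count))
    tenths=$((missing * 1000 / hg_count))
    printf '%d missing (%d.%d%%)\n' "$missing" "$((tenths / 10))" "$((tenths % 10))"
}

# Counts reported earlier in the thread (hg changesets, then git commits).
commit_gap 313209 301478
```

Run as is, this prints "11731 missing (3.7%)", which matches the "about 4%" estimate. To compare a repository before and after gc, call history_hash on each directory and check that the two hashes are equal. Note that this only covers history reachable from HEAD, so commits on other branches or on dropped unnamed heads would not show up in it.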