Ludovic Courtès <l...@gnu.org> writes:

> As reported by Tobias on IRC (in the context of ‘hpcguix-web’),
> checkouts managed by Guile-Git appear to grow beyond reason.  As an
> example, here’s the same ‘.git’ managed with Guile-Git and with Git:
>
>   $ du -hs ~/.cache/guix/checkouts/pjmkglp4t7znuugeurpurzikxq3tnlaywmisyr27shj7apsnalwq
>   6.7G    /home/ludo/.cache/guix/checkouts/pjmkglp4t7znuugeurpurzikxq3tnlaywmisyr27shj7apsnalwq
>   $ du -hs .git
>   517M    .git

More data…  The biggest file in that repo is a pack that was created
when that repo was first cloned (Aug. 2021):

--8<---------------cut here---------------start------------->8---
$ du /home/ludo/.cache/guix/checkouts/pjmkglp4t7znuugeurpurzikxq3tnlaywmisyr27shj7apsnalwq/.git/objects/pack/* |sort -k1 -n| tail -3
44272   /home/ludo/.cache/guix/checkouts/pjmkglp4t7znuugeurpurzikxq3tnlaywmisyr27shj7apsnalwq/.git/objects/pack/pack-3c2f1857501b01c321bc67ba1f30704deb9e18e9.pack
47272   /home/ludo/.cache/guix/checkouts/pjmkglp4t7znuugeurpurzikxq3tnlaywmisyr27shj7apsnalwq/.git/objects/pack/pack-30d5b35ad14a8398464e49e224811b162f673d66.pack
191492  /home/ludo/.cache/guix/checkouts/pjmkglp4t7znuugeurpurzikxq3tnlaywmisyr27shj7apsnalwq/.git/objects/pack/pack-d39507858782209d1ad87e389e4dffd4b6ff7ea2.pack
$ ls -l /home/ludo/.cache/guix/checkouts/pjmkglp4t7znuugeurpurzikxq3tnlaywmisyr27shj7apsnalwq/.git/objects/pack/pack-d39507858782209d1ad87e389e4dffd4b6ff7ea2.pack
-r--r--r-- 1 ludo users 196079671 Aug  9  2021 /home/ludo/.cache/guix/checkouts/pjmkglp4t7znuugeurpurzikxq3tnlaywmisyr27shj7apsnalwq/.git/objects/pack/pack-d39507858782209d1ad87e389e4dffd4b6ff7ea2.pack
$ ls -ld /home/ludo/.cache/guix/checkouts/pjmkglp4t7znuugeurpurzikxq3tnlaywmisyr27shj7apsnalwq/.git/config
-rw-r--r-- 1 ludo users 266 Aug  9  2021 /home/ludo/.cache/guix/checkouts/pjmkglp4t7znuugeurpurzikxq3tnlaywmisyr27shj7apsnalwq/.git/config
--8<---------------cut here---------------end--------------->8---

The pack starts with things from Aug. 2021:

--8<---------------cut here---------------start------------->8---
$ git show-index < pack-d39507858782209d1ad87e389e4dffd4b6ff7ea2.idx|sort -k1 -n|head -3
12 30289f4d4638452520f52c1a36240220d0d940ff (852d8cb3)
927 d7ffc535c52f49177a8e5553569cdb1e321b5bc6 (2007c5d0)
1800 0a379de3249d5e9ff66fb404f7e5aa8ce2cb3d24 (b1e69aa4)
$ git show 30289f4d4638452520f52c1a36240220d0d940ff
commit 30289f4d4638452520f52c1a36240220d0d940ff
Author: Milkey Mouse <milkeymouse@meme.institute>
Date:   Sun Aug 8 22:15:40 2021 -0700

[…]
--8<---------------cut here---------------end--------------->8---

… and at the bottom (large offsets) it contains very old blobs from the
Nix repo that somehow made it here.

I figured we still had a ‘nix’ branch from the early days, that contains
the history of Nix.  I’ve now removed it, which helps a bit:

--8<---------------cut here---------------start------------->8---
scheme@(guile-user)> ,use(git)
scheme@(guile-user)> ,t (clone "https://git.savannah.gnu.org/git/guix.git" "/tmp/guix")
$5 = #<git-repository 91a7b0>
;; 600.534529s real time, 435.260926s run time.  0.000000s spent in GC.
scheme@(guile-user)> ,t (clone "https://git.savannah.gnu.org/git/guix.git" "/tmp/guix-after-removing-nix-branch")
$6 = #<git-repository 4465a50>
;; 420.321511s real time, 398.772963s run time.  0.000000s spent in GC.
--8<---------------cut here---------------end--------------->8---

… and more importantly:

--8<---------------cut here---------------start------------->8---
$ du -hs /tmp/guix/.git
373M    /tmp/guix/.git
$ du -hs /tmp/guix-after-removing-nix-branch/.git
362M    /tmp/guix-after-removing-nix-branch/.git
--8<---------------cut here---------------end--------------->8---

Anyway, what seems to happen is that every pull (every call to
‘remote-fetch’) creates a new pack (see ‘git_fetch_download_pack’ in
libgit2), which becomes inefficient in the long run (lots of small
poorly-compressed packs).  That’s at least one possible explanation.

To be continued…

Ludo’.
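
A minimal sketch for checking the “lots of small packs” hypothesis above
on a given checkout: it counts the ‘.pack’ files under ‘objects/pack’
and sums their sizes as reported by ‘du -k’.  The ‘pack_stats’ helper
name is made up for illustration; it is not part of Git or Guile-Git.

```shell
#!/bin/sh
# Hypothetical helper (not part of Git or Guile-Git): print the number
# of pack files under the given '.git' directory and their cumulated
# size in KiB, as reported by 'du -k'.
pack_stats ()
{
    pack_dir="$1/objects/pack"
    count=0
    total=0
    for pack in "$pack_dir"/*.pack
    do
        # When nothing matches, the glob is left unexpanded; skip it.
        [ -f "$pack" ] || continue
        size=$(du -k "$pack" | cut -f1)
        count=$((count + 1))
        total=$((total + size))
    done
    echo "$count packs, $total KiB"
}
```

For instance, ‘pack_stats /tmp/guix/.git’ prints a single line of the
form “N packs, M KiB”.  With Git itself, ‘git count-objects -v’ reports
similar figures, and ‘git gc’ should consolidate the packs if the
hypothesis holds.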