At first I just wanted to report that I have a git-annex repo that is really big
and slow, and that this makes me kind of unhappy. Then I realized that it might
be a good idea to add a "diagnostics" command to git-annex that gathers
all the information useful for you to improve git-annex, e.g. for my repo:
time git status
4.34s real 0.07s user 0.15s system 13 maxmem/kb 36384 nrInOps 5% CPU
find .git/annex/objects -type f | wc -l
find . -path ./.git -prune -o -type l -print | wc -l
find .git/objects -type f | wc -l
time git annex fsck --fast | grep -A 10 -v "ok$"
1200.66s real 45.35s user 5.86s system 156 maxmem/kb 301856 nrInOps 4%
The last one is the annoying one. It takes 1200 s (20 min) to run
git annex fsck --fast over the repo.
git gc --aggressive
Counting objects: 1067858, done.
Delta compression using up to 2 threads.
Compressing objects: 100% (1063155/1063155), done.
Writing objects: 100% (1067858/1067858), done.
Total 1067858 (delta 856150), reused 165564 (delta 0)
Removing duplicate objects: 100% (256/256), done.
Checking connectivity: 1067858, done.
I didn't time the last call, but it took well over an hour.
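
A minimal sketch of what such a "diagnostics" helper could gather, written as plain shell functions. The names annex_diagnostics and timed are hypothetical and not part of git-annex; the sketch only wraps the find, git status and git annex fsck --fast calls listed above, and measures wall-clock time in whole seconds via date +%s (assumed to be available).

```shell
#!/bin/sh
# Hypothetical helper (not a git-annex command): print basic size figures
# for the repository at the given path (default: current directory).
annex_diagnostics() {
    repo=${1:-.}
    (
        cd "$repo" || exit 1
        # Annexed content files stored under .git/annex/objects.
        printf 'annex objects: %s\n' \
            "$(find .git/annex/objects -type f 2>/dev/null | wc -l | tr -d ' ')"
        # Symlinks in the working tree, skipping the .git directory.
        printf 'worktree symlinks: %s\n' \
            "$(find . -path ./.git -prune -o -type l -print | wc -l | tr -d ' ')"
        # Loose objects and pack files under .git/objects.
        printf 'git object files: %s\n' \
            "$(find .git/objects -type f 2>/dev/null | wc -l | tr -d ' ')"
    )
}

# Hypothetical helper: run a command, discard its output, and report the
# elapsed wall-clock time in whole seconds together with its exit status.
timed() {
    start=$(date +%s)
    status=0
    "$@" >/dev/null 2>&1 || status=$?
    end=$(date +%s)
    printf '%s: %ss (exit %s)\n' "$*" "$((end - start))" "$status"
}
```

Run from the top of the repository, this could be used as: annex_diagnostics . followed by timed git status and timed git annex fsck --fast. Note that timed discards the command's output, so the fsck report itself would still need a separate run.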
Thomas Koch, http://www.koch.ro
vcs-home mailing list