Thanks for your help Joey,

>> 1. The total number of files in ~/Annex, not including .git, on A
>> and B is different:
>>
>> ls -R1 ~/Annex | wc -l
>>
>> ls -R1 ~/Annex | wc -l
>>
>> 2. git-annex status shows untracked and modified files on both
>> machines (different files on each machine).

> These seem likely to be related. Can you show the status?

Currently on machine A (which has 21845 files) git-annex status outputs nothing. On machine B, which has 15 fewer files, it lists 13 untracked, 31 deleted and 1 modified file. The output of this command seems to have changed since yesterday on both machines, even though I haven't changed the files and I thought git-annex had finished syncing ages ago.
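
A side note on the counts: ls -R1 also prints a header line for every directory and a blank line between directories, so the two totals include more than files. A minimal sketch of a stricter count with find follows. count_files and ANNEX_DIR are names made up for this example, and it assumes the annex lives in ~/Annex as above.

```shell
# count_files DIR: count regular files and symlinks under DIR, skipping .git.
# (Hypothetical helper; file names containing newlines would be over-counted.)
count_files() {
    find "$1" -name .git -prune -o \( -type f -o -type l \) -print |
        wc -l | tr -d ' '
}

# Assumed location of the annex; override with ANNEX_DIR if it differs.
annex_dir="${ANNEX_DIR:-$HOME/Annex}"
if [ -d "$annex_dir" ]; then
    count_files "$annex_dir"
fi
```

Running the same thing on both machines should give numbers that can be compared directly, since directories and blank lines are no longer part of the total.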

> Are you using direct mode, or indirect mode?

Direct mode, I think. Both annexes were created using the assistant, and most of the entries in both are regular files, not symlinks.

>> 3. On each machine, 7 files have been replaced with broken symlinks
>> to files in .git/objects. This time it is the same files on both
>> machines, so it looks as if git-annex might have lost these files
>> from both machines. git-annex fsck finds these 7 and reports them as
>> 'No known copies exist'.

> You can run git annex log on some of these files to see the history of which
> repository they were in and how they moved around.

For these files git-annex log outputs nothing, on either machine.
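
To list the broken symlinks in the working tree without going through fsck, something like the following might do. It is only a sketch: list_broken_links and ANNEX_DIR are made-up names, and it relies on plain find and test rather than on any git-annex command.

```shell
# list_broken_links DIR: print symlinks under DIR whose target does not
# exist, skipping .git. (Hypothetical helper, not part of git-annex.)
list_broken_links() {
    # "test -e" follows the link, so it fails exactly for dangling ones.
    find "$1" -name .git -prune -o -type l ! -exec test -e {} \; -print
}

# Assumed location of the annex; override with ANNEX_DIR if it differs.
annex_dir="${ANNEX_DIR:-$HOME/Annex}"
if [ -d "$annex_dir" ]; then
    list_broken_links "$annex_dir"
fi
```

Comparing the output from both machines would confirm whether it really is the same 7 files in each place.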

>> 4. Even after running git gc --aggressive --prune and git-annex
>> dropunused, the .git directories are massive: 23G and 2.5G, for just
>> ~20,000 files.

> Are you looking at the sizes of the .git/objects directories, or the
> .git/annex/objects directories? (.git/annex/tmp is also a possible place where
> cruft could somehow accumulate)

Almost all of the disk usage is in .git/annex/objects/ on both machines.
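
To compare the three locations mentioned above side by side, a rough sketch with du could look like this. annex_usage and ANNEX_DIR are hypothetical names; sizes are reported in 1024-byte units because of -k.

```shell
# annex_usage REPO: print the disk usage of .git/objects,
# .git/annex/objects and .git/annex/tmp under REPO, for those that exist.
# (Hypothetical helper, not part of git-annex.)
annex_usage() {
    for d in .git/objects .git/annex/objects .git/annex/tmp; do
        if [ -e "$1/$d" ]; then
            du -sk "$1/$d"
        fi
    done
}

# Assumed location of the annex; override with ANNEX_DIR if it differs.
annex_dir="${ANNEX_DIR:-$HOME/Annex}"
if [ -d "$annex_dir" ]; then
    annex_usage "$annex_dir"
fi
```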

> When you ran git annex dropunused, did it drop something? git annex unused
> should not find any unused files if you've just synced 2 directories, and never
> deleted any of the files yet.

It did find and drop some unused files, yes.

I think at this point I should probably recover from backup and go back to using unison to synchronize my directories of large files. It'll never live up to what git annex promises, but it's a lot easier to understand what's happening. Since almost all the files still seem to be intact, it shouldn't take long to rsync back just the files that got changed or lost.

I'll hang on for a bit, just in case we can get to the bottom of what's happened here with git-annex.
vcs-home mailing list
