Re: How to start syncing two existing directories with git annex?

2013-11-27 Thread Sean Hammond

On 26.11.2013 15:51, Sean Hammond wrote:

>> Thanks all. I'm going to try the simplest method: just run git annex
>> assistant on both machines, drag the music dir into the annex on both
>> machines, and let it deal with it. Some unnecessary transferring and
>> temporary on-disk duplication should not be a problem. I've made a
>> backup first, just in case.
>
> I think it worked. Git annex assistant did a lot of transferring, and
> when it was eventually done the number of files and size of the
> ~/Annex/Music dir is exactly the same on both machines, and the files
> look fine.
>
> The ~/Annex/.git dirs are quite big though: 640M and 1.5G. I ran git gc
> --aggressive --prune on them, and got them down to 180M and 894M. That
> still seems surprisingly large but I guess it's not big enough to cause
> a problem.


OK, I added several folders to the annex and it synced them all. It has
now stopped syncing (restarting git-annex and git-annex-webapp on both
machines does not start any further syncing), and if I add a text file
on A I can see it appear on B and vice versa, so it seems to be working.


But I noticed a number of problems:

1. The total number of files in ~/Annex, not including .git, on A and B 
is different:


ls -R1 ~/Annex | wc -l
21830

ls -R1 ~/Annex | wc -l
21845

2. git-annex status shows untracked and modified files on both machines 
(different files on each machine).


3. On each machine, 7 files have been replaced with broken symlinks to 
files in .git/objects. This time it is the same files on both machines, 
so it looks as if git-annex might have lost these files from both 
machines. git-annex fsck finds these 7 and reports them as 'No known 
copies exist'.


4. Even after running git gc --aggressive --prune and git-annex 
dropunused, the .git directories are massive: 23G and 2.5G, for just 
~20,000 files.


I've tried git-annex fsck and repair but that doesn't seem to have
fixed these problems. I could git-annex add and sync the files that are
reported as untracked. I'm not sure what to do with the ones that are
reported as modified, or the ones that are broken symlinks.
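
A note on checking items 1 and 3 above: `ls -R1` also prints a header
line for each directory and a blank line between directories, and lists
the directories themselves, so its line count is larger than the number
of files. A minimal sketch of a stricter check with find, assuming a
POSIX shell and file names without embedded newlines; the function names
are made up for illustration and are not git-annex commands:

```shell
# Hypothetical helpers; the names are illustrative, not git-annex commands.

# Count regular files and symlinks under a directory,
# skipping anything named .git.
count_entries() {
    find "$1" -name .git -prune -o \( -type f -o -type l \) -print | wc -l | tr -d ' '
}

# List symlinks whose target does not exist, skipping anything named .git.
broken_links() {
    find "$1" -name .git -prune -o -type l ! -exec test -e {} \; -print
}

# Print a sorted list of relative paths, suitable for diffing
# between two machines. LC_ALL=C keeps the order locale-independent.
list_entries() {
    (cd "$1" && find . -name .git -prune -o \( -type f -o -type l \) -print | LC_ALL=C sort)
}
```

Running `count_entries ~/Annex` on each machine gives comparable counts,
and diffing the output of `list_entries ~/Annex` from A and B should show
which paths account for the difference. `broken_links ~/Annex` lists the
symlinks worth checking individually.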

___
vcs-home mailing list
vcs-home@lists.madduck.net
http://lists.madduck.net/listinfo/vcs-home


Re: How to start syncing two existing directories with git annex?

2013-11-27 Thread Joey Hess
Sean Hammond wrote:
> 1. The total number of files in ~/Annex, not including .git, on A
> and B is different:
>
> ls -R1 ~/Annex | wc -l
> 21830
>
> ls -R1 ~/Annex | wc -l
> 21845
>
> 2. git-annex status shows untracked and modified files on both
> machines (different files on each machine).

These seem likely to be related. Can you show the status?

Are you using direct mode, or indirect mode?

> 3. On each machine, 7 files have been replaced with broken symlinks
> to files in .git/objects. This time it is the same files on both
> machines, so it looks as if git-annex might have lost these files
> from both machines. git-annex fsck finds these 7 and reports them as
> 'No known copies exist'.

You can run git annex log on some of these files to see the history of
which repository they were in and how they moved around.

For example:

- 2013-11-27 10:01:14 Parallel_and_Concurrent_Programming_in_Haskell.epub | d322dff8-8b32-11e0-bbce-bb98bc1ede5b -- wren
+ 2013-11-25 17:40:59 Parallel_and_Concurrent_Programming_in_Haskell.epub | 7e88d964-437e-47be-885a-e158af656729 -- darkstar
+ 2013-11-25 17:13:34 Parallel_and_Concurrent_Programming_in_Haskell.epub | d322dff8-8b32-11e0-bbce-bb98bc1ede5b -- wren
+ 2013-11-25 17:13:34 Parallel_and_Concurrent_Programming_in_Haskell.epub | ----0001 -- web

So, this would also let you see where the last copy resided, and when it was
recorded as being dropped, although git-annex should never drop the last copy
of a file on its own in normal circumstances.
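
To pick the most recent removal out of that history, the log output can
be filtered. A minimal sketch, assuming the format of the example above
(one record per line, newest first, `+` for a copy being added to a
repository and `-` for one being removed); `last_removal` is a made-up
helper name, not a git-annex command:

```shell
# Hypothetical helper: print the newest "-" (removed) record from
# `git annex log` output read on stdin. Assumes one record per line,
# newest first, as in the example above.
last_removal() {
    grep '^- ' | head -n 1
}

# Intended use inside the annex (not run here):
#   git annex log path/to/file | last_removal
```

The UUID and description at the end of the printed line identify the
repository that last recorded dropping the file.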

> 4. Even after running git gc --aggressive --prune and git-annex
> dropunused, the .git directories are massive: 23G and 2.5G, for just
> ~20,000 files.

Are you looking at the sizes of the .git/objects directories, or the
.git/annex/objects directories? (.git/annex/tmp is also a possible place where
cruft could somehow accumulate)

When you ran git annex dropunused, did it drop something? git annex unused
should not find any unused files if you've just synced 2 directories and have
not deleted any of the files yet.
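
The size question above can be answered per directory. A minimal sketch,
assuming a POSIX shell with du available; `annex_sizes` is a
hypothetical helper name, and the three paths are the ones mentioned
above:

```shell
# Hypothetical helper: report the disk usage of the three places
# mentioned above, given the top of the repository.
# Directories that do not exist are skipped.
annex_sizes() {
    for d in .git/objects .git/annex/objects .git/annex/tmp; do
        if [ -d "$1/$d" ]; then
            du -sh "$1/$d"
        fi
    done
}
```

For example, `annex_sizes ~/Annex` on each machine. If
.git/annex/objects accounts for most of the space, the bulk is annexed
file content rather than git history, which git gc does not touch.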

-- 
see shy jo



Re: How to start syncing two existing directories with git annex?

2013-11-26 Thread Sean Hammond

> Thanks all. I'm going to try the simplest method: just run git annex
> assistant on both machines, drag the music dir into the annex on both
> machines, and let it deal with it. Some unnecessary transferring and
> temporary on-disk duplication should not be a problem. I've made a
> backup first, just in case.


I think it worked. Git annex assistant did a lot of transferring, and 
when it was eventually done the number of files and size of the 
~/Annex/Music dir is exactly the same on both machines, and the files 
look fine.


The ~/Annex/.git dirs are quite big though: 640M and 1.5G. I ran git gc 
--aggressive --prune on them, and got them down to 180M and 894M. That 
still seems surprisingly large but I guess it's not big enough to cause 
a problem.



Re: How to start syncing two existing directories with git annex?

2013-11-26 Thread Joey Hess
Sean Hammond wrote:
> I think it worked. Git annex assistant did a lot of transferring,
> and when it was eventually done the number of files and size of the
> ~/Annex/Music dir is exactly the same on both machines, and the
> files look fine.
>
> The ~/Annex/.git dirs are quite big though: 640M and 1.5G. I ran git
> gc --aggressive --prune on them, and got them down to 180M and 894M.
> That still seems surprisingly large but I guess it's not big enough
> to cause a problem.

It's possible that using `git annex unused` will find some old files in
the larger of the two.

-- 
see shy jo



Re: How to start syncing two existing directories with git annex?

2013-11-25 Thread Sean Hammond
Thanks all. I'm going to try the simplest method: just run git annex 
assistant on both machines, drag the music dir into the annex on both 
machines, and let it deal with it. Some unnecessary transferring and 
temporary on-disk duplication should not be a problem. I've made a 
backup first, just in case.



Re: How to start syncing two existing directories with git annex?

2013-11-24 Thread Adam Spiers
On 24 November 2013 20:03, Sean Hammond snh...@gmail.com wrote:
> Hey, I have a ~/Music directory on computer A, and a ~/Music directory on
> computer B. They contain mostly the same files (and with the same paths).
> But there might be some files on A but not B, or vice-versa. And there might
> be some files on both but different (e.g. different id3 tags).
>
> I want to use git annex assistant to sync the two dirs. Neither computer is
> big enough to hold two copies of the Music dir at once. Ideally, I'd prefer
> not to have to delete the Music dir from computer B, for example, and then
> let git annex sync it from A over to B again.

This is a similar scenario to

http://git-annex.branchable.com/tips/migrating_two_seperate_disconnected_directories_to_git_annex/