On Tue, Sep 20, 2016 at 10:52:10PM +0200, Ahmad Samir wrote:
> One last try (sometimes an issue nags):
> $ find A -exec md5sum '{}' + > a-md5
> $ find B -exec md5sum '{}' + > b-md5
> $ cat a-md5 b-md5 > All
> $ sort -u -k 1,1 All > dupes
> Now, (I hopefully got my head around it this time...), the dupes file
> should contain a list of files that exist in _both_ A and B; but every
> two files that have the same md5sum will have _only one_ of them
> listed (either in A OR B). So if you delete that list of files you
> should end up with only unique files in both locations.

At the start ISTR you said the two directory trees were different.
I took that to mean that two files with identical contents could
be in different directories within the two trees.

If I was wrong in that assumption, and each pair of identical
files would be at the same relative path, I have two suggestions.

1. Sort a-md5 and b-md5
   Use the comm(1) command.  It will give lines in a-md5 only,
   in b-md5 only, and in both files, indented with 0, 1, or 2
   tabs respectively.  You can also use the -1, -2, and -3 options
   to suppress columns and get the 3 columns individually.
   To do this you would have to cd into A and B in turn and run
   the find cmds as "find .", not "find A" or "find B", so that
   the paths in the two lists match.

2. Get a copy of an old program called dircmp* and run it on the
   two trees directly.  It will output files only in tree A,
   only in tree B, then output files in both noting whether
   they are the same or different contents.
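
Suggestion 1 could be sketched roughly as follows.  This assumes GNU
coreutils (md5sum, comm) and builds two small throwaway trees under
a temporary directory; the demo file names, the $work variable and the
output names same, only-a and only-b are made up for illustration.
The -type f test is my addition so md5sum is not handed directories.

```shell
#!/bin/sh
# Use one collation for both sort and comm so they agree on ordering.
export LC_ALL=C

# Hypothetical demo trees; substitute your real A and B.
work=$(mktemp -d)
mkdir -p "$work/A/sub" "$work/B/sub"
echo same  > "$work/A/sub/both.txt"
echo same  > "$work/B/sub/both.txt"
echo one   > "$work/A/changed.txt"
echo two   > "$work/B/changed.txt"
echo onlya > "$work/A/only-a.txt"
echo onlyb > "$work/B/only-b.txt"

# Run find from inside each tree so paths are relative ("./sub/both.txt")
# and a matching file produces an identical line in both lists.
(cd "$work/A" && find . -type f -exec md5sum '{}' + | sort) > "$work/a-md5"
(cd "$work/B" && find . -type f -exec md5sum '{}' + | sort) > "$work/b-md5"

# comm requires sorted input.  -1, -2 and -3 each suppress one column.
comm -12 "$work/a-md5" "$work/b-md5" > "$work/same"    # same path and checksum
comm -23 "$work/a-md5" "$work/b-md5" > "$work/only-a"  # lines only in a-md5
comm -13 "$work/a-md5" "$work/b-md5" > "$work/only-b"  # lines only in b-md5

echo "Identical in both trees:"
cat "$work/same"
```

A file that exists at the same path in both trees but with different
contents shows up in both only-a and only-b, since its checksum lines
differ.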

I don't have the compiled version of dircmp, but I have a ksh
shell script version that is quite similar.
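
If dircmp is not at hand, a similar report can probably be had from
diff itself.  This is not dircmp, just a stand-in: the sketch assumes
GNU diff, and the demo trees and the $work and $report names are made
up for illustration.

```shell
#!/bin/sh
# Hypothetical demo trees; substitute your real A and B.
work=$(mktemp -d)
mkdir -p "$work/A" "$work/B"
echo same  > "$work/A/both.txt"
echo same  > "$work/B/both.txt"
echo one   > "$work/A/changed.txt"
echo two   > "$work/B/changed.txt"
echo onlya > "$work/A/only-a.txt"
echo onlyb > "$work/B/only-b.txt"

# -r recurse into subdirectories
# -q only say whether files differ, do not print the differences
# -s also report files that are identical
# diff exits with status 1 when the trees differ, which is expected here.
report=$(cd "$work" && LC_ALL=C diff -rqs A B) || true
printf '%s\n' "$report"
```

The report lists files found only in A, files found only in B, and for
files present in both whether they differ or are identical.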

Jon H. LaBadie                  jo...@jgcomp.com