I knocked this up, which may be of use. It symlinks dups to the originals,
but that's easy enough to change. It also treats symlinks as being
different from regular files.
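
As a rough illustration of that idea (not the script mentioned above), here
is a minimal sketch. It assumes GNU find, xargs, sha1sum and realpath, and
the function name link_dups is made up for the example. The first path in
sort order for each checksum is treated as the "original", and every later
file with the same checksum is replaced by a symlink to it. Because of
-type f, existing symlinks are never hashed or touched. It probably
misbehaves on names containing newlines or backslashes.

```shell
# Hypothetical helper: replace duplicate regular files under the given
# directories with symlinks to the first file that has the same SHA-1.
# Destructive, so try it on a scratch copy first.
link_dups() {
    find "$@" -type f -print0 |
    xargs -r0 sha1sum |
    LC_ALL=C sort |
    while IFS= read -r line; do
        hash=${line%% *}      # the 40 hex digits before the first space
        path=${line#*  }      # everything after the two-character separator
        if [ "$hash" = "$prev_hash" ]; then
            # Same checksum as the previous line: point this path at the original.
            ln -sf -- "$(realpath -- "$original")" "$path"
        else
            prev_hash=$hash
            original=$path
        fi
    done
}

# Usage (assumed directory names): link_dups foo bar
```

Note that all empty files share one checksum, so they would all end up
linked to a single empty file.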

On Jul 23, 2009 6:50 PM, "Ralph Corderoy" <ra...@inputplus.co.uk> wrote:


Hi Peter,

> My thoughts were to do an 'Ls -something' piped into a file, then
> perhaps if I could do a sort ...

What do you mean by `duplicate'?  If you mean you want to group all
files called `foo' together, regardless of their possible differing size
or content then

   find foo bar -type f -printf '%h %f\n' |
   rev | sort | rev |
   uniq -f 1 -D

will list the directory path and filename for all files under `foo' and
`bar' that occur more than once by name, e.g. README is a prime
contender.

It won't work well with paths or filenames with spaces in or other weird
characters, but then that's why you shouldn't have them.  :-)
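
If spaces in names can't be avoided, one possible variation is to print the
file name first, separated from its directory by a tab, and let awk pick out
the repeated names instead of uniq. This is only a sketch: it assumes GNU
find for -printf, the function name dup_names is invented here, and names
containing tabs or newlines will still confuse it.

```shell
# Hypothetical helper: list "name<TAB>directory" for every file name that
# occurs more than once under the given directories.
dup_names() {
    find "$@" -type f -printf '%f\t%h\n' |
    LC_ALL=C sort |           # byte-wise sort keeps identical names adjacent
    awk -F '\t' '
        { key = $1 "" }       # force a string comparison, so "1" != "01"
        key == prev { if (!shown) print held; print; shown = 1 }
        key != prev { shown = 0 }
        { prev = key; held = $0 }
    '
}

# Usage (assumed directory names): dup_names foo bar
```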

On the other hand, if you want to find files that almost certainly have
the same content regardless of their file name then

   find foo bar -type f -print0 |
   xargs -r0 sha1sum |
   sort | uniq -D -w 40

lists those.  Note, it won't realise that two files may be hard or
symbolically linked together.
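
To stop hard links being reported as duplicates of each other, one option is
to keep a single path per device and inode pair before hashing. A minimal
sketch, assuming GNU find, sort, xargs and sha1sum; the function name
dup_content is made up, which of the hard-linked paths survives is
arbitrary, and names containing newlines will break it. Symlinks are
already skipped by -type f.

```shell
# Hypothetical helper: list files with identical content, counting each
# inode only once so that hard links to one file are not reported.
dup_content() {
    find "$@" -type f -printf '%D:%i\t%p\n' |           # device:inode, tab, path
    LC_ALL=C sort -u -t "$(printf '\t')" -k1,1 |       # one line per inode
    cut -f2- |                                          # drop the inode column
    xargs -r -d '\n' sha1sum |
    LC_ALL=C sort | uniq -D -w 40                       # group by the 40-digit hash
}

# Usage (assumed directory names): dup_content foo bar
```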

Cheers,


Ralph.

-- 
Next meeting: Bournemouth, Wednesday 2009-08-05 20:00
Dorset LUG: http://dorset.lug.org.uk/
Chat: http://www.mibbit.com/?server=irc.blitzed.org&channel=%23dorset
List info: https://mailman.lug.org.uk/mailman/listinfo/dorset
