On Monday, January 30, 2017 at 6:22:49 PM UTC-6, Victor Zele wrote:
>
> I can write a shell script to check for PDF duplicates via MD5 sums, but
> no way to automate cleaning them out of the Mayan system/DB.
>
> Just an idea,
> Victor
>
Hi Victor,

By the way, which tools are best for checking for duplicates in this scenario?

For duplicates at the file system level there is a tool called "fdupes":
https://github.com/adrianlopezroche/fdupes

I just discovered it through an article on lxer.com. It checks MD5 sums to see whether a directory contains two or more identical files. It seems like a neat idea. The tool itself is 16 years old.

I wonder what your tool of choice for duplicates is.

Usage: fdupes [options] DIRECTORY...

 -r --recurse       for every directory given follow subdirectories
                    encountered within
 -R --recurse:      for each directory given after this option follow
                    subdirectories encountered within (note the ':' at
                    the end of the option, manpage for more details)
 -s --symlinks      follow symlinks
 -H --hardlinks     normally, when two or more files point to the same
                    disk area they are treated as non-duplicates; this
                    option will change this behavior
 -n --noempty       exclude zero-length files from consideration
 -A --nohidden      exclude hidden files from consideration
 -f --omitfirst     omit the first file in each set of matches
 -1 --sameline      list each set of matches on a single line
 -S --size          show size of duplicate files
 -m --summarize     summarize dupe information
 -q --quiet         hide progress indicator
 -d --delete        prompt user for files to preserve and delete all
                    others; important: under particular circumstances,
                    data may be lost when using this option together
                    with -s or --symlinks, or when specifying a
                    particular directory more than once; refer to the
                    fdupes documentation for additional information
 -N --noprompt      together with --delete, preserve the first file in
                    each set of duplicates and delete the rest without
                    prompting the user
 -I --immediate     delete duplicates as they are encountered, without
                    grouping into sets; implies --noprompt
 -p --permissions   don't consider files with different owner/group or
                    permission bits as duplicates
 -o --order=BY      select sort order for output and deleting; by file
                    modification time (BY='time'; default), status
                    change time (BY='ctime'), or filename (BY='name')
 -i --reverse       reverse order while sorting
 -v --version       display fdupes version
 -h --help          display this help message

regards
Lin

--
---
You received this message because you are subscribed to the Google Groups "Mayan EDMS" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
For more options, visit https://groups.google.com/d/optout.
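For reference, the MD5-based check discussed in this thread could look roughly like the sketch below. It is only an illustration, not Victor's script and not how fdupes is implemented: the function names (md5_of_file, find_duplicates) and the ".pdf" suffix filter are assumptions made for the example. It only reports duplicate files on disk and does not touch the Mayan database.

```python
import hashlib
import os
from collections import defaultdict


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large PDFs are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root, suffix=".pdf"):
    """Walk `root` recursively and group files ending in `suffix` by MD5 digest.

    Returns a list of groups; each group is a sorted list of paths whose
    contents hash to the same digest. Files without a duplicate are omitted.
    (Hypothetical helper for illustration only.)
    """
    by_digest = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(suffix):
                path = os.path.join(dirpath, name)
                by_digest[md5_of_file(path)].append(path)
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
```

Calling find_duplicates("/path/to/documents") returns one list of paths per set of identical files. Deleting anything is deliberately left out, since matching digests should be double-checked before files are removed.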
