You can find out if you have duplicated content using this little one-liner:
md5sum /var/spool/dl/data/* | awk '{print $1}' | \
sort | uniq -c | awk '{print $1}' | \
sort -n | uniq -c
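It prints a histogram of duplication counts: each output line gives the
number of distinct file contents, followed by how many copies of each exist.
On a system with no duplication it prints a single line: the file count
followed by "1".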
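
To make the output format concrete, here is a small sketch that runs the same pipeline on throwaway sample data. The function name dup_histogram, the temporary directory and the file names f1..f4 are made up for illustration, and it assumes GNU md5sum and mktemp are available. A final awk normalizes the padding that uniq -c adds.

```shell
#!/bin/sh
# Hypothetical helper (not part of any tool): for the files directly inside
# directory $1, print one line per duplication level, formatted as
# "<number of distinct contents> <copies of each>".
dup_histogram() {
    md5sum "$1"/* | awk '{print $1}' |
        sort | uniq -c | awk '{print $1}' |
        sort -n | uniq -c | awk '{print $1, $2}'
}

# Throwaway sample data: f1 and f2 are identical, f3 and f4 are unique.
tmp=$(mktemp -d)
printf 'a\n' > "$tmp/f1"
printf 'a\n' > "$tmp/f2"
printf 'b\n' > "$tmp/f3"
printf 'c\n' > "$tmp/f4"

result=$(dup_histogram "$tmp")
printf '%s\n' "$result"
# prints:
# 2 1
# 1 2

rm -rf "$tmp"
```

Here "2 1" means two contents exist exactly once, and "1 2" means one content exists as two copies. Note that, like the original one-liner, this only looks at plain files directly in the given directory.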
Would you share your statistics with us?
