On Thu, 15 Oct 2020, Todd Fleisher wrote:

> Do you mean largest files or largest keys within files? Either way, could you

I cannot analyse the dumps. (I started to write a parser but I could not
finish it.) So I simply check file sizes. The script computes the mean and
standard deviation of the file sizes and drops the most suspicious files
(those more than 4 standard deviations above the mean). Then it recomputes
the statistics for the remaining files. It iterates a few (10) times. Then
it suggests a command that lists the files to be deleted. Just copy & paste
after a visual check.

> share this script with me for my own knowledge and/or future use? I figured
> I'd start here with you off-list, but if you think it's helpful and are
> comfortable sharing it with the world feel free to reply on list.

Here is the code:

-----------------8<------------------8<----------------------
#!/bin/bash

dir=/var/lib/sks/dump

# Print "mean stddev count" for the sizes (in bytes) of the given files.
filestat () {
    local -a sizes=( $(stat --printf='%s ' "$@" ) )
    local count=${#sizes[*]}
    local totalsize=$(( $(echo ${sizes[*]} | tr ' ' '+') ))
    local mean=$(( $totalsize/$count ))
    local stddev=$( (
        echo m=$mean    # mean
        echo n=$count   # count
        echo s=0        # sum of (x-m)^2
        for s in ${sizes[*]}
        do
            echo "s += ($s-m)^2"
        done
        echo 'sqrt(s/n)'
    ) | bc)
    echo $mean $stddev $count
}

# Statistics over all dump files.
read mean stddev count < <(filestat $dir/*.pgp)
echo $mean $stddev $count

# Repeatedly drop files larger than mean + 4*stddev and recompute.
for a in $(seq 1 10) ; do
    maxsize=$(( $mean + 4*$stddev ))
    normalfiles=$(find $dir -maxdepth 1 -type f -name '*.pgp' -size -${maxsize}c)
    read mean stddev count < <(filestat $normalfiles)
    echo $mean $stddev $count
done

# List normal files first, then the suspicious ones, each sorted by size.
# xargs -r (GNU) avoids running ls on the current directory when find
# returns nothing.
(find $dir -maxdepth 1 -type f -name '*.pgp' -size -${maxsize}c | xargs -r ls -lsh | sort -h ;
 find $dir -maxdepth 1 -type f -name '*.pgp' -size +${maxsize}c | xargs -r ls -lsh | sort -h) | cat -n

# Suggested command that lists the files to be deleted.
echo "find $dir -maxdepth 1 -type f -name '*.pgp' -size +${maxsize}c"
-----------------8<------------------8<----------------------

Cheers

Gabor
--
E-mail = m-mail * c-mail ^ 2
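
The trimming loop described above can be tried on plain numbers, without a dump directory. What follows is only a rough sketch under stated assumptions, not the script from this message: the function names (isqrt, mean_of, stddev_of, trim_cutoff) and the sample sizes are made up for illustration, and the square root is computed with integer shell arithmetic instead of bc, which should truncate the same way bc does at its default scale of 0.

```shell
# Hypothetical helper: integer square root (floor) by Newton's iteration.
isqrt () (
    n=$1 x=$1
    y=$(( (x + 1) / 2 ))
    while [ "$y" -lt "$x" ]; do
        x=$y
        y=$(( (x + n / x) / 2 ))
    done
    echo "$x"
)

# Hypothetical helper: integer mean of the sizes given as arguments.
mean_of () (
    total=0
    for s in "$@"; do total=$(( total + s )); done
    echo $(( total / $# ))
)

# Hypothetical helper: integer population standard deviation of the sizes.
stddev_of () (
    m=$(mean_of "$@")
    sumsq=0
    for s in "$@"; do sumsq=$(( sumsq + (s - m) * (s - m) )); done
    isqrt $(( sumsq / $# ))
)

# Hypothetical helper: run the trimming step $1 times over the remaining
# arguments and print the final cutoff (mean + 4*stddev).
trim_cutoff () (
    rounds=$1; shift
    kept="$*"
    i=0
    while [ "$i" -lt "$rounds" ]; do
        # Word splitting of $kept is intended: it holds plain integers.
        cutoff=$(( $(mean_of $kept) + 4 * $(stddev_of $kept) ))
        # Re-filter the full list each round, as the find call does.
        kept=
        for s in "$@"; do
            if [ "$s" -lt "$cutoff" ]; then kept="$kept $s"; fi
        done
        [ -n "$kept" ] || break   # nothing left below the cutoff
        i=$(( i + 1 ))
    done
    echo "$cutoff"
)

# Made-up sample: 19 ordinary sizes and one oversized outlier.
sizes="100 100 100 100 100 100 100 100 100 100 110 110 110 110 110 110 110 110 110 5000"
trim_cutoff 10 $sizes   # prints 124
```

With this sample the first round gives a cutoff of 4613, which excludes only the 5000 entry; from the second round on the cutoff settles at 124. One outlier among n values can lie at most sqrt(n-1) standard deviations from the mean, so a 4-sigma rule can only flag a single outlier when there are more than 17 files.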