On Mon, Sep 23, 2002 at 02:06:51PM +1000, Robin Whittle wrote:
> Thanks for this Vincent.  You wrote:
> 
> gives results in bytes - but it is for blocks allocated.  So an empty
> directory on my system appears as 4096 bytes.  This is a useful
> statistic, but not the main one I want.
> 
> I don't know of a way in Linux / Unix to traverse subdirectories to
> count all the file sizes, without writing a script to do so.  Then, one
> might want to be careful about following symlinks to files outside that
> directory structure, and to directories.

This will tell you exactly how many bytes are being used in all the
files beneath <directory>:

find <directory> -type f -printf '%s\n' | awk '{ t+=$1 } END { print t }'

To follow symlinks, add '-follow' after the '-type f' in the above.
With find, you can also specify things like '-mount' to avoid crossing
into other filesystems, and '-maxdepth' to limit how deep it descends.
Try 'man find'.
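
As a rough sketch, here is the same sum wrapped in shell functions with
those limits added. The function names are made up for illustration, and
'-printf' assumes GNU find:

```shell
# Hypothetical helper: total size in bytes of the regular files under a
# directory. Assumes GNU find (-printf is not part of POSIX find).
sum_file_sizes() {
    local dir=$1
    # -mount: stay on the filesystem that holds $dir
    # -type f: regular files only; symlinks are not followed
    find "$dir" -mount -type f -printf '%s\n' |
        awk 'BEGIN { t = 0 } { t += $1 } END { print t }'
}

# Same sum, but descend at most $2 levels below the starting directory
# (files directly inside the directory are at level 1).
sum_file_sizes_depth() {
    local dir=$1 depth=$2
    find "$dir" -maxdepth "$depth" -mount -type f -printf '%s\n' |
        awk 'BEGIN { t = 0 } { t += $1 } END { print t }'
}
```

For example, 'sum_file_sizes ~/Maildir' prints one number, the total in
bytes.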


> On reflection, I realised that what I really wanted to know was how much
> each Maildir (mailbox from my point of view as a user) was taking up in
> the resulting tar.gz backup file.

The only way to know exactly how big the resulting .tar.gz will be is to
create it. Compression will vary depending on what you're compressing,
and the 'find' command above will not tell you how big the .tar will be,
because more information is stored than just the contents of the files
(e.g. the file names and paths).
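
If you only want the size of the uncompressed .tar, you can stream it to
wc instead of writing it to disk. A sketch, with a made-up function
name:

```shell
# Hypothetical helper: size in bytes of the uncompressed tar archive of a
# directory. The archive goes to stdout ("-f -") and is only counted,
# never stored.
tar_size() {
    tar -cf - "$1" 2>/dev/null | wc -c | tr -d ' '
}
```

tar writes a 512-byte header for every entry and pads each file out to a
multiple of 512 bytes, so a directory full of small messages comes out
noticeably bigger than the sum of the file sizes.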


> Ideally I would like to know for each Maildir:
> 
>  1 - The number of messages (though I can already find this via the
>      client).

find Maildir -type f | wc -l
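
Note that a Maildir also holds files that are not messages (Courier keeps
a few bookkeeping files next to the cur, new and tmp subdirectories), so
the count above can come out slightly high. A sketch that counts only
what is in cur/ and new/ of a single folder, ignoring subfolders; the
function name is made up:

```shell
# Hypothetical helper: count the messages in one Maildir folder.
# Messages live in cur/ and new/; tmp/ holds deliveries still in progress.
count_messages() {
    find "$1/cur" "$1/new" -maxdepth 1 -type f 2>/dev/null | wc -l | tr -d ' '
}
```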


>  2 - The space taken on disc - with the du command.
>
>  3 - The total size of the messages.

find Maildir -type f -printf '%s\n' | awk '{ t+=$1 } END { print t }'


>  4 - Its size after being tarred and gzipped.

Gotta create it to know.


> The last one is what I am really concerned with.  It can be done with:
> 
>    tar --totals -cz <directory> | wc -c 
> 
> This causes tar to pass its output through gzip and then to "wc" word
> count, with the -c option to show only the number of bytes, not lines
> and words.

Just tried this: The output doesn't quite match the actual tar.gz if
you create one from the same directory. If you're going to do things
this way, you might as well create a temporary file, look at the size,
then delete it.
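
Something along these lines would do it. This is only a sketch:
'tgz_size' is a name I made up, and it assumes mktemp is available:

```shell
# Hypothetical helper: exact size in bytes of the .tar.gz of a directory.
tgz_size() {
    local tmp
    tmp=$(mktemp) || return 1           # unpredictable temp file name
    tar -czf "$tmp" "$1" 2>/dev/null    # build the real archive
    wc -c < "$tmp" | tr -d ' '          # print its size in bytes
    rm -f "$tmp"                        # clean up
}
```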


> Or with the du command and more simply, without tar's "Bytes written"
> figure, which only approximates the total size of the files:
> 
>    du -mb --max-depth=0 <directory>
>    tar --totals -cz <directory> | wc -c 
>
> Now, I want to invent a Perl script to do this for all my Maildirs in
> alphabetical order, and create a nice little body of text of their name,
> their disk usage in 1024 byte blocks, and their tar-gzipped size, with
> commas for thousands and millions to make it easier for me to read. 
> Then, to make it easier to use when my mind is in email mode, rather
> than Linux command line mode, I could set up a Maildrop filter rule to
> look out for an email from myself with a special subject line.  That
> rule would run the Perl script and the resulting email would contain the
> report on mailbox sizes.

If your Maildirs are in a standard location, such as /home/<user>/Maildir,
you can do it with a simple shell script:

#!/bin/bash

# Temporary archive; mktemp avoids a predictable name in /tmp.
tmpfile=$(mktemp /tmp/compressedmaildir.XXXXXX) || exit 1
# Remove the temporary archive however the script ends.
trap 'rm -f "$tmpfile"' EXIT

for maildir in /home/*/Maildir; do
    [ -d "$maildir" ] || continue    # skip if the glob matched nothing
    echo "Report for ${maildir}"
    echo -n "    Total files: "
    find "$maildir" -type f | wc -l
    echo -n "    Total of all file sizes: "
    find "$maildir" -type f -printf '%s\n' | \
        awk 'BEGIN { t=0 } { t+=$1 } END {
          if (t<1024) print t,"bytes";
          else if (t<1048576) print t/1024,"kB";
          else if (t<1073741824) print t/1048576,"MB";
          else print t/1073741824,"GB";
        }'
    echo -n "    Total disk space used: "
    du -sh "$maildir" | cut -f1
    echo -n "    Size of compressed archive: "
    tar -czf "$tmpfile" "$maildir" 2>/dev/null
    ls -lh "$tmpfile" | awk '{ print $5 }'
    echo
done

exit 0
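
For the commas you asked for, a raw byte count can be piped through sed.
A sketch, with a made-up function name:

```shell
# Hypothetical helper: insert commas as thousands separators into an integer.
commas() {
    # Repeatedly put a comma before the last three digits of a run of
    # four or more digits, until no such run remains.
    printf '%s\n' "$1" |
        sed -e :a -e 's/\(.*[0-9]\)\([0-9]\{3\}\)/\1,\2/' -e ta
}
```

So 'commas 1234567' prints 1,234,567.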
