On Fri, Sep 04, 2015 at 10:20:01AM +0200, Adam Wolk wrote:
| Hi misc@
|
| I upgraded my mail server to an amd64 snapshot from Sep 2nd and found
| the server stuck delivering mail in the morning, with spamassassin
| churning at 90% CPU usage.
|
| A quick investigation led me to a huge bayes_toks file of 65.3G in
| /var/spampd/.spamassasin/.
|
| $ ls -alh
| total 4738352
| drwx------ 2 _spampd _spampd 512B Sep 4 10:00 .
| drwxr-xr-x 3 _spampd _spampd 512B Sep 3 15:57 ..
| -rw------- 1 _spampd _spampd 36B Sep 4 09:53 bayes.lock
| -rw------- 1 _spampd _spampd 9.8M Sep 3 22:52 bayes_seen
| -rw------- 1 _spampd _spampd 65.3G Sep 3 22:55 bayes_toks
|
| $ file bayes_toks
| bayes_toks: Berkeley DB 1.85 (Hash, version 2, native byte-order)
|
|
| Interestingly, I don't see that much space used with df (does anyone
| know why?):
You should read up on sparse files. Here's a quick trick from the
sparse files book of tricks:
# First we create a file 'bigfile' using dd:
[weerd@despair] $ dd if=/dev/zero of=bigfile bs=1048576 count=10 seek=1024
10+0 records in
10+0 records out
10485760 bytes transferred in 0.178 secs (58799094 bytes/sec)
# ls will tell us how big this file is:
[weerd@despair] $ ls -lh bigfile
-rw-r--r-- 1 weerd weerd 1.0G Sep 4 19:51 bigfile
# du will tell us how much space is in use by this file:
[weerd@despair] $ du -sh bigfile
10.1M bigfile
# cp is even better at the sparse files game:
[weerd@despair] $ cp bigfile bigfile2
# bigfile2 is the same as bigfile:
[weerd@despair] $ ls -lh bigfile2
-rw-r--r-- 1 weerd weerd 1.0G Sep 4 19:54 bigfile2
# No, really .. exactly the same:
[weerd@despair] $ md5 bigfile*
MD5 (bigfile) = 5ec6988d232a445bc40b9dca003b95f7
MD5 (bigfile2) = 5ec6988d232a445bc40b9dca003b95f7
# However, it uses a lot less disk space:
[weerd@despair] $ du -sh bigfile2
48.0K bigfile2
TL;DR: files with lots of emptiness (consecutive ranges of all-zero data)
are stored efficiently as "sparse files". ls reports the apparent size,
while du and df only count the blocks that are actually allocated.
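
If you want to check a suspicious file yourself, something along these
lines should do. It's only a rough sketch: sparse_report is a name I made
up for this mail, and it assumes a POSIX sh with wc, du and awk, on a
filesystem that supports holes. The allocated figure depends on the
filesystem and its block size, so don't expect identical numbers everywhere.

```shell
#!/bin/sh
# Hypothetical helper (not part of any package): print the apparent size
# and the allocated size of a file, both in KiB.
sparse_report() {
    f=$1
    # Apparent size: the byte count of the file, holes included.
    apparent_kb=$(( $(wc -c < "$f") / 1024 ))
    # Allocated size: the blocks the filesystem actually uses, per du.
    allocated_kb=$(du -k "$f" | awk '{print $1}')
    echo "$f: apparent=${apparent_kb}K allocated=${allocated_kb}K"
}

# Demo: 10 KiB of data written after a 1 MiB hole.
tmp=$(mktemp -d)
dd if=/dev/zero of="$tmp/bigfile" bs=1024 count=10 seek=1024 2>/dev/null
sparse_report "$tmp/bigfile"
rm -rf "$tmp"
```

When the allocated figure is much smaller than the apparent one, the file
is sparse, which is what the ls and df numbers for bayes_toks suggest.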
| $ df -h
| Filesystem Size Used Avail Capacity Mounted on
| /dev/sd0a 1008M 90.1M 868M 9% /
| /dev/sd0k 9.8G 80.3M 9.3G 1% /home
| /dev/sd0d 3.9G 118K 3.7G 0% /tmp
| /dev/sd0f 3.9G 1.0G 2.7G 28% /usr
| /dev/sd0g 1001M 212M 738M 22% /usr/X11R6
| /dev/sd0h 9.8G 572M 8.8G 6% /usr/local
| /dev/sd0j 3.9G 2.0K 3.7G 0% /usr/obj
| /dev/sd0i 2.0G 2.0K 1.9G 0% /usr/src
| /dev/sd0e 598G 4.3G 564G 1% /var
|
| I removed the file and disk usage dropped by 2.3G on /var.
|
|
| Has anyone experienced issues with spamassassin/spampd similar to the
| one reported above?
|
| p5-Mail-SpamAssassin-3.4.1p2 (installed)
| spampd-2.30p3 (installed)
|
| After deleting the file and restarting the service, processing a single
| email brought the DB to a reported size of 37.9M; a few emails later it
| is already reported as 113M. I have a hunch that it will bloat again
| really fast.
|
| Regards,
| Adam
|
--
>++++++++[<++++++++++>-]<+++++++.>+++[<------>-]<.>+++[<+
+++++++++++>-]<.>++[<------------>-]<+.--------------.[-]
http://www.weirdnet.nl/