Justin Mason wrote:

as a matter of interest, how much disk space does this database
of as-yet-untrained tokens take up?   It's something we've considered
implementing in SpamAssassin, but the disk space issue is an
important datum before considering it.

This is not for the faint of heart. dspam is a major application that makes very heavy use of its backend database (MySQL being the most common and fastest backend). On a freshly installed server loaded from a dump of my production database, the entire dspam database is 1.7GB, of which the token table is 266M/336M (data/index); I don't have a breakdown of how much of that is untrained versus trained tokens. For comparison, the signature table is 1.1G/9.0M.
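
To put those figures in proportion, here is a rough back-of-the-envelope sketch in Python. It uses only the rounded numbers quoted above and assumes 1G = 1024M; the helper names are made up for illustration and are not part of dspam or MySQL.

```python
# Rough arithmetic on the table sizes quoted in this message.
# Assumption: 1G = 1024M; all inputs are the rounded figures from the text.
MB_PER_GB = 1024


def table_total_mb(data_mb, index_mb):
    """Return the combined data + index size of one table, in MB."""
    return data_mb + index_mb


def share(part_mb, whole_mb):
    """Return part as a percentage of whole."""
    return 100.0 * part_mb / whole_mb


token_mb = table_total_mb(266, 336)                   # token table: data/index
signature_mb = table_total_mb(1.1 * MB_PER_GB, 9.0)   # signature table: data/index
database_mb = 1.7 * MB_PER_GB                         # whole dspam database

if __name__ == "__main__":
    for name, size in (("token", token_mb), ("signature", signature_mb)):
        print(f"{name} table: {size:.0f}M, {share(size, database_mb):.0f}% of the database")
```

On these numbers the token table (data plus index) comes to roughly a third of the database, with the signature table making up most of the rest.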


My production server, on the other hand, is at 7.2GB, which reflects the high-water mark (MySQL never shrinks the tables unless they are optimized manually).

John

--
John Peacock
Director of Information Research and Technology
Rowman & Littlefield Publishing Group
4501 Forbes Boulevard
Suite H
Lanham, MD  20706
301-459-3366 x.5010
fax 301-429-5748
