Doc Schneider wrote:
Duncan Findlay wrote:
On Wed, Mar 01, 2006 at 11:50:58PM -0600, Doc Schneider wrote:
is what I'm personally using. The machine is a dual 500 with a gig
of RAM. And perl 5.8.6 on it. Anyone have any ideas?
What size are these mailboxes?
Total size of the files? Or how many messages in each mbox?
Size of ham is 333 megs
Size of spam is 535 megs.
A bit over 100k messages total
spams:63731 hams:39385 give or take.
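
For anyone wanting to reproduce counts like these, a rough sketch follows. It assumes each corpus is a plain mbox file, and it counts the "From " separator lines, so the result is approximate ("give or take"): an unescaped body line that starts with "From " would be counted as an extra message. The function name count_mbox_messages and the two-message demo file are made up for illustration; they are not part of mass-check.

```shell
#!/bin/sh
# Count messages in an mbox file by counting "From " separator lines.
# Approximate: an unescaped body line starting with "From " inflates the count.
count_mbox_messages() {
    grep -c '^From ' "$1"
}

# Self-contained demo on a throwaway two-message mbox (hypothetical data).
demo_mbox=$(mktemp)
cat > "$demo_mbox" <<'EOF'
From alice@example.com Wed Mar  1 23:50:58 2006
Subject: first

body one

From bob@example.com Thu Mar  2 00:18:26 2006
Subject: second

body two
EOF

count_mbox_messages "$demo_mbox"   # prints 2
rm -f "$demo_mbox"
```

Against the corpus discussed here, that would be something like count_mbox_messages /home/masschecker/mail/spam, assuming that path is a single mbox file rather than a directory of them.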
My current checks are roughly 40k messages. I use --after '6 months'
on a much larger corpus, but that gets me down to 40k. (On my weekly
net-enabled runs I use --after '1 month')
It takes roughly 3 hours start to finish (including scanning the
corpus and rsyncing), this is on a 2.8 GHz P4 w/ 1GB RAM.
Suggestions:
- Get rid of --all; you could be hitting some giant messages and
burning a lot of CPU.
- Use -j2 since you have 2 processors... might as well use them.
- Trim your corpus (use --after).
The trouble is, most of my corpus is less than 6 months old. Should I try
--after '3 months'?
I'm going to get rid of --all too, and use -j 2 to see if that speeds
things up some.
Here's what I'm trying now:
./mass-check --progress -n -j 2 --after '3 months' \
ham:mbox:/home/masschecker/mail/ham \
spam:mbox:/home/masschecker/mail/spam
::sigh:: This is what happened.
status: starting scan stage now: 2006-03-02 00:18:26
archive-iterator: no messages to process
Any ideas?
--
-Doc
SA/SARE -- Ninja
12:40am up 39 days, 22:00, 16 users, load average: 0.46, 0.43, 0.42
SARE HQ http://www.rulesemporium.com/