Doc Schneider wrote:
Duncan Findlay wrote:
On Wed, Mar 01, 2006 at 11:50:58PM -0600, Doc Schneider wrote:
is what I'm personally using. The machine is a dual 500 with a gig of RAM, with Perl 5.8.6 on it. Anyone have any ideas?
What size are these mailboxes?

Total size of the files, or how many messages in each mbox?

Size of ham is 333 megs.
Size of spam is 535 megs.

A bit over 100k messages total:
spam: 63731, ham: 39385, give or take.
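
For anyone wanting to pull the same two numbers (file size and message count) from their own corpus, here is a minimal Python sketch. It is not part of mass-check; the function name mbox_stats is made up for illustration, and it assumes each path given is a single, well-formed mbox file.

```python
import mailbox
import os
import sys


def mbox_stats(path):
    """Return (size_in_bytes, message_count) for one mbox file."""
    size = os.path.getsize(path)
    # create=False: fail instead of silently creating an empty mailbox
    box = mailbox.mbox(path, create=False)
    try:
        # len() scans the file for "From " separator lines
        count = len(box)
    finally:
        box.close()
    return size, count


if __name__ == "__main__":
    # Usage: python mbox_stats.py FILE [FILE ...]
    for path in sys.argv[1:]:
        size, count = mbox_stats(path)
        print(f"{path}: {size / 1e6:.0f} MB, {count} messages")
```

On a large mbox the count takes a while, since the whole file has to be read once.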

My current checks are roughly 40k messages. I use --after '6 months'
on a much larger corpus, but that gets me down to 40k. (On my weekly
net-enabled runs I use --after '1 month')

It takes roughly 3 hours start to finish (including scanning the
corpus and rsyncing), this is on a 2.8 GHz P4 w/ 1GB RAM.
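
As a rough yardstick, those figures can be scaled to the corpus size quoted above. This is a back-of-the-envelope Python sketch only: it assumes run time grows linearly with message count and that a second job halves it, neither of which is guaranteed. The 3-hour figure also includes scanning and rsyncing and comes from a faster machine, so a dual 500 will likely do worse. The function name estimate_hours is made up for illustration.

```python
def estimate_hours(messages, known_messages=40_000, known_hours=3.0, jobs=1):
    """Scale a known run time linearly to a different corpus size.

    Assumes linear scaling with message count and ideal speedup
    across jobs; both are simplifications.
    """
    rate = known_messages / known_hours  # messages per hour
    return messages / rate / jobs


if __name__ == "__main__":
    total = 63_731 + 39_385  # spam + ham counts quoted in this thread
    print(f"{total} messages, 1 job:  about {estimate_hours(total):.1f} hours")
    print(f"{total} messages, 2 jobs: about {estimate_hours(total, jobs=2):.1f} hours")
```

At that rate the full 100k-message corpus would take somewhere under 8 hours with one job on the P4, which is why trimming with --after matters.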

Suggestions:

 - Get rid of --all; you could be hitting some giant messages and burning a lot of CPU.
 - Use -j2 since you have 2 processors; might as well use them.
 - Trim your corpus (use --after).


Trouble is, most of my corpus is less than 6 months old. Should I try --after '3 months'?

I'm going to get rid of --all too, and use -j 2 to see if that speeds things up some.
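
One way to judge a cutoff before committing to a long run is to count how many messages would survive it. The Python sketch below is an approximation, not what mass-check does internally: it looks at the Date header (mass-check, as far as I know, goes by received time), treats "3 months" as 90 days, and skips messages whose Date is missing or unparseable, which is common in spam. The function name count_newer_than is made up for illustration.

```python
import mailbox
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def count_newer_than(path, days, now=None):
    """Return (kept, total): messages with a Date header in the last `days` days."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)
    kept = total = 0
    box = mailbox.mbox(path, create=False)
    try:
        for msg in box:
            total += 1
            raw = msg["Date"]
            if raw is None:
                continue  # no Date header: counted in total, not in kept
            try:
                when = parsedate_to_datetime(str(raw))
            except (TypeError, ValueError):
                continue  # malformed Date header: same treatment
            if when.tzinfo is None:
                # Dates without usable zone info are treated as UTC
                when = when.replace(tzinfo=timezone.utc)
            if when >= cutoff:
                kept += 1
    finally:
        box.close()
    return kept, total
```

Running it with days=90 and days=180 on each mbox shows how much a tighter cutoff would actually trim.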



Here's what I'm trying now:

  ./mass-check --progress -n -j 2 --after '3 months' \
      ham:mbox:/home/masschecker/mail/ham \
      spam:mbox:/home/masschecker/mail/spam

--
 -Doc

 Penguins: Do it on the ice.
  12:16am  up 39 days, 21:36, 16 users,  load average: 0.23, 0.42, 0.49

 SARE HQ  http://www.rulesemporium.com/
