Hi,

I just ran rebuildspamdb with the new release, and the results are even 
worse. Before this I had a perfect corpus.

Steve

-----Original Message-----
From: assp@assp.local [mailto:assp@assp.local] 
Sent: Wednesday, September 12, 2012 1:50 PM
To: Steve Moffat
Subject: RebuildSpamDB - report from assp.isp.bm

File rebuildrun.txt follows:



Sep-12-12 13:18:25 RebuildSpamDB-thread rebuildspamdb-version 6.02 started in 
ASSP version 2.2.2(12256)

Sep-12-12 13:18:25 RebuildSpamDB will create a Hidden Markov Model!

Sep-12-12 13:18:25 RebuildSpamDB will include attachment-database-entries in to 
spamdb!

Sep-12-12 13:18:25 RebuildSpamDB will create unicode enabled databases.

Sep-12-12 13:18:25 RebuildSpamDB process all words as Sequence of UAX #29 
Grapheme Clusters.

Sep-12-12 13:18:25 RebuildSpamDB will use the ASSP_WordStem engine.

Sep-12-12 13:18:25 ---ASSP Settings---
Sep-12-12 13:18:25 Do Not Collect RedRe Messages: Enabled **Messages matching 
the RedRe will be removed from the corpus!**

Sep-12-12 13:18:25 Use Subject as Maillog Names: True
Sep-12-12 13:18:25 Maxbytes: 4000
Sep-12-12 13:18:25 RebuildFileTimeLimit: 1 5
Sep-12-12 13:18:25 RebuildFileTimeLimit: files will be moved away from the 
corpus, if there processing takes longer than 5 second(s) 

Sep-12-12 13:18:25 C:/assp/errors/spam
Sep-12-12 13:18:25 File Count:  319
Sep-12-12 13:18:25 Processing... errors/spam with 319 files
Sep-12-12 13:18:25 ignore and remove files older than Dec-17-09 12:18:25 in 
folder errors/spam
Sep-12-12 13:18:33 1 attachment/image entries processed
Sep-12-12 13:18:33 Imported Files:      317
Sep-12-12 13:18:33 Finished in 8 second(s)

Sep-12-12 13:18:33 C:/assp/errors/notspam
Sep-12-12 13:18:33 File Count:  113
Sep-12-12 13:18:33 Processing... errors/notspam with 113 files
Sep-12-12 13:18:33 ignore and remove files older than Dec-17-09 12:18:33 in 
folder errors/notspam
Sep-12-12 13:18:40 26 attachment/image entries processed
Sep-12-12 13:18:40 Imported Files:      111
Sep-12-12 13:18:40 Finished in 7 second(s)
Sep-12-12 13:18:40 warning: missing information for automatic corpus correction 
in file C:/assp/normfile - rerun the rebuild, if you see this warning the first 
time!

Sep-12-12 13:18:40 C:/assp/spam
Sep-12-12 13:18:40 File Count:  4,363
Sep-12-12 13:18:40 Processing... spam with 4,363 files
Sep-12-12 13:19:27 remove 
C:/assp/spam/Confirmation_of_changes_to_Boo--140013.eml WhiteList: 
'ba.custs...@contact.britishairways.com'
Sep-12-12 13:19:27 remove 
C:/assp/spam/Confirmation_of_changes_to_Boo--144011.eml WhiteList: 
'ba.custs...@contact.britishairways.com'
Sep-12-12 13:19:27 remove 
C:/assp/spam/Confirmation_of_changes_to_Boo--145936.eml WhiteList: 
'ba.custs...@contact.britishairways.com'
Sep-12-12 13:19:27 remove 
C:/assp/spam/Confirmation_of_changes_to_Boo--172792.eml WhiteList: 
'ba.custs...@contact.britishairways.com'
Sep-12-12 13:20:07 remove 
C:/assp/spam/FW_Time_Clarification_Walk_the--81794.eml WhiteList: 
'busbysu...@hotmail.com'
Sep-12-12 13:22:50 Removed White:       5
Sep-12-12 13:22:50 481 attachment/image entries processed
Sep-12-12 13:22:50 Imported Files:      4,356
Sep-12-12 13:22:50 Finished in 250 second(s)

Sep-12-12 13:22:50 C:/assp/notspam
Sep-12-12 13:22:50 File Count:  12,640
Sep-12-12 13:22:50 Processing... notspam with 12,000 files
Sep-12-12 13:42:28 2,022 attachment/image entries processed
Sep-12-12 13:42:28 Imported Files:      12,001
Sep-12-12 13:42:28 Folder contents exceeded 'MaxFiles'(12000). 
Sep-12-12 13:42:28 Finished in 1,178 second(s)

Sep-12-12 13:42:28 Rebuild processed 11.63 files per second.

Sep-12-12 13:42:28 Generating weighted Bayesian tuplets
Sep-12-12 13:42:38 start populating Spamdb with 175,796 records - Bayesian 
check is now disabled!
Sep-12-12 13:43:45 Finished populating Spamdb with 175,796 records - Bayesian 
check is now enabled!
Sep-12-12 13:43:45 done - Generating weighted Bayesian tuplets

Sep-12-12 13:43:45 Bayesian Pairs: 175,796 now in list

Sep-12-12 13:43:45 Generating consolidated Hidden-Markov-Model database from 
1,634,405 record model
Sep-12-12 13:45:16 HMM sequences: 800,876 now in list

Sep-12-12 13:45:16 generating Spamdb.helo records from 3,664 collected HELO's
Sep-12-12 13:45:16 cleaning old Spamdb.helo records
Sep-12-12 13:45:17 done - cleaning old Spamdb.helo records

Sep-12-12 13:45:17 HELO Blacklist: 3 new, 94 now in list

Sep-12-12 13:45:17 Spam Weight:    1,598,969
Sep-12-12 13:45:17 Not-Spam Weight:   4,554,517

Sep-12-12 13:45:17 Corpus norm: 0.3511 - (warning: extremely ham heavy)
Sep-12-12 13:45:17 Corpus confidence:   0.13526783
Sep-12-12 13:45:17 Recommendation: RebuildSpamDB will limit the number of used 
messages in your corpus. Excess files will be ingored.
Sep-12-12 13:45:17 Corpus norm should be between 0.6 and 1.4

Sep-12-12 13:45:17 Recommendation: You need more spam messages in the corpus.

Sep-12-12 13:45:17 starting auto correction for corpus - delete old ham files 
from notspam

Sep-12-12 13:45:22 info: starting cleanup for to much (old) files in folder 
C:/assp/notspam - will try to remove 40% of the files - will keep at least 4000 
files - will keep files younger than 14 days
info: deleted 1646 old files from folder C:/assp/notspam

Sep-12-12 13:45:22 Recommendation: You should reduce now MaxBytes to 2500!  

Sep-12-12 13:45:27 Start populating Hidden Markov Model. HMM-check is disabled 
for this time!
Sep-12-12 13:45:28 start populating Hidden Markov Model with 800,876 records!
Sep-12-12 13:49:06 Finished populating Hidden Markov Model with 800,876 records!
Sep-12-12 13:49:06 Finished populating Hidden Markov Model. HMM-check is now 
enabled again!

Sep-12-12 13:49:06 Total processing time: 1,841 second(s)

Sep-12-12 13:49:06 Total processing data: 567.41 MByte

Sep-12-12 13:49:06 building new GripList records and bounce report
Sep-12-12 13:49:06 processing Logfile C:/assp/logs/maillog.txt

Sep-12-12 13:49:11 skipping bounce report because 'DoNotCollectBounces' is 
switched ON

Sep-12-12 13:49:12 Uploading Griplist via Direct Connection
Sep-12-12 13:49:13 Submitted 2,910 bytes: 0 IPv6 addresses, 322 IPv4 addresses

Sep-12-12 13:49:13 Trashlist was saved to C:/assp/trashlist.db
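
The log does not say how the corpus norm is derived, but the reported 0.3511 matches the spam weight divided by the not-spam weight. Below is a minimal Python sketch under that assumption. The function names and the classification labels are illustrative and are not taken from ASSP; only the two weights and the 0.6 to 1.4 range come from the log.

```python
# Figures copied from the rebuild log above.
SPAM_WEIGHT = 1_598_969
NOT_SPAM_WEIGHT = 4_554_517

# Range the log recommends for the corpus norm.
NORM_MIN, NORM_MAX = 0.6, 1.4


def corpus_norm(spam_weight: int, not_spam_weight: int) -> float:
    """Assumed definition: spam weight divided by not-spam weight."""
    return spam_weight / not_spam_weight


def describe(norm: float) -> str:
    """Classify a norm against the recommended range (labels are illustrative)."""
    if norm < NORM_MIN:
        return "ham heavy"
    if norm > NORM_MAX:
        return "spam heavy"
    return "balanced"


norm = corpus_norm(SPAM_WEIGHT, NOT_SPAM_WEIGHT)
print(f"{norm:.4f} {describe(norm)}")
```

Run as written, this prints `0.3511 ham heavy`, which agrees with the warning in the log. Under the same assumption, the ratio moves toward the recommended range either by adding spam messages or by removing ham, which is what the auto correction step in the log attempts.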
_______________________________________________
Assp-test mailing list
Assp-test@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/assp-test
