Apologies,

I had the right bits copied and pasted into a Notepad window, but for 
some reason only half of them ended up in the message. The full log is 
below, minus any lines containing filenames with subjects in them. Both 
my spam and notspam folders currently hold just short of 15,000 messages.


2012-09-11 22:00:00 RebuildSpamDB-thread rebuildspamdb-version 6.01 
started in ASSP version 2.2.2(12255)

2012-09-11 22:00:00 RebuildSpamDB will create a Hidden Markov Model!

2012-09-11 22:00:00 RebuildSpamDB will create unicode enabled databases.

2012-09-11 22:00:00 RebuildSpamDB process all words as Sequence of UAX 
#29 Grapheme Clusters.

2012-09-11 22:00:00 RebuildSpamDB will use the ASSP_WordStem engine.

2012-09-11 22:00:00 ---ASSP Settings---
2012-09-11 22:00:00 Do Not Collect Messages with RedListed address: Enabled
**Messages with RedListed addresses will be removed from the corpus!**

2012-09-11 22:00:00 Do Not Collect RedRe Messages: Enabled
**Messages matching the RedRe will be removed from the corpus!**

2012-09-11 22:00:00 Use Subject as Maillog Names: True
2012-09-11 22:00:00 Maxbytes: 4000
2012-09-11 22:00:00 RebuildFileTimeLimit: 1 5
2012-09-11 22:00:00 RebuildFileTimeLimit: files will be moved away from 
the corpus, if there processing takes longer than 5 second(s)

2012-09-11 22:00:00 Trashlist cleaning finished, 30 of 106 files deleted

2012-09-11 22:00:00 /usr/local/assp/store/errors/spam
2012-09-11 22:00:00 File Count:    1,422
2012-09-11 22:00:00 Processing... store/errors/spam with 1,422 files
2012-09-11 22:00:00 ignore and remove files older than 2009-12-16 
21:00:00 in folder store/errors/spam
2012-09-11 22:02:22 Imported Files:    1,420
2012-09-11 22:02:22 Finished in 142 second(s)

2012-09-11 22:02:22 /usr/local/assp/store/errors/notspam
2012-09-11 22:02:22 File Count:    1,392
2012-09-11 22:02:22 Processing... store/errors/notspam with 1,392 files
2012-09-11 22:02:22 ignore and remove files older than 2009-12-16 
21:02:22 in folder store/errors/notspam
2012-09-11 22:04:43 Imported Files:    1,390
2012-09-11 22:04:43 Finished in 141 second(s)
2012-09-11 22:04:43 info: corpusnorm after processing store/errors/spam 
and store/errors/notspam is spamwords 996168/ hamwords 1886016 => 
0.528186399267026
2012-09-11 22:04:43 info: require 15163 files from folder store/spam to 
get a fine corpusnorm (1)

2012-09-11 22:04:43 /usr/local/assp/store/spam
2012-09-11 22:04:43 File Count:    15,163
2012-09-11 22:04:43 Processing... store/spam with 14,000 files
2012-09-11 22:35:28 Removed White:    2
2012-09-11 22:35:28 Imported Files:    14,001
2012-09-11 22:35:28 Folder contents exceeded 'MaxFiles'(14000).
2012-09-11 22:35:28 Finished in 1,845 second(s)
2012-09-11 22:35:28 info: require 6506 files from folder store/notspam 
to get a fine corpusnorm (1)

2012-09-11 22:35:28 /usr/local/assp/store/notspam
2012-09-11 22:35:28 File Count:    17,475
2012-09-11 22:35:28 Processing... store/notspam with 6,506 files
2012-09-11 22:48:45 Imported Files:    6,507
2012-09-11 22:48:45 Folder contents exceeded 'MaxFiles'(14000).
2012-09-11 22:48:45 Finished in 797 second(s)

2012-09-11 22:48:45 Rebuild processed 7.97 files per second. Good values 
are 12 files per second and higher. You can speed up the rebuild 
process, using a cached (>=128MB) IO-controller or a RAM-disk with at 
least 1.04 GBbyte for the folder '/usr/local/assp/tmpDB'.

2012-09-11 22:48:45 Generating weighted Bayesian tuplets
2012-09-11 22:49:22 start populating Spamdb with 360,515 records - 
Bayesian check is now disabled!
2012-09-11 22:51:16 Finished populating Spamdb with 360,515 records - 
Bayesian check is now enabled!
2012-09-11 22:51:16 done - Generating weighted Bayesian tuplets

2012-09-11 22:51:16 Bayesian Pairs: 360,515 now in list

2012-09-11 22:51:18 Generating consolidated Hidden-Markov-Model database 
from 5,252,212 record model
2012-09-11 22:56:39 HMM sequences: 2,553,143 now in list

2012-09-11 22:56:39 generating Spamdb.helo records from 7,212 collected 
HELO's
2012-09-11 22:56:59 cleaning old Spamdb.helo records
2012-09-11 22:57:01 done - cleaning old Spamdb.helo records

2012-09-11 22:57:01 HELO Blacklist: 52 new, 547 now in list

2012-09-11 22:57:01 Spam Weight:       7,281,890
2012-09-11 22:57:01 Not-Spam Weight:   4,460,847

2012-09-11 22:57:01 Corpus norm:    1.6324 - (warning: extremely spam heavy)
2012-09-11 22:57:01 Corpus confidence:    0.14082925
2012-09-11 22:57:01 Recommendation: RebuildSpamDB will limit the number 
of used messages in your corpus. Excess files will be ingored.
2012-09-11 22:57:01 Corpus norm should be between 0.6 and 1.4

2012-09-11 22:57:01 Recommendation: You need more not-spam messages in 
the corpus.

2012-09-11 22:57:01 starting auto correction for corpus - delete old 
spam files from store/spam

2012-09-11 22:57:02 info: starting cleanup for to much (old) files in 
folder /usr/local/assp/store/spam - will try to remove 40% of the files 
- will keep at least 4000 files - will keep files younger than 14 days

2012-09-11 22:57:02 Recommendation: You should increase now MaxBytes to 
6000!

2012-09-11 22:57:07 Start populating Hidden Markov Model. HMM-check is 
disabled for this time!
2012-09-11 22:57:07 start populating Hidden Markov Model with 2,553,143 
records!
2012-09-11 23:56:24 Finished populating Hidden Markov Model with 
2,553,143 records!
2012-09-11 23:56:24 Finished populating Hidden Markov Model. HMM-check 
is now enabled again!

2012-09-11 23:56:24 Total processing time: 6,984 second(s)

2012-09-11 23:56:24 Total processing data: 164.11 MByte

2012-09-11 23:56:24 building new GripList records and bounce report
2012-09-11 23:56:24 processing Logfile /usr/local/assp/maillog.txt
2012-09-11 23:56:33 processing Logfile /usr/local/assp/12-09-10.maillog.txt
2012-09-11 23:56:45 processing Logfile /usr/local/assp/12-09-09.maillog.txt
2012-09-11 23:56:47 processing Logfile /usr/local/assp/12-09-08.maillog.txt
2012-09-11 23:56:52 processing Logfile /usr/local/assp/12-09-07.maillog.txt

2012-09-11 23:56:56 skipping bounce report because 'DoNotCollectBounces' 
is switched ON

2012-09-11 23:56:56 Uploading Griplist via Direct Connection
2012-09-11 23:56:57 Submitted 3,369 bytes: 0 IPv6 addresses, 373 IPv4 
addresses

2012-09-11 23:56:57 Trashlist was saved to /usr/local/assp/trashlist.db
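
For anyone comparing their own rebuild output: the reported corpus norm in the log above appears to be simply Spam Weight divided by Not-Spam Weight (7,281,890 / 4,460,847 is roughly 1.6324). That is an inference from the figures, not something checked against the ASSP source. A minimal Python sketch under that assumption follows. The function names are made up for illustration, and the 0.6 to 1.4 range is taken from the "Corpus norm should be between" log line.

```python
import re

# Matches the two weight lines as they appear in the rebuild log, e.g.
# "2012-09-11 22:57:01 Spam Weight:       7,281,890"
WEIGHT_RE = re.compile(r"(Not-Spam Weight|Spam Weight):\s+([\d,]+)")


def parse_weights(log_text):
    """Return a dict mapping each weight label found in log_text to an int."""
    weights = {}
    for label, value in WEIGHT_RE.findall(log_text):
        weights[label] = int(value.replace(",", ""))
    return weights


def corpus_norm(spam_weight, ham_weight):
    """Assumed definition: ratio of spam weight to not-spam weight."""
    return spam_weight / ham_weight


def is_balanced(norm, low=0.6, high=1.4):
    """Range taken from the 'Corpus norm should be between 0.6 and 1.4' line."""
    return low <= norm <= high


if __name__ == "__main__":
    sample = (
        "2012-09-11 22:57:01 Spam Weight:       7,281,890\n"
        "2012-09-11 22:57:01 Not-Spam Weight:   4,460,847\n"
    )
    w = parse_weights(sample)
    norm = corpus_norm(w["Spam Weight"], w["Not-Spam Weight"])
    print(f"corpus norm {norm:.4f}, balanced: {is_balanced(norm)}")
```

Feeding it the two weight lines from the log reproduces the 1.6324 figure, which falls outside the recommended range.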


On 12/09/2012 09:39, Thomas Eckardt wrote:
> Same to you Colin,
>
> to verify how the new code is working I need at least all the output about
> all the folders and the resulting corpusnorm.
>
> Thomas
>
>
>
>
> From:    Colin <a...@lanternhosting.co.uk>
> To:      assp-test@lists.sourceforge.net,
> Date:    12.09.2012 10:28
> Subject: Re: [Assp-test] New version
>
>
>
> I am seeing the same although not quite as bad. It seems to be related
> to the notspam folder and less than half of the files in it being
> processed.
>
> Yesterday I had:
>
> 2012-09-10 23:15:39 Corpus norm:                 0.9749 - (very good -
> balanced)
> 2012-09-10 23:15:39 Corpus confidence:           1.00000000
>
> Now I have:
>
> 2012-09-11 22:04:43 /usr/local/assp/store/spam
> 2012-09-11 22:04:43 File Count:          15,163
> 2012-09-11 22:04:43 Processing... store/spam with 14,000 files
> 2012-09-11 22:35:28 Imported Files:              14,001
> 2012-09-11 22:35:28 Folder contents exceeded 'MaxFiles'(14000).
> 2012-09-11 22:35:28 Finished in 1,845 second(s)
> 2012-09-11 22:35:28 info: require 6506 files from folder store/notspam to
> get a fine corpusnorm (1)
>
> 2012-09-11 22:35:28 /usr/local/assp/store/notspam
> 2012-09-11 22:35:28 File Count:          17,475
> 2012-09-11 22:35:28 Processing... store/notspam with 6,506 files
> 2012-09-11 22:47:45 remove/usr/local/assp/store/notspam/--522681.eml
> corrected spam
> 2012-09-11 22:48:45 Imported Files:              6,507
> 2012-09-11 22:48:45 Folder contents exceeded 'MaxFiles'(14000).
> 2012-09-11 22:48:45 Finished in 797 second(s)
>
> All the best,
> Colin Waring.
>
> On 11/09/2012 20:49, Steve Moffat wrote:
>> Hi
>> I updated to the new release today and rebuildspamdb has ruined my
>> corpus confidence. Not too happy with that....
>> Sep-11-12 16:26:55 Spam Weight:                3,904,196
>> Sep-11-12 16:26:55 Not-Spam Weight:   1,950,092
>>
>> Sep-11-12 16:26:55 Corpus norm:             2.0021 - (warning: extremely spam heavy)
>> Sep-11-12 16:26:55 Corpus confidence:  0.06224349
>> Sep-11-12 16:26:55 Recommendation: RebuildSpamDB will limit the number of used messages in your corpus. Excess files will be ingored.
>> Sep-11-12 16:26:55 Corpus norm should be between 0.6 and 1.4
>>
>> Thanks
>> Steve
>> Steve Moffat
>> Operations Director
>> Optimum IT Solutions
>> Desk:   441 292 8849
>> Mobile: 441 292 8849
>> MSN IM: st...@optimum.bm
>> Web: http://www.optimum.bm
>>
>>
>> ------------------------------------------------------------------------------
>> Live Security Virtual Conference
>> Exclusive live event will cover all the ways today's security and
>> threat landscape has changed and how IT managers can respond. Discussions
>> will include endpoint security, mobile security and the latest in malware
>> threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
>> _______________________________________________
>> Assp-test mailing list
>> Assp-test@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/assp-test
>
