Should I clear out the collections and start over? It would be much 
easier to train ASSP with old known-good email if it would accept more 
than one attachment. What is preventing ASSP from processing multiple 
attachments?

Additionally, if I have this 3:1 ratio, should I get ASSP to collect 
only every 3rd spam? Once I hit maxfiles, though, it shouldn't 
matter, as old mail will just be rotated out, right?
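
To make the "every 3rd spam" idea concrete, here is a rough sketch of the arithmetic only. It is not ASSP code; the function names and the 9000/3000 counts are made up for illustration, and it assumes the goal is a spam-to-notspam ratio near 1:1.

```python
def keep_every_nth(spam_count, notspam_count):
    """Return n so that keeping every n-th spam brings the ratio near 1:1.

    Hypothetical helper for illustration; not part of ASSP.
    """
    if notspam_count <= 0:
        raise ValueError("need at least one notspam message")
    return max(1, round(spam_count / notspam_count))


def subsample(messages, n):
    """Keep the n-th, 2n-th, 3n-th, ... item of a list."""
    return messages[n - 1::n]


# Made-up counts with roughly the 3:1 ratio discussed in this thread.
n = keep_every_nth(9000, 3000)
kept = subsample(list(range(1, 10)), n)
```

With those made-up counts, n comes out as 3, so only every third collected spam would be kept.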

Why such an odd number as 18009 for maxfiles? And why should only 4k of 
the message be processed? I set it to collect ~50k so it would get all 
of every message. How is that worse?

David


Kevin wrote:
> David wrote:
>   > This is the contents of rebuildrun.txt
>   
>> Maxbytes: 50000
>> Maxfiles: 10000
>>     
>
> Your 'Maxbytes' is WAY too high; it should be "4096".
> Your 'Maxfiles' is too low; it should be "18009".
>
>   
>> I believe the norm should be closer to 1, but how can I do that? Also, 
>> what is odd is that I get this when I run the rebuildspamdb script
>>     
>
> You need more notspam. You have almost a 3/1 ratio of spam to notspam.
>
> I would really recommend reporting more notspam errors to ASSP and 
> possibly reducing the number of tests that collect spam.
>
> At the very least, disable the Bayesian tests until you can fix the 
> corpus. With it that out of whack, the problem will only get worse: the 
> Bayesian test will cause even more poisoning, starting a downward 
> spiral.
>
>   
>> What do the numbers after the directories mean, and why are they 
>> different in the script output and the txt file?
>>     
>
> The number is the count of files in the directory.
> There is a bug in the screen output of the rebuild script that causes 
> the numbers to differ. Rather than showing a separate count for each 
> folder, as the 'rebuildrun.txt' file does, the screen output is 
> actually a running count of the total files read.
>
> Kevin
>
> -------------------------------------------------------------------------
> This SF.net email is sponsored by DB2 Express
> Download DB2 Express C - the FREE version of DB2 express and take
> control of your XML. No limits. Just data. Click to get it now.
> http://sourceforge.net/powerbar/db2/
> _______________________________________________
> Assp-user mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/assp-user
>   
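
As a small illustration of the difference Kevin describes between per-folder counts and a running count, the sketch below converts one into the other. The three counts are invented for the example, and the function names are hypothetical; this is not taken from the rebuildspamdb script.

```python
from itertools import accumulate


def running_totals(per_dir_counts):
    """Turn per-directory file counts into a running total of files read."""
    return list(accumulate(per_dir_counts))


def per_directory_counts(running):
    """Recover per-directory counts by differencing a running total."""
    previous = [0] + running[:-1]
    return [cur - prev for prev, cur in zip(previous, running)]


# Invented counts for three folders, in the order they are read.
counts = [100, 250, 50]
screen_style = running_totals(counts)              # cumulative numbers
file_style = per_directory_counts(screen_style)    # back to per-folder
```

If the screen output really is cumulative, subtracting each number from the one after it should give back the per-folder figures in rebuildrun.txt.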

