I had been using version 0.3 of SpamBayes for a long time (Windows XP, Outlook Express), and it was working fairly well. I recently upgraded to 1.0.1, and now I get a ton of false positives, including the confirmation and welcome messages from this mailing list! Probably close to 20% of my valid emails are being marked as spam.
 
Does anyone have any ideas about how to fix this problem? It's worse now than if I had no filter, because I have to comb through every message marked as spam looking for non-spam! Please help!
 
More details:
 
Summary:
 
POP3 proxy running on 110, proxying to xxx.com:110.
Active POP3 conversations: 0.
POP3 conversations this session: 53.
Emails classified this session: 163 spam, 19 ham, 50 unsure.
Total emails trained: Spam: 1299 Ham: 3644
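
As a quick sanity check on the numbers above, the session and training counts can be turned into proportions. This is only arithmetic on the figures quoted in the summary; the variable names are made up for illustration and are not part of SpamBayes.

```python
# Counts copied from the summary above; names are illustrative only.
session = {"spam": 163, "ham": 19, "unsure": 50}
trained = {"spam": 1299, "ham": 3644}

session_total = sum(session.values())
session_share = {k: v / session_total for k, v in session.items()}
ham_to_spam_trained = trained["ham"] / trained["spam"]

for label, share in session_share.items():
    print(f"{label:>6}: {share:.1%} of {session_total} messages this session")
print(f"trained ham:spam ratio is about {ham_to_spam_trained:.2f}:1")
```

With these figures, about 70% of this session's mail was classified as spam and about 22% as unsure.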
 
Configuration:
 
(not using SMTP proxy)
 
Storage:
Cache messages: Yes
Suppress caching: No
Maximum size of cached messages: 0
Ham cutoff: 0.2
Spam cutoff: 0.9
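
For readers unfamiliar with the two cutoffs, here is a minimal sketch of how a score is bucketed with the values shown above. The function name and the exact handling of a score that lands precisely on a cutoff are assumptions for illustration, not SpamBayes code.

```python
HAM_CUTOFF = 0.2   # from the configuration above
SPAM_CUTOFF = 0.9  # from the configuration above

def bucket(score, ham_cutoff=HAM_CUTOFF, spam_cutoff=SPAM_CUTOFF):
    """Map a spam probability in [0, 1] to 'ham', 'unsure', or 'spam'.

    Hypothetical helper: boundary handling (< vs <=) is assumed.
    """
    if score < ham_cutoff:
        return "ham"
    if score > spam_cutoff:
        return "spam"
    return "unsure"

print(bucket(0.997016765123))  # the false positive discussed in this post
```

Anything scoring between 0.2 and 0.9 would land in the unsure bucket under this reading.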
 
Clues:
 
Here is the info from the Message Clues page (the "Clues" link) for one of the false positives. It is marked as 99.7% spam probability! It also appears to be counting the "spam," string in the Subject and To fields as part of the reason to consider it spam (the subject:spam and to:addr:spam clues at the bottom of the list), even though those were added by SpamBayes itself.
 

Spam probability: 99.70% (0.997016765123).

Clues for: spam,Welcome to the "Spambayes" mailing list (93)
 
Word                      Probability  Times in ham  Times in spam
*H*                       1.7e-005     -             -
*S*                       0.99405      -             -
sender:no real name:2**0  0.004644     48            0
remind                    0.034884     6             0
password,                 0.050562     4             0
xxxxxxxx                  0.091837     2             0
adjustments               0.155172     1             0
etc.),                    0.155172     1             0
list!                     0.155172     1             0
subject:mailing           0.155172     1             0
url:listinfo              0.155172     1             0
url:mailman               0.155172     1             0
subject: "                0.157739     61            4
subject:list              0.177272     53            4
reminder                  0.199865     12            1
is:                       0.248435     26            3
url:mail                  0.248925     9             1
options                   0.307251     13            2
post                      0.320238     42            7
from                      0.607329     466           257
know                      0.612518     172           97
there                     0.617636     144           83
month,                    0.617901     12            7
message                   0.622114     235           138
skip:i 10                 0.625768     176           105
url:org                   0.634305     71            44
how                       0.637903     132           83
unsubscribe               0.65723      32            22
once                      0.657664     48            33
to:addr:lunchclub.net     0.658645     955           657
page                      0.661866     60            42
header:Errors-To:1        0.667086     4             3
header:Return-Path:1      0.667277     1137          813
header:Subject:1          0.673037     1135          833
header:From:1             0.675923     1139          847
header:To:1               0.67644      1139          849
header:Date:1             0.676889     1138          850
about                     0.679786     161           122
with                      0.693328     408           329
get                       0.703141     239           202
subject:                  0.707308     781           673
email                     0.709556     203           177
change                    0.710765     34            30
to:2**1                   0.710965     18            16
that                      0.723317     312           291
(including                0.724572     4             4
make                      0.726513     99            94
button                    0.72863      6             6
and                       0.728874     534           512
instructions              0.732838     12            12
charset:us-ascii          0.738767     212           214
back                      0.739152     71            72
the                       0.742406     537           552
include                   0.742591     25            26
current                   0.745397     39            41
want                      0.748789     107           114
subject                   0.750232     38            41
although                  0.754605     7             8
will                      0.756049     245           271
header:Message-ID:1       0.763062     484           556
changing                  0.764858     4             5
also                      0.766808     85            100
visit                     0.769583     61            73
account                   0.77246      41            50
can                       0.775256     216           266
must                      0.776564     36            45
you.                      0.780946     76            97
header:MIME-Version:1     0.781444     512           653
this                      0.784279     333           432
via                       0.785055     35            46
your                      0.786259     368           483
mailing                   0.786413     21            28
you                       0.786524     423           556
at:                       0.795161     17            24
subject:the               0.80783      25            38
information               0.812949     90            140
switch                    0.824395     2             4
subscription              0.832119     3             6
from:no real name:2**0    0.835566     78            142
every                     0.837444     35            65
disable                   0.844828     0             1
unsubscribe.              0.853973     1             3
such                      0.854951     21            45
to:no real name:2**1      0.861546     14            32
proto:http                0.861931     333           742
url:net                   0.870269     51            123
ever                      0.874069     23            58
passwords                 0.908163     0             2
subject:,                 0.951673     1             11
body                      0.954103     2             19
subject:spam              0.965874     1             16
digest                    0.97619      0             9
to:addr:spam              0.97619      0             9
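
The per-word probabilities in the table appear to be consistent with a Robinson-style adjustment of the raw spam ratio, using the training totals from the summary (1299 spam, 3644 ham) and what are assumed here to be the default settings (unknown word strength 0.45, unknown word probability 0.5). Likewise, the reported spam probability matches (1 + S - H) / 2 computed from the *S* and *H* rows. The sketch below is a re-derivation under those assumptions, not the SpamBayes source, and the function names are made up for illustration.

```python
N_SPAM, N_HAM = 1299, 3644   # training totals from the summary
S_STRENGTH = 0.45            # assumed default "unknown word strength"
X_PRIOR = 0.5                # assumed default "unknown word probability"

def word_prob(ham_count, spam_count):
    """Adjusted spam probability for one clue from its training counts.

    Assumes at least one of the two counts is nonzero.
    """
    ham_ratio = ham_count / N_HAM
    spam_ratio = spam_count / N_SPAM
    raw = spam_ratio / (ham_ratio + spam_ratio)
    n = ham_count + spam_count
    # Shrink the raw estimate toward the prior when evidence is thin.
    return (S_STRENGTH * X_PRIOR + n * raw) / (S_STRENGTH + n)

def combined_score(s, h):
    """Final score from the *S* and *H* rows of the clue table."""
    return (1.0 + s - h) / 2.0

for clue, ham, spam in [("subject:spam", 1, 16), ("to:addr:spam", 0, 9),
                        ("adjustments", 1, 0)]:
    print(f"{clue:<14} {word_prob(ham, spam):.6f}")
print(f"score          {combined_score(0.99405, 1.7e-5):.6f}")
```

If this reading is right, the subject:spam and to:addr:spam clues score high simply because of their training counts: 1 ham vs. 16 spam, and 0 ham vs. 9 spam.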
 


_______________________________________________
[email protected]
http://mail.python.org/mailman/listinfo/spambayes
Check the FAQ before asking: http://spambayes.sf.net/faq.html
