Jay,
ASSP doesn't have a bias built into it against any particular word.
I believe your problem is that your Bayesian or HMM database is inaccurate,
and probably too immature to be used if the appearance of a single word
causes a rejection, or else the scoring and thresholds you've set aren't good.
A couple of things I would do:
1) Go through the mail log, find the incorrectly rejected messages, and
a) look at the log to see why they were rejected, and
b) copy them to the corrected not-spam corpus to train ASSP that it was
mistaken.
2) Consider assigning a negative score to the word that's causing the
problem in BombRe (the negative score offsets the penalty, so the net
effect is in the message's favor). Even -20 should be enough to let the
email slip through. That's a temporary fix until the corpus gets corrected.
-or-
Put this word in no processing for the time being.
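
For step 1, something like the following might help pull the rejected
messages out of the mail log. It is only a rough sketch: the regex is
modeled on the log lines quoted below (one entry per line, unwrapped),
find_rejections and count_reasons are made-up helper names, and your
log format may differ.

```python
import re
from collections import Counter

# Pattern modeled on the "[spam found]" line in the quoted log excerpt.
# Assumption: each log entry sits on a single line in the real file.
SPAM_FOUND = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<session>\S+) .*?\[spam found\] \((?P<reason>.*?)\)"
    r"(?:.*?-> (?P<saved>\S+?\.eml))?"
)

def find_rejections(lines):
    """Return one dict per '[spam found]' log line (hypothetical helper)."""
    hits = []
    for line in lines:
        m = SPAM_FOUND.search(line)
        if m:
            hits.append(m.groupdict())
    return hits

def count_reasons(hits):
    """Tally rejection reasons so the most frequent offender stands out."""
    return Counter(h["reason"] for h in hits)
```

You could feed it an open log file, e.g. find_rejections(open(path)),
then review each "saved" file and copy the false positives to the
corrected not-spam corpus.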
>
> Esteemed Colleagues:
> I have a recurring problem with ASSP: it discards important incoming
> mail that contains the word "tangerine". For example, if a client needs
> me to fly somewhere in an emergency, and I reply, "where can I get a
> tangerine airplane ticket?" or words to that effect, and the client
> replies to my mail, including my mail in his reply, the reply is apt
> to be discarded because it contains the word "tangerine", thus:
> 16-10-20.maillog.txt:2016-10-20 21:39:09 m1-17548-06726 [Worker_1] 209.85.220.170 <...@gmail.com> to: j...@m5.chicago.il.us Regex:BombRe 'PB 20: for tangerine'
> 16-10-20.maillog.txt:2016-10-20 21:39:09 m1-17548-06726 [Worker_1] [bombRe] 209.85.220.170 <...@gmail.com> to: j...@m5.chicago.il.us (bombRe 'tangerine')
> 16-10-20.maillog.txt:2016-10-20 21:39:09 m1-17548-06726 [Worker_1] 209.85.220.170 <...@gmail.com> to: j...@m5.chicago.il.us Message-Score: added 20 for Regex:BombRe 'PB 20: for tangerine' bombRe: 'tangerine', total score for this message is now 21
> 16-10-20.maillog.txt:2016-10-20 21:39:09 m1-17548-06726 [Worker_1] [bombRe] 209.85.220.170 <...@gmail.com> to: j...@m5.chicago.il.us [spam found] (Regex:BombRe 'PB 20: for tangerine' bombRe: 'tangerine') [{redacted}] -> /opt/assp/discarded/6726--6440.eml;
> 16-10-20.maillog.txt:2016-10-20 21:39:09 m1-17548-06726 [Worker_1] 209.85.220.170 <...@gmail.com> to: j...@m5.chicago.il.us [SMTP Error] 554 5.7.1 Delivery not authorized, message refused -- . (reason: Regex:BombRe 'PB 20: for tangerine' bombRe: 'tangerine')
> Now, bombRe is a good idea, I suppose, but I should be able to control
> it. How do I do that? The word "tangerine" does not appear anywhere in
> assp.cfg. In files/bombre.txt it appears only as "subject\: tangerineest"
> and that would not cause mail to be discarded that contains "tangerine"
> somewhere in its body. TANGERINE (all capitals) also appears in
> files/tlds-alpha-by-domain.txt but that too, if I am not mistaken, would
> not cause mail to be discarded that contains "tangerine" somewhere in its
> body. It also appears several places in files/optRE/blackListedDomains.txt
> -- e.g., "(?:quick)?usa|platform|tangerine|now|2u)\.biz" -- but that too,
> if I am not mistaken, would not cause mail to be discarded that
> contains "tangerine" somewhere in its body.
> It is possible, I suppose, that there is some utterly cryptic regular
> expression in some file that matches "tangerine" without actually
> containing the string "tangerine", but that would be utterly perverse and
> I refuse to believe that the universe is that malicious. And yet, my
> e-mails are unquestionably being discarded. How do I forever prevent
> that from happening?
> Thank you in advance for any and all replies. One more thing -- if
> you do reply, please replace the word "tangerine" with the word
> "tangerine", otherwise your reply to me is apt to be discarded. Thank
> you again.
>
> Jay F. Shachter
> 6424 N Whipple St
> Chicago IL 60645-4111
> (1-773)7613784 landline
> (1-410)9964737 GoogleVoice
> j...@m5.chicago.il.us
> http://m5.chicago.il.us
> "Quidquid latine dictum sit, altum videtur"
_______________________________________________
Assp-user mailing list
Assp-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/assp-user