http://bugzilla.spamassassin.org/show_bug.cgi?id=3847
------- Additional Comments From [EMAIL PROTECTED] 2004-09-30 18:55 -------
Subject: Re: Consider removing RFCI tests from SA 3.0
> We might use the RFCI DB in a
> way which is not recommended by the maintainers.
One could make the argument that nearly everything SA uses is being used in
a way not intended by the generators of that data. After all, the purpose
of spam is to bloat recipients' mailboxes, and to hopefully be read by
suckers. Using the characteristics of that spam to PREVENT it reaching
mailboxes and being read is clearly contrary to the original purpose of the
spammers generating the characteristics in the first place.
Thus, if someone complains that "I created
device/characteristic/list/website X for some purpose other than blocking
spam, and I'm unhappy with you using it for YOUR purposes just because it
happens to work," I think the only correct response is "thank you for your
input." Period. I think we could safely conclude that the spammers are
unhappy with us using the stuff they generate to block spam. Should we stop
using it and let the spam through?
However, Fred also has a valid point, and it gives me as much concern as it
does Fred. That is the whole concept of arguments about "I don't think we
should allow characteristic/device/domain/letter-combination X into the SA
rules, because that letter combination isn't Official with some organization
I don't know in some other country." I don't give a damn if it is Official,
offal, or foie-gras. There are only TWO valid questions to ask: 1) does it
help reduce spam? 2) does it slow things down more than it is worth?
The answer to question 1 is clear-cut and measurable, although it may change
with time. It is not open to interpretation, only physical measurement. It
has only two possible answers: yes or no. If the answer happens to be Yes,
then one can discuss the second question. That one becomes a personal
interpretation, and is also one that cannot be answered as a general case,
but only in specific measured scenarios, and then extrapolated.
Unfortunately there are no available statistics to determine how reliable
your extrapolation is going to be. This would cause me to err on the side
of including things that showed significant value in answer to the first
question, even if I thought that maybe the test was too slow. Someone else
might not think that.
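To make question 1 concrete, here is a minimal sketch of the kind of measurement I mean. Everything in it is an assumption for illustration: the function names, the hand-labeled corpus of (text, is_spam) pairs, and the idea that "helps" means the test hits spam more often than it hits ham. It is not how SA's own tools do it.

```python
def rule_hit_rates(messages, rule):
    """Return (spam_hit_rate, ham_hit_rate) for one test.

    messages: iterable of (text, is_spam) pairs from a hand-labeled corpus
              (hypothetical input format).
    rule:     callable taking the message text and returning True on a hit.
    """
    spam_total = ham_total = spam_hits = ham_hits = 0
    for text, is_spam in messages:
        hit = bool(rule(text))
        if is_spam:
            spam_total += 1
            spam_hits += hit
        else:
            ham_total += 1
            ham_hits += hit
    # Avoid dividing by zero when one class is missing from the corpus.
    spam_rate = spam_hits / spam_total if spam_total else 0.0
    ham_rate = ham_hits / ham_total if ham_total else 0.0
    return spam_rate, ham_rate


def helps_reduce_spam(spam_rate, ham_rate):
    """Assumed yes/no criterion: the test fires on spam more often than on ham."""
    return spam_rate > ham_rate


# Tiny made-up corpus, purely for illustration.
corpus = [
    ("buy pills now", True),
    ("cheap pills", True),
    ("meeting at noon", False),
    ("pills for grandma", False),
]
spam_rate, ham_rate = rule_hit_rates(corpus, lambda text: "pills" in text)
```

The answer can change with time, so the same measurement would have to be rerun against a fresh corpus now and then.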
> Maybe the DNS_FROM_RFC_POST should be limited to some upper border (the
> "reliability" which was discussed on the lists before) like 1.0 or 0.5
> though to ensure that it won't be scored too high by the algorithm
> accidentally.
On a completely different subject, this does bring up an interesting point.
The 'reliability' value, however it might be specified, probably covers two
domains:
1) There is probably a score range constraint, such as "no higher than" or
"no lower than", on the assigned score.
2) There is probably a score preference or multiplier, to give the test a
stronger or weaker score than it might otherwise receive.
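As a rough sketch of how those two domains might combine, assuming the multiplier is applied first and the range constraint second (that ordering, the function name, and the parameter names are all my own invention, not anything SA does today):

```python
def apply_reliability(raw_score, multiplier=1.0, floor=None, ceiling=None):
    """Adjust a score the algorithm assigned to a test.

    multiplier: preference factor; below 1.0 weakens the test, above 1.0
                strengthens it (hypothetical parameter).
    floor:      optional "no lower than" bound (hypothetical parameter).
    ceiling:    optional "no higher than" bound (hypothetical parameter).
    """
    score = raw_score * multiplier
    if floor is not None:
        score = max(score, floor)
    if ceiling is not None:
        score = min(score, ceiling)
    return score


# A test the algorithm scored at 2.4, capped at the suggested 1.0 border.
capped = apply_reliability(2.4, ceiling=1.0)
```

Clamping after the multiplier means the range constraint always wins, which seems to be the intent of an upper border.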
Loren