RE: [Declude.JunkMail] Effectiveness

Madscientist Sun, 06 Oct 2002 10:55:36 -0700

Apologies in advance for taking so much bandwidth here.
This is somewhat off-topic for the Declude list.


]_M,
]
]I'm still not sure I'm clear, what do you mean by captured and
]what constitutes a "reporting system".  92% is remarkable,

Some systems that are using Message Sniffer report their log files to us so
that we can add them to our database and develop statistics. In that
context, the total of messages tagged by Sniffer over the total of messages
passed through Sniffer on those systems is averaging more than 70%. This
means that 70% of all email traffic on average is spam - false positives
notwithstanding.
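
As a rough illustration of how that figure is calculated (a sketch only: the
per-system counts below are invented, and pooling the counts across systems
rather than averaging each system's ratio is an assumption):

```python
def tagged_fraction(tagged: int, total: int) -> float:
    """Share of all messages passed through the scanner that were tagged."""
    if total <= 0:
        raise ValueError("total must be positive")
    return tagged / total


# Hypothetical (tagged, total) counts taken from three systems' log reports.
reports = [(700, 1000), (1500, 2000), (330, 500)]

# Pool the counts so that busy systems weigh more than quiet ones.
overall = tagged_fraction(
    sum(tagged for tagged, _ in reports),
    sum(total for _, total in reports),
)
print(f"{overall:.1%} of all traffic was tagged")
```

Note that the denominator is all traffic, spam and non-spam alike, so this is
not a measure of how much of the spam was caught.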

]relative to other stats I've seen, what rate/amount of false
]positives does that same configuration generate?  Are you
]explaining the way a system behaves or just how your numbers are
]calculated?

On our local systems we have a very low apparent false positive rate...
something significantly less than 0.1%. However, we do not have any sound
scientific measurements in place for that rate - only anecdotal evidence in
that we do not get any reports of false positives beyond this rate. Also,
our system tends to match our filtering profile very well.

The 92% number means that if we take all of our spamtrap messages and pass
them through Message Sniffer it will tag 92% of those messages on average.
The messages that don't get tagged get added to our rule generation process
along with reports from our user base.
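
A minimal sketch of that measurement, assuming a stand-in tagger (the
function names, the sample messages and the toy matching rule are all
hypothetical and are not how Message Sniffer works internally):

```python
def capture_rate(trap_messages, is_tagged):
    """Return (fraction of trap messages tagged, list of messages missed)."""
    if not trap_messages:
        raise ValueError("need at least one spamtrap message")
    missed = [msg for msg in trap_messages if not is_tagged(msg)]
    rate = (len(trap_messages) - len(missed)) / len(trap_messages)
    return rate, missed


def toy_tagger(message: str) -> bool:
    # Stand-in for the real scanner: tags one known spam phrase only.
    return "free money" in message.lower()


# 92 copies of known spam plus 8 copies of a new, not yet covered variant.
trap = ["FREE MONEY now"] * 92 + ["brand new spam variant"] * 8
rate, missed = capture_rate(trap, toy_tagger)
print(f"captured {rate:.0%}; {len(missed)} messages go to rule generation")
```

Because every spamtrap message is spam by construction, the missed list is
exactly the material that new rules need to cover.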

Typically the 8% not captured is made up of multiple copies of new spam in
its early phases of deployment. We have been increasing our update rates to
compensate as our user base grows to support the extra effort.

More aggressive tests could eliminate much of this - but would also increase
the false positive rates on many systems - so we avoid them. For example, we
once had a rule that would capture any numbered web link - unfortunately we
discovered that a few legitimate lists will use numbered web links from time
to time - so now we only filter for abstracted, specific web links with good
results. There are many abstract behaviors and patterns that have been
shelved under similar circumstances. Most of them will be resurrected in the
customizable database as an optional add-in. For example, many school
systems could easily use filters like these without generating any false
positives.
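
To make the trade-off concrete, here is a sketch that assumes "numbered web
link" means a URL whose host is a raw dotted-quad number. That reading, both
patterns and the addresses (taken from the reserved documentation ranges) are
illustrative assumptions, not actual Sniffer rules:

```python
import re

# Broad rule: any URL whose host is a dotted-quad number.
BROAD = re.compile(r"https?://\d{1,3}(?:\.\d{1,3}){3}\b", re.IGNORECASE)

# Narrow rule: one specific spam link, abstracted over its varying path.
# The host 198.51.100.7 and the /offer/ prefix are made up for illustration.
NARROW = re.compile(r"https?://198\.51\.100\.7/offer/\S*", re.IGNORECASE)

spam = "Click http://198.51.100.7/offer/abc123 now"
legit = "List archive: http://192.0.2.10/archive/2002-10"

for label, text in (("spam", spam), ("legit", legit)):
    print(label, "broad:", bool(BROAD.search(text)),
          "narrow:", bool(NARROW.search(text)))
```

The broad pattern hits both messages, so the legitimate list mail becomes a
false positive; the narrow pattern only hits the known spam link.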

Spamtrap sources are designed to generate spam exclusively so that they can
be used for this purpose. It is important to note that these statistics say
nothing about false positives. As has been pointed out - it is
comparatively easy to capture spam. Doing so while keeping false positives
low is the real trick. Doing it for diverse systems makes it harder.

We do have a very low reported rate of false positives from our user base.
Typically less than 4 false positives reported for every 6 days of operation
across the entire userbase of more than 100 systems (so far). This reflects
what a "tuned" system's false positive rate can be. A "tuned" system is one
that also takes into account white-list entries as required for the needs of
that local system.

The chief error in this metric is that there is no control on how many false
positives occur that may not be reported. For example, in the course of a
week, Sniffer would have tagged more than a few million email messages as
Spam. If only 5 messages are reported as false positives, and 5 million or
more messages are tagged in the same period - then the calculated false
positive rate would be 0.0001% (one ten-thousandth of a percent!). This is
very close to what we perceive on our local system (anecdotal evidence
only). Although I'd love to be able to advertise that kind of false positive
rate, I find it hard to believe. It's a lot easier to believe that there are
a lot of false positives out there that are just not reported... Of course,
I could be wrong there too (what is a lot?)... The point is, since false
positives are so hard to measure scientifically it is nearly impossible for
anybody to really _know_ in anything other than a static test on qualified
data. And, of course, if we were given the data for the test we would be
able to tune the engine for a 100% accuracy rate (theoretically). The
real world never reflects these conditions.
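
The arithmetic above, and how much it depends on under-reporting, can be
sketched as follows. The reporting fractions are purely hypothetical, since
the real fraction of false positives that get reported is exactly what is
unknown:

```python
def false_positive_rate_percent(false_positives: float, tagged: int) -> float:
    """False positives as a percentage of all messages tagged as spam."""
    if tagged <= 0:
        raise ValueError("tagged must be positive")
    return 100.0 * false_positives / tagged


def adjusted_rate_percent(reported: int, tagged: int,
                          reporting_fraction: float) -> float:
    """Rate if only `reporting_fraction` of false positives were reported."""
    if not 0 < reporting_fraction <= 1:
        raise ValueError("reporting_fraction must be in (0, 1]")
    return false_positive_rate_percent(reported / reporting_fraction, tagged)


# Figures from the example: 5 reports against 5 million tagged messages.
naive = false_positive_rate_percent(5, 5_000_000)
print(f"taking the reports at face value: {naive:.4f}%")

# Hypothetical reporting fractions, to show the sensitivity.
for fraction in (1.0, 0.1, 0.01):
    rate = adjusted_rate_percent(5, 5_000_000, fraction)
    print(f"if {fraction:.0%} of false positives are reported: {rate:.4f}%")
```

Even if only one false positive in a hundred were reported, the rate in this
example would still be about 0.01%, but nothing in the data says what the
reporting fraction really is.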

In the real world, many of these performance metrics will improve
significantly as we begin to customize the rule base. Currently, the
one-size-fits-all system is designed not to be too strict for small ISPs
while still being strict enough for most small corporate offices. As a
result we must make some trade-offs to keep false positives low.

In the meantime, Declude offers all of the additional flexibility required
to tune this model for each local system, and where Sniffer is installed on
other systems (MDaemon, Postfix, etc...) there are also methods to adjust
for most local conditions. Declude is by far the most flexible we've seen so
far unless you happen to be a unix guru writing your own engine. ;-)

]
]Spam traps are something I have yet to create, whats the best way
]to implement them?  Specifically, do you distribute them across
]domains or does it matter and what are the best ways to infect
]them?  Do you use customer domains and if so, what happens if/when
]they leave your ISP/protection?

Implementing good spamtraps is a difficult, time consuming process that
requires both skill and secrecy. If done badly you will receive messages
that are not unsolicited and you may have spammers abuse your spamtraps and
mail systems to prevent you from using them ... all sorts of ugly things can
happen.

For obvious reasons I cannot disclose how we develop our spam traps nor
where they may be.

A few general things I can tell you in response to your questions.

A good spamtrap must "look" to the world like any real user who never
subscribed to any lists.

It is good to have spamtraps distributed across a wide range of domains -
preferably on networks that are not your own.

As for how to infect them... Think about this: You no doubt have an email
address that receives a significant amount of spam. What is it that you have
done with this account short of subscribing to lists and services?

A couple of publicly known ways that email addresses get picked up:

* Posts in news groups and otherwise publicly available message boards.
* Email addresses listed as contact info on web sites.

Another that is obvious but not widely discussed is that you can place your
spam trap in the path of a dictionary attack...

It's a lot like fishing... you have to be quiet and in the right place.

Hope this helps,
_M

]
]Thanks
]Dan
]
]
]
]On Saturday, October 5, 2002 19:18, Madscientist
]<[EMAIL PROTECTED]> wrote:
]>Perhaps you misunderstood.
]>More than 70% of ALL traffic is captured on average for reporting systems.
]>The base includes non-spam as well. In terms of a percentage of spam,
]>Declude has published statistics consistently showing 85% or more of all
]>incoming spam. On our system it is closer to 92% counting what comes from
]>all spam traps.
]>
]>Hope this clears things up.
]>_M
]>
]>]-----Original Message-----
]>]From: [EMAIL PROTECTED]
]>][mailto:[EMAIL PROTECTED]]On Behalf Of Dan Patnode
]>]Sent: Saturday, October 05, 2002 9:30 PM
]>]To: [EMAIL PROTECTED]
]>]Subject: [Declude.JunkMail] Effectiveness
]>]
]>]
]>]70%?  I believe the spam filter that comes free with Mac OS 10.2
]>]does that well by itself, though I haven't tested it for FPs yet.
]>]Has anyone else tried it?
]>]
]>]Dan
]>]
]>]
]>]On Friday, October 4, 2002 14:02, Madscientist
]>]<[EMAIL PROTECTED]> wrote:
]>]>We have similar circumstances in the email systems that we host. We
]>]>currently trap more than 80% of incoming messages as spam with our
]>]>Message Sniffer software. The average for all reporting systems is
]>]>something just over 70%.
]>]>
]>]>I think Declude w/ Message Sniffer is the way to go if you have an Imail
]>]>server. Of course I am biased - but there are others here who might back
]>]>me up. The demo is free if you want to try it
]>]>(http://www.sortmonster.com).
]>]>
]>]>Biased $0.02
]>]>
]>]>_M
]>]>
]>]>| -----Original Message-----
]>]>| From: [EMAIL PROTECTED]
]>]>| [mailto:[EMAIL PROTECTED]] On Behalf Of Keith Purtell
]>]>| Sent: Friday, October 04, 2002 3:27 PM
]>]>| To: Declude JunkMail (E-mail)
]>]>| Subject: [Declude.JunkMail] Newbie question about baseline
]>]>
]>]><snip>
]>]>
]>]>| However, when I check the server each morning, the spambox
]>]>| has at least 250 new messages, and one Monday I found 1,000.
]>]>| Bear in mind we only have approx 200 employees nationwide and
]>]>| serve a niche market. I've tried to be aggressive about
]>]>| automatically deleting certain incoming mail, especially
]>]>| using rules.ima. Hence the term "baseline" in my subject. Do
]>]>| more experienced postmasters find this much junk on their
]>]>| server and just delete it manually, or do they make better
]>]>| use of the software to automatically delete spam?
]>]>
]>]>---
]>]>[This E-mail was scanned for viruses by Declude Virus
]>]>(http://www.declude.com)]
]>]>
]>]>---
]>]>This E-mail came from the Declude.JunkMail mailing list.  To
]>]>unsubscribe, just send an E-mail to [EMAIL PROTECTED], and
]>]>type "unsubscribe Declude.JunkMail".  The archives can be found
]>]>at http://www.mail-archive.com.
]>]>
