I agree that this isn't going to be the best approach. Detecting ham
is simply more difficult:
1. New types of ham emerge more often than new types of spam. Spammers
generally stick to tried-and-true subjects while ham is all over the
place.
2. Ham is more personalized than spam. Everyone gets
Duncan Michael,
Thank you for the careful thought and detailed input. Please read my
Prototype Config email of yesterday afternoon. This is NOT, as it
appears, a weighted ham-finding rules approach but rather a non-weighted,
ham-tuned spam-finding rules approach. It's unconventional
Tom Allison wrote:
Personally, I think HTML email should be outright discarded from the start.
If you look at this argument presented by the OP then it reinforces the
idea that most ascii is ham and most html is spam. Therefore, reject
delivery of all html based email. Or to be more
On Monday 12 February 2007 13:27, Kelson wrote:
Tom Allison wrote:
Personally, I think HTML email should be outright discarded from the
start. If you look at this argument presented by the OP then it
reinforces the idea that most ascii is ham and most html is spam.
Therefore, reject delivery
Gene Heskett wrote:
On Monday 12 February 2007 13:27, Kelson wrote:
Now, if you can come up with another markup language for formatting
email...
[...]
* And you can get all the major email clients to use it for formatted
composition instead of HTML (so end users can still make their
Gene Heskett wrote:
With all due respect, that's 100% BS. MIME was invented to handle the
non-ascii stuff, and does it very well except for M$, who couldn't follow
a std rule with a loaded 44 magnum stuck in Bill's ear.
100% BS? So end-users don't like formatting in their messages? Email
--On Monday, February 12, 2007 12:50 PM -0800 Kelson [EMAIL PROTECTED]
wrote:
In other words, what can adequately replace text/html in the
non-plaintext multipart/alternative section such that HTML becomes
irrelevant for legitimate uses? Microsoft Word? PDF? RTF? Any of
those would be
Kelson wrote:
Tom Allison wrote:
Personally, I think HTML email should be outright discarded from the
start.
If you look at this argument presented by the OP then it reinforces
the idea that most ascii is ham and most html is spam. Therefore,
reject delivery of all html based email. Or to
On Sun, Feb 11, 2007 at 11:10:53PM -0500, Duncan Findlay wrote:
I've read most of the e-mails on this topic and I think the underlying
problem is that this method relies on knowing exactly which profiles
(i.e. combinations of rules) valid ham can hit.
After re-reading your message with your
On Mon, Feb 12, 2007 at 11:00:06PM -0500, Duncan Findlay wrote:
On Sun, Feb 11, 2007 at 11:10:53PM -0500, Duncan Findlay wrote:
I've read most of the e-mails on this topic and I think the underlying
problem is that this method relies on knowing exactly which profiles
(i.e. combinations of
Giampaolo Tomassoni wrote:
From: Miles Fidelman [mailto:[EMAIL PROTECTED]
Dan wrote:
I've developed a new approach to scoring that I want to 1) share with
everyone and 2) make into a working system that's as accurate as what
I've already built, but easier to use. First, the theory:
NEW
On Saturday 10 February 2007, Dan wrote:
On Feb 10, 2007, at 14:38, Mathieu Bouchard wrote:
How do you ever find FPs if you have so many TP to sort through
that it's not even worth sorting through FP+TP to find the FP?
IMHO, that'd be why we assume that mails are ham rather than assume
Long-time SpamAssassin users with a good memory might recall back in
SpamAssassin 2.4x, we included quite a few ham-targeting rules, such as
"was this sent using User-Agent: Mozilla?", "is this formatted like a
reply to a previous message?", "does it include headers from a mailing
list?" and "is it
On Feb 10, 2007, at 3:19 PM, Giampaolo Tomassoni wrote:
From: Tom Allison [mailto:[EMAIL PROTECTED]
Personally, I think HTML email should be outright discarded from
the start.
If you look at this argument presented by the OP then it
reinforces the idea
that most ascii is ham and most html is
From: tom [mailto:[EMAIL PROTECTED]
On Feb 10, 2007, at 3:19 PM, Giampaolo Tomassoni wrote:
From: Tom Allison [mailto:[EMAIL PROTECTED]
Personally, I think HTML email should be outright discarded from
the start.
If you look at this argument presented by the OP then it
reinforces
On Sat, Feb 10, 2007 at 08:22:41PM +, Nigel Frankcom wrote:
What do Theo, Matt & Co have to say? They've been doing this a lot
longer than us.
Unless I'm missing something, this approach is the standard "block
everything except for what we explicitly want to receive." Which is
great, if you
Subject: Re: A New Approach: Find the Ham
On Sat, 10 Feb 2007 15:14:56 -0500, Miles Fidelman
[EMAIL PROTECTED] wrote:
Dan wrote:
I've developed a new approach to scoring that I want to 1) share with
everyone and 2) make into a working system that's as accurate as what
I've already built
On 10 Feb 2007 at 11:43, Dan wrote:
I've developed a new approach to scoring that I want to 1) share with
everyone and 2) make into a working system that's as accurate as what
I've already built, but easier to use. First, the theory:
[...]
NEW SITUATION
Ham is now the tiniest minority of
Hey Dan,
I've read most of the e-mails on this topic and I think the underlying
problem is that this method relies on knowing exactly which profiles
(i.e. combinations of rules) valid ham can hit.
I see a number of problems:
- How do we actually generate the profiles that are to be considered
From: Dan [mailto:[EMAIL PROTECTED]
I've developed a new approach to scoring that I want to 1) share with
everyone and 2) make into a working system that's as accurate as what
I've already built, but easier to use. First, the theory:
SITUATION
In the beginning, all email was ham.
On Sat, 10 Feb 2007 20:52:17 +0100, Giampaolo Tomassoni
[EMAIL PROTECTED] wrote:
From: Dan [mailto:[EMAIL PROTECTED]
I've developed a new approach to scoring that I want to 1) share with
everyone and 2) make into a working system that's as accurate as what
I've already built, but easier
CHALLENGE
All filtering software is written to score for results that equal
spam - catch the bad
SOLUTION
Make filtering software score for results that equal ham - uncatch
the good.
Your thoughts?
How can this method spend less time and energy? Aren't you going to build a
mirrored
Dan wrote:
I've developed a new approach to scoring that I want to 1) share with
everyone and 2) make into a working system that's as accurate as what
I've already built, but easier to use. First, the theory:
NEW ASSUMPTION
All messages are spam unless x,y,z score says they're ham.
NEW
From: Tom Allison [mailto:[EMAIL PROTECTED]
CHALLENGE
All filtering software is written to score for results that equal
spam - catch the bad
SOLUTION
Make filtering software score for results that equal ham - uncatch
the good.
Your thoughts?
How can this method spend
One consideration is that spam getting through is never more than an
annoyance. Ham getting caught can be a big problem. So any kind of
deny-by-default system has to deal with how to respond to people sending you
mail that gets trapped and provide a way for the sender to get
approval. How
This would be easier to filter.
It would also be more adaptive to a statistical approach than a regex
approach.
Personally, I think HTML email should be outright discarded from the
start.
If you look at this argument presented by the OP then it reinforces
the idea that most ascii is ham
On Sat, 10 Feb 2007 15:14:56 -0500, Miles Fidelman
[EMAIL PROTECTED] wrote:
Dan wrote:
I've developed a new approach to scoring that I want to 1) share with
everyone and 2) make into a working system that's as accurate as what
I've already built, but easier to use. First, the theory:
NEW
From: Miles Fidelman [mailto:[EMAIL PROTECTED]
Dan wrote:
I've developed a new approach to scoring that I want to 1) share with
everyone and 2) make into a working system that's as accurate as what
I've already built, but easier to use. First, the theory:
NEW ASSUMPTION
All
Clarifications:
1) I'm not talking about generating new rules. Rules stay the same.
I'm describing a new scoring process only.
2) This would not be a replacement to SA, but an improvement. Just a
new way to process results already generated by SA. Ideally, this
would be a replacement
Is that the same as whitelisting? Maybe I do not understand, but a very
rigorous approach would be a whitelist methodology in which, once a new
account is created, the user sends email to everyone they want to
communicate with, and it 'autowhitelists' those addresses, so you can
only receive from those
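The auto-whitelisting idea described above could be sketched roughly like this. It is only an illustration under assumptions: the `AutoWhitelist` class and its method names are hypothetical and do not correspond to any SpamAssassin feature or API.

```python
class AutoWhitelist:
    """Hypothetical sketch: accept mail only from addresses the user has written to."""

    def __init__(self):
        # Normalized addresses the account owner has sent mail to.
        self._allowed = set()

    @staticmethod
    def _normalize(address):
        # Assumption: compare addresses case-insensitively, ignoring outer whitespace.
        return address.strip().lower()

    def record_outgoing(self, recipients):
        """Whitelist every recipient of an outgoing message."""
        for address in recipients:
            self._allowed.add(self._normalize(address))

    def accepts(self, sender):
        """Return True only if the sender was previously written to."""
        return self._normalize(sender) in self._allowed


wl = AutoWhitelist()
wl.record_outgoing(["Alice@example.org", "bob@example.org"])
print(wl.accepts("alice@example.org"))     # previously contacted
print(wl.accepts("stranger@example.net"))  # never contacted
```

Anyone the user has never written to is rejected outright, which is exactly the strictness (and the drawback) of the scheme.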
On Feb 10, 2007, at 12:14, Miles Fidelman wrote:
Dan wrote:
I've developed a new approach to scoring that I want to 1) share
with everyone and 2) make into a working system that's as accurate
as what I've already built, but easier to use. First, the theory:
NEW ASSUMPTION
All messages are
NEW SITUATION
Ham is now the tiniest minority of all email.
NEW ASSUMPTION
All messages are spam unless x,y,z score says they're ham.
NEW APPROACH
Block everything, then create rules to not catch what you do want.
i.e., build tests that target the spam (keeping all the tests you've
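A minimal sketch of the default-deny scoring idea in the quoted post ("all messages are spam unless the score says they're ham"). Everything here is an assumption made for illustration: the rule names, the `HAM_PROFILES` list and the `classify` function are hypothetical, not part of SpamAssassin or of Dan's actual system. "Profile" is used in the sense it has elsewhere in this thread (a combination of rules), and a message is assumed to count as ham when its rule hits include every rule of at least one profile.

```python
# Hypothetical ham profiles: each is a combination of rule names that,
# when all present, is taken as evidence that the message is ham.
HAM_PROFILES = [
    frozenset({"IN_REPLY_TO_KNOWN_THREAD", "PLAIN_TEXT_ONLY"}),
    frozenset({"MAILING_LIST_HEADERS"}),
]


def classify(rule_hits):
    """Return 'ham' only if the hits cover a whole ham profile, else 'spam'."""
    hits = frozenset(rule_hits)
    for profile in HAM_PROFILES:
        if profile <= hits:  # every rule in the profile was hit
            return "ham"
    return "spam"  # default-deny: nothing proved the message is ham


print(classify(["PLAIN_TEXT_ONLY", "IN_REPLY_TO_KNOWN_THREAD"]))
print(classify(["PLAIN_TEXT_ONLY"]))
```

The rules themselves are unchanged; only the final decision is inverted, so a message matching no profile is treated as spam. That is also where the objection raised in the thread bites: any legitimate message outside the known profiles becomes a false positive.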
On Sat, 10 Feb 2007, Dan wrote:
With Find the Ham, whitelisting is almost obsolete. When you find an FP,
How do you ever find FPs if you have so many TP to sort through that it's
not even worth sorting through FP+TP to find the FP? IMHO, that'd be why
we assume that mails are ham rather
On Feb 10, 2007, at 14:38, Mathieu Bouchard wrote:
How do you ever find FPs if you have so many TP to sort through
that it's not even worth sorting through FP+TP to find the FP?
IMHO, that'd be why we assume that mails are ham rather than assume
that they are spam.
I haven't found FP
Good point, but it will cause trouble UNLESS we find a way to recognize
ham 100% of the time. And it must be exactly 100% (99% won't be enough).
As other users said, with the current system, if we can filter 70-80% of the
spam, the remaining 20-30% will only be an annoyance, but ham will be delivered.
But with the
36 matches