After reading through this thread, I decided to formally state my own
definition of spam (since others are basically trying to do the same
thing).
A) It does not matter, one way or the other, if the message is
automatically generated or hand generated. If you don't want to wear
your fingers down to the bone typing a message to me that I won't even
accept or read, then don't type it. If I don't want the message
content in the first place, then I have no sympathy for the fingers
that typed it.
B) It doesn't matter to me if it was 1 message or 1 million. I am as
annoyed by the spams I receive once as I am by the spams I receive
over and over again. (though, see G about repetition)
C) If the message has a forged sender, and it is not a joke from a
known friend of mine, or a legitimate "whistle blower" type message for
a serious issue which needs an anonymous sender to protect them from
reprisals for the whistle blowing, it is spam. (for mailing lists
which alter the sender information to be the list itself, I do not
consider this to be a forgery) (in the case of a whistle blower, the
forgery must be to make it anonymous, instead of making it seem like it
came from someone else)
D) If the message has obscured the recipients from the headers, for any
reason or purpose other than to simplify the recipients of a formal
mailing list, it is spam. (so, if the actual recipients aren't listed
in the To/CC headers, then a mailing list to which that recipient
belongs must be in the To/CC headers, and the message must have
legitimately been sent to/through that mailing list)
(For all of you people who like to send an "undisclosed recipients"
message to all of your friends: yes, I'm calling you spammers, and I am
unapologetic about it. If you don't like it, don't send me email.)
E) If the message attempts to falsify any sort of prior relationship
between myself and the sender, it is spam.
F) If I ask you to stop sending me messages, and you continue to send me
messages through any means other than physical/snail mail from your
lawyer to my lawyer, your continued messages are both spam and
harassment.
G) If you send me the same general message more than 3 times, and I did
not request that you repeat the message, it becomes spam regardless of
what it may have been in the first place (historical note: this is the
closest definition to the original definition of spam on the net, which
had more to do with volume and repetition than content).
H) For this section, I shall define a new header: X-SpamOrHam
(the purpose of this section is to illustrate that "it is spam if
the message's true purpose and content is in any way obscured and not
plainly announced", but I am also announcing that I demand that such
purpose/content be announced, and announced in a particular manner that
suits me, as follows)
If a message fails any of these criteria, or falsifies any of these
answers, it is spam (or, in any of these cases, if the initial
condition is true, but the header doesn't exist):
0) If the message comes from a business, and it is in any
way speaking for a business, or on behalf of the products
or services of a business (as opposed to being a friend
of mine emailing me from their work account, about non-
business matters), even if the sending business is not
the same as the business being discussed, and the header
field does not match: /.* business.*/i
1) If it is an advertisement, business opportunity, or other
attempt to get money from me, and the content of that header
field does not match: /.* advertisement.*/i
2) If it is a business announcement from a company with which
I have an existing relationship in which I am the customer,
and the header field does not match: /.* customer.*/i
3) If it is a business announcement from a company with which
I have an existing relationship not covered by #2, and the
header does not match: /.* partner.*/i
4) If it is a business announcement from a company with which
I have no existing relationship, and the header does not
match: /.* unsolicited.*/i
5) If it is a mailing list for which I have performed a
double-opt-in (i.e. a _REAL_ opt-in, not a fake opt-in), and the
header field does not match: /.* confirmed-list.*/i
6) If it is a mailing list where only a single opt-in has
been performed (ie. a fake-opt-in), and the header field
does not match: /.* unconfirmed-list.*/i
7) If it is a mailing list where I have not performed any
opt-in at all, and the header field does not match:
/.* forced-list.*/i
8) If it is a message whose recipients come from a
purchased list, and the header does not match:
/.* purchased-recipient-list.*/i
9) If the message is an attempt to give me free stuff, or
free money, and you do not personally know me, and the
header field does not match: /.* give-away.*/i
10) If the sender, or the entity for which the message is
being sent, does not know me personally, and the
header field does not match: /.* i-dont-know-you.*/i
11) I reserve the right to modify this list over time.
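To make section H a bit more concrete, here is a minimal sketch in
Python of what the header check might look like. It is only an
illustration, under a few assumptions that are mine and not part of the
definition above: the condition names used as dictionary keys are
invented for the sketch, the patterns are assumed to be matched against
the whole "X-SpamOrHam: ..." header line (so the space after the colon
satisfies the leading space in each pattern), and the caller is assumed
to already know which conditions apply to the message, which is the
part no simple program can decide for itself.

```python
import re
from email import message_from_string

# Patterns copied from items 0-10 above. The dictionary keys are
# hypothetical condition names, invented for this sketch.
REQUIRED_TAGS = {
    "business": re.compile(r".* business.*", re.I),
    "advertisement": re.compile(r".* advertisement.*", re.I),
    "customer": re.compile(r".* customer.*", re.I),
    "partner": re.compile(r".* partner.*", re.I),
    "unsolicited": re.compile(r".* unsolicited.*", re.I),
    "confirmed-list": re.compile(r".* confirmed-list.*", re.I),
    "unconfirmed-list": re.compile(r".* unconfirmed-list.*", re.I),
    "forced-list": re.compile(r".* forced-list.*", re.I),
    "purchased-recipient-list": re.compile(r".* purchased-recipient-list.*", re.I),
    "give-away": re.compile(r".* give-away.*", re.I),
    "i-dont-know-you": re.compile(r".* i-dont-know-you.*", re.I),
}


def missing_tags(raw_message, conditions):
    """Return the conditions that are true but not announced in the header.

    `conditions` lists the keys of REQUIRED_TAGS that actually hold for
    this message; deciding that is left to the caller.
    """
    msg = message_from_string(raw_message)
    values = msg.get_all("X-SpamOrHam", [])
    # Assumption: rebuild one header line so the leading " " in each
    # pattern can match the space that follows the colon.
    line = "X-SpamOrHam: " + " ".join(values)
    return [c for c in conditions if not REQUIRED_TAGS[c].match(line)]


def is_spam(raw_message, conditions):
    """Spam under section H if any true condition goes unannounced."""
    return bool(missing_tags(raw_message, conditions))
```

For example, a message carrying "X-SpamOrHam: advertisement
unsolicited" passes if those are the only two conditions that hold, but
counts as spam if it also speaks for a business, because "business" is
not announced. A message with no X-SpamOrHam header at all fails as soon
as any one condition is true.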
Now, am I saying that "I expect my incoming email to conform to the
above"? No, I'm saying that that's how I logically/descriptively
define spam. It's not that it's an advertisement, it's that the
message tries to trick me into reading it by obscuring its source,
and/or trying to con me into thinking it's something other than it is,
and by not clearly identifying itself for what it is.
You see, if all advertisements clearly said "X-SpamOrHam:
advertisement", then I could easily reject all of those messages
(either at SMTP time, or in my mail filters, or whatever). I wouldn't
be bothered by them in the slightest, and I wouldn't even consider them
to be a problem. I wouldn't read them, but I wouldn't call them "a
problem" or "a plague".
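As a rough illustration of how easy that rejection would be, here is a
minimal sketch in Python of a filter that drops anything honestly
labelled as an advertisement. The function names, the set of unwanted
labels, and the idea that labels are separated by spaces or commas are
all assumptions made for the example; nothing above specifies a
separator.

```python
from email import message_from_string

# Hypothetical personal choice of labels to refuse.
UNWANTED = {"advertisement"}


def declared_tags(raw_message):
    """Collect the lower-cased labels announced in X-SpamOrHam headers."""
    msg = message_from_string(raw_message)
    tags = set()
    for value in msg.get_all("X-SpamOrHam", []):
        # Assumption: labels are separated by whitespace and/or commas.
        tags.update(value.lower().replace(",", " ").split())
    return tags


def should_reject(raw_message, unwanted=UNWANTED):
    """True if the message announces any label the recipient refuses."""
    return bool(declared_tags(raw_message) & set(unwanted))
```

The same test could run at SMTP time or in a mail filter; the point is
only that an honest label reduces the whole problem to a set lookup.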
What makes spam a problem, to me, is that I can't easily, plainly, and
programmatically identify all advertisements/etc. as exactly being that.
The senders are trying to obfuscate, mislead, and con the recipient
into accepting and reading the message. They try to do things which
prevent a program from easily identifying the content of the message.
They are trying to keep from being easily identified. That is what
makes a message a problem to me: that its true purpose, true source,
and true destination are being obscured. And that is what makes a
message "spam" to me: it represents a disruptive problem in my email.
If none of these things happen, I won't call the message "spam".
Spam Assassin helps us perform that identification, but it is not an
easy, plain, or precise process (otherwise Spam Assassin would be a
much smaller and faster program without the need for complex things
like a genetic algorithm and a Bayesian learning database). And, it is
ultimately there to overcome the lack of honesty and clarity on the
part of spammers. If they were honest, clear, and forthright then we
wouldn't need Spam Assassin.
I don't think anyone could write a program, short of a true natural
language and AI program, which could perform a true implementation of
my rules. Spam Assassin comes in 2nd because it attempts to identify
my preconditions (though, without specifically concluding any one of
those preconditions), and then we all assume that the honesty which is
embodied by the header is not present.
So, is Spam "UBE", or "UCE", or "any Commercial Email", or ... ?
To me, those acronyms/explanations completely miss the point and
problem. To me, Spam is basically "obfuscated email". Because if it
wasn't obfuscated, I could easily filter it, and then not care.