After reading through this thread, I decided to formally write down my own definition of spam (since others are basically trying to do the same).

A) It does not matter, one way or the other, if the message is automatically generated or hand generated. If you don't want to wear your fingers down to the bone typing a message to me that I won't even accept or read, then don't type it. If I don't want the message content in the first place, then I have no sympathy for the fingers that typed it.

B) It doesn't matter to me if it was 1 message or 1 million. I am as annoyed by the spams I receive once as I am by the spams I receive over and over again. (Though, see G about repetition.)

C) If the message has a forged sender, and it is not a joke from a known friend of mine, or a legitimate "whistle blower" type message for a serious issue which needs an anonymous sender to protect them from reprisals for the whistle blowing, it is spam. (for mailing lists which alter the sender information to be the list itself, I do not consider this to be a forgery) (in the case of a whistle blower, the forgery must be to make it anonymous, instead of making it seem like it came from someone else)

D) If the message has obscured the recipients from the headers, for any reason or purpose other than to simplify the recipients of a formal mailing list, it is spam. (so, if the actual recipients aren't listed in the To/CC headers, then a mailing list to which that recipient belongs must be in the To/CC headers, and the message must have legitimately been sent to/through that mailing list)

(For all of you people who like to send a "undisclosed recipients" message to all of your friends: yes, I'm calling you spammers, and I am unapologetic about it. If you don't like it, don't send me email.)
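Rule D is one of the few rules here that a program can check fairly directly. A minimal sketch in Python, assuming the caller already knows its own address and the list addresses it subscribes to (the function name `recipients_obscured` and its parameters are hypothetical, made up for illustration). It only looks at the To/Cc headers; it does not verify that the message really passed through the mailing list, which rule D also requires.

```python
from email import message_from_string
from email.utils import getaddresses

def recipients_obscured(raw_message, my_address, my_lists):
    """Return True if neither my address nor one of my lists is in To/Cc.

    my_address and my_lists are assumptions supplied by the caller;
    nothing here checks that a listed mailing list actually relayed
    the message.
    """
    msg = message_from_string(raw_message)
    # Collect every To and Cc header (there may be several of each).
    fields = msg.get_all("To", []) + msg.get_all("Cc", [])
    listed = {addr.lower() for _name, addr in getaddresses(fields)}
    allowed = {my_address.lower()} | {lst.lower() for lst in my_lists}
    # Rule D is violated when no acceptable recipient is listed at all.
    return listed.isdisjoint(allowed)
```

A message addressed to me, or to a list I belong to, passes; one with no To/Cc headers, or with only other people's addresses in them, is flagged.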

E) If the message attempts to falsify any sort of prior relationship between myself and the sender, it is spam.

F) If I ask you to stop sending me messages, and you continue to send me messages through any means other than physical/snail mail from your lawyer to my lawyer, your continued messages are both spam and harassment.

G) If you send me the same general message more than 3 times, and I did not request that you repeat the message, it becomes spam regardless of what it may have been in the first place (historical note: this is the closest definition to the original definition of spam on the net, which had more to do with volume and repetition than content).

H) For this section, I shall define a new header: X-SpamOrHam
(the purpose of this section is to illustrate that "it is spam if the message's true purpose and content is in any way obscured and not plainly announced", but I am also announcing that I demand that such purpose/content be announced, and announced in a particular manner that suits me, as follows)

If a message fails any of these criteria, or falsifies any of these answers, it is spam (or, in any of these cases, if the initial condition is true, but the header doesn't exist):

   0) If the message comes from a business, and it is in any
      way speaking for a business, or on behalf of the products
      or services of a business (as opposed to being a friend
      of mine emailing me from their work account, about non-
      business matters), even if the sending business is not
      the same as the business being discussed, and the header
      field does not match: /.* business.*/i
   1) If it is an advertisement, business opportunity, or other
      attempt to get money from me, and the content of that header
      field does not match: /.* advertisement.*/i
   2) If it is a business announcement from a company with which
      I have an existing relationship in which I am the customer,
      and the header field does not match: /.* customer.*/i
   3) If it is a business announcement from a company with which
      I have an existing relationship not covered by #2, and the
      header does not match: /.* partner.*/i
   4) If it is a business announcement from a company with which
      I have no existing relationship, and the header does not
      match: /.* unsolicited.*/i
   5) If it is a mailing list for which I have performed a
      double opt-in (i.e. a _REAL_ opt-in, not a fake opt-in), and
      the header field does not match: /.* confirmed-list.*/i
   6) If it is a mailing list where only a single opt-in has
      been performed (i.e. a fake opt-in), and the header field
      does not match: /.* unconfirmed-list.*/i
   7) If it is a mailing list where I have not performed any
      opt-in at all, and the header field does not match:
      /.* forced-list.*/i
   8) If it is a message whose recipients come from a
      purchased list, and the header does not match:
      /.* purchased-recipient-list.*/i
   9) If the message is an attempt to give me free stuff, or
      free money, and you do not personally know me, and the
      header field does not match: /.* give-away.*/i
  10) If the sender, or the entity for which the message is
      being sent, does not know me personally, and the
      header field does not match: /.* i-dont-know-you.*/i
  11) I reserve the right to modify this list over time.
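
The pattern checks in the list above can be sketched in Python. This is only a sketch under a big assumption: deciding whether a precondition is true (is this an advertisement? did the recipients come from a purchased list?) is the hard part, and here the caller is assumed to supply that judgment as a set of keyword names. The names `REQUIRED_PATTERNS` and `violates_header_rules` are hypothetical, and the patterns are applied to the whole header line, so the leading space in each pattern is the one after "X-SpamOrHam:" or between keywords.

```python
import re

# One keyword per precondition, taken from the list above.
REQUIRED_KEYWORDS = [
    "business", "advertisement", "customer", "partner", "unsolicited",
    "confirmed-list", "unconfirmed-list", "forced-list",
    "purchased-recipient-list", "give-away", "i-dont-know-you",
]

# Each pattern is the Python form of /.* keyword.*/i
REQUIRED_PATTERNS = {
    kw: re.compile(r".* " + re.escape(kw) + r".*", re.IGNORECASE)
    for kw in REQUIRED_KEYWORDS
}

def violates_header_rules(header_line, preconditions):
    """Return the preconditions whose pattern the header line fails.

    header_line: the full header line, e.g.
                 "X-SpamOrHam: business advertisement", or None if
                 the header is absent (which fails every precondition).
    preconditions: keyword names the caller has judged to be true for
                   this message (an assumption; nothing here detects them).
    """
    line = header_line or ""
    return sorted(
        p for p in preconditions if not REQUIRED_PATTERNS[p].match(line)
    )
```

A non-empty result means the message is spam by rule H. Because each pattern demands a space before the keyword, a header that says "unconfirmed-list" does not satisfy the "confirmed-list" requirement.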


Now, am I saying that "I expect my incoming email to conform to the above"? No, I'm saying that that's how I logically/descriptively define spam. It's not that it's an advertisement, it's that the message tries to trick me into reading it by obscuring its source, and/or trying to con me into thinking it's something other than it is, and by not clearly identifying itself for what it is.

You see, if all advertisements clearly said "X-SpamOrHam: advertisement", then I could easily reject all of those messages (either at SMTP time, or in my mail filters, or whatever). I wouldn't be bothered by them in the slightest, and I wouldn't even consider them to be a problem. I wouldn't read them, but I wouldn't call them "a problem" or "a plague".
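
To show how little work that filtering would be, here is a minimal sketch in Python using the standard `email` module. The function name `should_reject` and the `unwanted` parameter are hypothetical, and it assumes the header value is a space-separated list of labels, which is a simplification of the looser patterns in section H.

```python
from email import message_from_string

def should_reject(raw_message, unwanted=("advertisement",)):
    """Return True if the message honestly labels itself as unwanted.

    Assumes X-SpamOrHam holds space-separated labels (an assumption
    made for this sketch). Messages without the header are not
    rejected here; they fall through to whatever other filtering
    exists.
    """
    msg = message_from_string(raw_message)
    labels = msg.get("X-SpamOrHam", "").lower().split()
    return any(label in labels for label in unwanted)
```

That is the whole filter: no scoring, no training, no guessing. It only works if the sender tells the truth in the header.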

What makes spam a problem, to me, is that I can't easily, plainly, and programmatically identify all advertisements/etc. as being exactly that. The senders are trying to obfuscate, mislead, and con the recipient into accepting and reading the message. They try to do things which prevent a program from easily identifying the content of the message. They are trying to keep from being easily identified. That is what makes a message a problem to me: that its true purpose, true source, and true destination are being obscured. And that is what makes a message "spam" to me: it represents a disruptive problem in my email. If none of these things happen, I won't call the message "spam".

Spam Assassin helps us perform that identification, but it is not an easy, plain, or precise process (otherwise Spam Assassin would be a much smaller and faster program without the need for complex things like a genetic algorithm and a Bayesian learning database). And, it is ultimately there to overcome the lack of honesty and clarity on the part of spammers. If they were honest, clear, and forthright then we wouldn't need Spam Assassin.

I don't think anyone could write a program, short of a true natural language and AI program, which could perform a true implementation of my rules. Spam Assassin comes in 2nd because it attempts to identify my preconditions (though, without specifically concluding any one of those preconditions), and then we all assume that the honesty which is embodied by the header is not present.


So, is Spam "UBE", or "UCE", or "any Commercial Email", or ... ?

To me, those acronyms/explanations completely miss the point and problem. To me, Spam is basically "obfuscated email". Because if it wasn't obfuscated, I could easily filter it, and then not care.
