> From: "Theo Van Dinter" <[EMAIL PROTECTED]>
>
> There's a few things here.
>
> First, the body-mime headers aren't typically visible to the user via MUA,
> so they're not included in the data that the standard rules run against.

Normal headers in their full glory also aren't typically visible to the user
in most MUAs, only processed versions of select headers such as From and
Subject.  You typically have to go digging to find headers like Received,
and yet they are considered one of the more important spam indicators.  Lack
of normal visibility in an MUA is a poor justification for excluding
information in the mail from a spam classifier.


> Second, viruses and worms aren't spam, and body-mime headers have not
> historically provided enough useful anti-spam information to have a
> special ruletype to look at them.

You are assuming here that the only use for examining MIME headers is to
classify viruses and worms.  While that is the origin of this thread, I find
that rather irrelevant.  MIME headers are some of the more distinctive spam
signs one can spot from even a casual perusal of a few hundred spams, and in
many cases can be used (often in conjunction with header information) to
determine that a mail came from a particular spam tool, and is therefore in
all probability spam, without even looking at the body content.

I suspect that they have been 'historically useless' primarily because they
have not been accessible to rules, and/or nobody has considered them as
interesting as body content and received headers.


> Third, it's trivial to write a plugin to go through them if you really need
> them for something.  Something ala:
>
>   foreach my $p ($self->{msg}->find_parts([EMAIL PROTECTED]/octet-stream@)) {
>     my ($ctype, $boundary, $charset, $name) =
>       Mail::SpamAssassin::Util::parse_content_type($p->get_header("content-type"));
>     $name ||= '';
>     $name = lc $name;
>
>     return 1 if ($name =~ /\.(?:scr|bat|com|pif)$/);
>   }
>   return 0;

Well, it's trivial if your name is Theo or Justin or Daniel and you work with
SA code 10 hours a day every day.  In that case you probably know more Perl
than Larry Wall does, and you also know every routine inside of SA and know
exactly what it is to be used for.

For those of us that spend 16 hours a day building OSes or graphics
applications or accounting packages, and speak C++ or Java or some language
other than Perl, and merely hate spam enough to want to do something about
it, it is hardly a trivial undertaking to spend months learning a language
of surpassing crypticality, and then learn the undocumented (or otherwise)
innards of a major program, simply to be able to write a few simple rules.


Hopefully someone that has the time in their lives to learn both Perl and
the innards of SA will write some simple plugins to expose those parts of
the mail message that the rest of us can then use to write rules against
those parts.  I know I'd like to write a half dozen rules against various
MIME headers.  But I have only the vaguest guess about what that code
snippet above may be doing, and I suspect strongly that it isn't by itself
sufficient to be an actual plugin.  I don't think I have the spare year in
my life to figure out how to make something like that work just so I can
write a couple of rules to catch things with encodings of BitBitNum.
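
To make the kind of check concrete, here is a minimal sketch.  It is not
SpamAssassin code and not a plugin: it is Python using only the standard
email module, and the function name, the sample message and the idea of
flagging anything outside the five transfer encodings named in RFC 2045 are
all assumptions made for illustration.  "BitBitNum" is used purely as a
placeholder value.

```python
from email import message_from_string

# Transfer encodings named by RFC 2045. Treating everything else as "odd"
# is an assumption of this sketch, not an established spam rule.
STANDARD_ENCODINGS = {"7bit", "8bit", "binary", "quoted-printable", "base64"}


def odd_transfer_encodings(raw_message):
    """Return the non-standard Content-Transfer-Encoding values of all MIME parts."""
    msg = message_from_string(raw_message)
    odd = []
    # walk() visits the top-level message and then every nested MIME part,
    # so the per-part (body-MIME) headers are visible here.
    for part in msg.walk():
        value = part.get("Content-Transfer-Encoding")
        if value is None:
            continue
        token = str(value).strip().lower()
        if token not in STANDARD_ENCODINGS:
            odd.append(token)
    return odd


# Hypothetical two-part message; the second part carries the placeholder
# encoding mentioned above.
SAMPLE = (
    "From: sender@example.com\n"
    "Subject: test\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="XYZ"\n'
    "\n"
    "--XYZ\n"
    "Content-Type: text/plain\n"
    "Content-Transfer-Encoding: 7bit\n"
    "\n"
    "hello\n"
    "--XYZ\n"
    "Content-Type: text/html\n"
    "Content-Transfer-Encoding: BitBitNum\n"
    "\n"
    "<p>hello</p>\n"
    "--XYZ--\n"
)

if __name__ == "__main__":
    print(odd_transfer_encodings(SAMPLE))
```

The point is only that the per-part headers are cheap to enumerate once
something exposes them; the same walk could just as well collect
Content-Type parameters or attachment names.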

        Loren
