> On Wed, Oct 27, 2004 at 09:35:11AM -0400, Keith Hackworth wrote:
>> > I'm guessing you want PMS::get_decoded_stripped_body_text_array().
>>
>> Thanks, Theo - this may work for HTML-only messages, which might be good
>> enough for what I'm trying to do.  I need just the HTML version of the
>> email.  No attachments, just the HTML body.  If the 1st part is
>> multipart, I need the 1st HTML part.
>
> If you want to limit what you're looking at in that way, you'd need to
> access the Message object directly and use find_parts to grab just the
> first matching part you're interested in.  The PMS functions work on all
> text/* parts, and aren't limited to HTML.
>
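For anyone following along outside SpamAssassin, the "first HTML part only"
selection described above can be sketched with Python's standard email
package.  This is only an illustration of the idea, not the find_parts API
itself; the function name first_html_part is made up for the example, and it
assumes the raw message is available as a string.

```python
from email import message_from_string
from typing import Optional


def first_html_part(raw: str) -> Optional[str]:
    """Return the decoded body of the first text/html part, or None."""
    msg = message_from_string(raw)
    # walk() visits the message itself, then every sub-part, depth first,
    # so the first text/html hit is the first HTML part in document order.
    for part in msg.walk():
        if part.get_content_type() == "text/html":
            # decode=True undoes base64 / quoted-printable transfer encoding.
            payload = part.get_payload(decode=True)
            charset = part.get_content_charset() or "utf-8"
            try:
                return payload.decode(charset, errors="replace")
            except LookupError:
                # Unknown charset label: fall back to UTF-8.
                return payload.decode("utf-8", errors="replace")
    return None
```

Attachments and text/plain alternatives are skipped simply because their
content type does not match.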
>> Here's what I'm trying to do:
>> I'm trying to find invalid HTML tags and, if there are too many, bump the
>> SA score up a bit.  I noticed a bunch of messages come in with obfuscation
>> like "v-wo<notatag>rd" in the body of the HTML message, which shows up as
>> "v-word" in a normal webmail or Outlook email client.  I want to see how
>> many "notatag"s we're getting in a message.  I have code that does this
>> and it works fine, but it's just WAY too slow using PMS::get_message().
>
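The counting step described above might look roughly like the following
Python sketch.  It is not the code from this thread: the KNOWN_TAGS set is a
deliberately tiny, hypothetical whitelist (a real rule would need the full
list of HTML element names), and count_unknown_tags is an illustrative name.

```python
import re

# Small illustrative subset of HTML element names (assumption: a real
# check would use the complete list).
KNOWN_TAGS = {
    "a", "b", "body", "br", "div", "font", "head", "html",
    "i", "img", "p", "span", "table", "td", "tr",
}

# Matches an opening or closing tag and captures the tag name.
TAG_RE = re.compile(r"</?([A-Za-z][A-Za-z0-9]*)[^<>]*>")


def count_unknown_tags(html: str) -> int:
    """Count tags whose name is not in KNOWN_TAGS."""
    return sum(
        1 for name in TAG_RE.findall(html) if name.lower() not in KNOWN_TAGS
    )
```

A rule could then compare the count against a threshold before adding to the
score.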
> Yeah, that'll get you a bunch of stuff you really don't care about.
> get_decoded_stripped... is also not the right thing, since it will have
> stripped all the HTML tags.  I'd try get_decoded_body_text_array(),
> or since you're doing code anyway, just use find_parts and grab the
> [EMAIL PROTECTED]/[EMAIL PROTECTED] parts of the message.  You can then
> easily call decode() on them (object function) and get the raw HTML out.
>
> Just curious though, why limit yourself to invalid HTML tags?  Why not
> just target the html-tag-in-middle-of-word behavior?  And isn't this the
> same idea as the backhair code?
>
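Targeting the tag-in-middle-of-word behavior directly, without any list of
valid tags, could be sketched like this in Python.  This is a guess at the
general idea only, not the backhair implementation; the pattern and the name
count_mid_word_tags are assumptions made for the example.

```python
import re

# A tag counts as "mid-word" when a letter comes immediately before it
# and a letter comes immediately after it.  Valid and invalid tag names
# are treated alike.
MID_WORD_TAG_RE = re.compile(r"(?<=[A-Za-z])<[^<>\s][^<>]*>(?=[A-Za-z])")


def count_mid_word_tags(html: str) -> int:
    """Count tags that are wedged between two letters."""
    return len(MID_WORD_TAG_RE.findall(html))
```

Legitimate markup such as "un<b>believable</b>" also matches, so a pattern
like this would probably need a threshold rather than scoring on a single
hit.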
> --
> Randomly Generated Tagline:
> "Exactly what it should've been, give people what they expect.  The third
>  one can be clever."               - John Hughes about Home Alone 2
>

Wow!  I guess if I'd RTFM a little better, I'd have saved myself a lot of
trouble.  I didn't realize backhair did this already.

On to [EMAIL PROTECTED] [EMAIL PROTECTED], which catches
"[EMAIL PROTECTED] l|k3 th!s" in the subject.  Yes - I know chicken pox
does this already, but I have many custom rules built on my server for
this one and it seems to be much more accurate.

Thanks!
Keith
