Hi Sudip,

On Fri, 13 Sep 2002 11:35:26 +0545, you wrote:

> > HTML is put into emails in two main ways (that I know of), inline,
> > and via attachment. You can filter for inline by searching the
> > kludges for "Content-Type: text/html"...you cannot search for it if
> > it is done via attachment as TB! doesn't support filtering on
> > attachment headers.
>
> How do I know if the HTML portion is embedded inline or as attachment?

You have to look at the message source itself. If the top headers only
contain "Content-Type: text/html" then it is inline. If they contain
multipart/alternative or similar, then the HTML is sent as a separate
part (an attachment), although some mail clients hide that fact.

> ,----- [ Begin Quote ]
> | Subject: Message Subject
> | Reply-To: [EMAIL PROTECTED]
> | Content-Type: multipart/alternative;

This line is the hint: it tells you there are multiple parts.

> | boundary="part1_153.13f2cdab.2ab29e4b_boundary"

This tells you where each part starts. [1]

> | --part1_153.13f2cdab.2ab29e4b_boundary

Here is the start of the first part of the message.

> | Content-Type: text/plain; charset="ISO-8859-1"

This line tells the email client that the following text should be
read as plain text in the ISO-8859-1 character set.

> | Content-Transfer-Encoding: quoted-printable

Just the encoding type... nothing important here.

> | [Message Text]

I think you know what this bit is ;)

> | --part1_153.13f2cdab.2ab29e4b_boundary

This is the start of the second part. The boundaries are the same so
the mail client can work out where each part starts and stops.

> | Content-Type: text/html; charset=ISO-8859-1

This is the line that tells the email client the following text should
be rendered by an HTML engine of some type.

> | Content-Transfer-Encoding: quoted-printable

Encoding type again... not of any use in this example.

> | <HTML>

Your HTML version of the above message.

> If this isn't a part of kludges, what is it part of? Message body? But
> the filter setup to detect this in the text fails as well.

Notice the bit I marked [1] near the top. That is the end of the
kludges. The body starts right after the two consecutive CRLFs (a
blank line) that follow the last header. The remainder is body.

Now what I have noticed (Marck, Allie, or somebody that knows the full
story may be able to tell you) is that filters will only search
text/plain parts if there are attachments, and I think only the first
one they come to.
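
If you would rather check this with a script than by eye, here is a
rough sketch of the same test. It has nothing to do with TB! itself: it
assumes Python and its standard email module, and the names
html_placement and SAMPLE (plus the shortened boundary string) are made
up for illustration.

```python
from email import message_from_string


def html_placement(raw_message: str) -> str:
    """Return 'inline', 'part', or 'none' for where the HTML lives."""
    msg = message_from_string(raw_message)
    # A top-level text/html Content-Type means the whole body is HTML.
    if msg.get_content_type() == "text/html":
        return "inline"
    # Otherwise look through the MIME parts for a text/html section.
    if msg.is_multipart():
        for part in msg.walk():
            if part.get_content_type() == "text/html":
                return "part"
    return "none"


# Hypothetical message modelled on the quoted example above.
SAMPLE = (
    "Subject: Message Subject\n"
    "Content-Type: multipart/alternative;\n"
    ' boundary="part1_boundary"\n'
    "\n"
    "--part1_boundary\n"
    'Content-Type: text/plain; charset="ISO-8859-1"\n'
    "\n"
    "Plain version\n"
    "--part1_boundary\n"
    "Content-Type: text/html; charset=ISO-8859-1\n"
    "\n"
    "<HTML>HTML version</HTML>\n"
    "--part1_boundary--\n"
)

print(html_placement(SAMPLE))
```

A message whose only top-level header is "Content-Type: text/html"
would come back as 'inline', and a plain text message as 'none'.
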
I guess filters are restricted to the text/plain part to stop them
scanning through large attachments, which are also part of the body.
It can also reduce false positives. Try putting (if you can) a mail
filter on the mail server (via procmail is easiest) that filters the
body for the word "sex". You'll get a LOT of false positives caused by
attachments, because those letters turn up a few times by chance in
the apparently random base64 encoding.

I hope that kind of makes sense... if not I can try explaining it a
little more clearly, or maybe one of the others can put it in clearer
terms ;)

-- 
Jonathan Angliss
([EMAIL PROTECTED])

________________________________________________
Current version is 1.61 | "Using TBUDL" information:
http://www.silverstones.com/thebat/TBUDLInfo.html

