http://bugzilla.spamassassin.org/show_bug.cgi?id=2878
Summary: Identify when plain text and HTML are different in multipart/alternative
Product: Spamassassin
Version: 2.61
Platform: All
OS/Version: All
Status: NEW
Severity: enhancement
Priority: P5
Component: Rules
AssignedTo: [EMAIL PROTECTED]
ReportedBy: [EMAIL PROTECTED]

Recently, I have received a lot of spam with the multipart/alternative MIME type. There are some random words in the text/plain version and some other random words in the HTML version; the actual message is mainly carried by a linked image.

RFC 1521 (IIRC) says that the contents of the parts in multipart/alternative should be essentially the same, so it should make a pretty good rule if it were possible to compare the contents of the plain text and HTML versions to see whether the same words can be found in each. Comments and markup can be ignored, and only the words compared. I don't know what kind of algorithm would be used, but surely something exists for the purpose of comparing texts...?

I'm getting the same spam as in bug #2875, but I'll include a bit more of the most relevant stuff:

----ALT--TCEF13321957421304
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 8bit

swab companionway bagpipe elephant cucumber regal birmingham shuck soothe plethora arrogate phenolic lieu zombie cherub denote leland urania basket blight fairfield eat conqueror imposture

----ALT--TCEF13321957421304
Content-Type: text/html; charset=us-ascii
Content-Transfer-Encoding: 8bit

<HTML><HEAD>
<BODY>
<p>Fr</battlefront>ee Ca</courtyard>bleTV!N</histamine>o mo</bovine>re p</consumptive>ay!&</p>
<a href="http://www.2004hosting.net/cable/">
<img border="0" src="http://www.2004hosting.net/fiter3.jpg"></a>
nature borealis chastity cow debra checkpoint ascribe deferring tabulate marketeer lob eaton sophistry blockade eyepiece benthic exhibit oatmeal bacon keen buckwheat champagne turtleback intoxicant defunct crewcut
<BR>

Also quite common, and even easier to catch, are cases where the text/plain part is empty.
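
To make the proposed comparison concrete, here is a minimal sketch in Python. It is not SpamAssassin code (SpamAssassin rules are written in Perl), and every name in it (`words`, `html_words`, `alternative_overlap`) is hypothetical. It assumes a simple word-set overlap (Jaccard similarity) is a good enough measure of "essentially the same", and that words of three or more ASCII letters are the useful tokens; both are assumptions, not something the report specifies.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect visible text; tags, comments, script and style bodies are dropped."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self._skip = 0  # depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)


def words(text):
    """Return the set of lowercase words with at least three letters (assumed cutoff)."""
    return set(re.findall(r"[a-z]{3,}", text.lower()))


def html_words(html):
    """Extract the word set from the visible text of an HTML part."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    # Joining without a separator re-assembles words that were split by
    # bogus tags, e.g. "Fr</battlefront>ee" becomes "Free".
    return words("".join(parser.chunks))


def alternative_overlap(plain, html):
    """Jaccard overlap of the two word sets; 0.0 if either part has no words."""
    a, b = words(plain), html_words(html)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)
```

A rule built on this idea might flag a message when `alternative_overlap` falls below some threshold; the right threshold would have to be found by testing against real mail. An empty text/plain part yields 0.0 here, which covers the "even easier to catch" case as well.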