Use caution. The first part of the PDF file is common to many PDF files, so a filter keyed on it will produce false positives.


The PDFs we're seeing are essentially boilerplate through the first 12 lines (or so) of base64-encoded data. After that come some variable segments where the image display size and other parameters are randomized. Then you will find another consistent segment, which is the header of the encoded image file. Here again, that segment will cause false positives for any similar type of image. After the image header you will find the usual obfuscated image seen in ordinary image spam.
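
To illustrate why the opening of the file is risky to key on, here is a minimal Python sketch that decodes the base64 string from the filter line quoted further down in this thread. The helper name `decode_prefix` is hypothetical and only there for illustration; the string itself is copied from the quoted message.

```python
import base64

# Base64 prefix copied from the filter line quoted in this thread.
PREFIX_B64 = "JVBERi0xLjMgCjEgMCBvYmoKPDwKPj4KZW5kb2JqCjIgMCBvYmo"


def decode_prefix(b64_text: str) -> bytes:
    """Decode a base64 fragment, padding it to a multiple of 4 characters."""
    padded = b64_text + "=" * (-len(b64_text) % 4)
    return base64.b64decode(padded)


decoded = decode_prefix(PREFIX_B64)
print(decoded)
```

The decoded bytes begin with the `%PDF-1.3` version marker followed by generic object declarations, which is the kind of structure any PDF written by the same generator could share.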


Hope this helps,


_M


On Friday, June 29, 2007, 9:08:02 AM, David wrote:


>

Just a quick question: I have noticed that these PDF files all have the following string in the first line. Is there something I am missing? I have been using it to catch these spams. Any thoughts?

 

BODY                     3             PCRE                      (JVBERi0xLjMgCjEgMCBvYmoKPDwKPj4KZW5kb2JqCjIgMCBvYmo)

 

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Darin Cox

Sent: Thursday, June 28, 2007 12:49 PM

To: declude.junkmail@declude.com

Subject: Re: [Declude.JunkMail] Re: PDF spam detection

 

I was thinking regex wasn't available since I'm still using 2.0.6, but I forgot that I could use an external test with the regex support in the Windows findstr command.


Darin.

 

 

----- Original Message ----- 

From: Matt 

To: declude.junkmail@declude.com 

Sent: Thursday, June 28, 2007 12:37 PM

Subject: Re: [Declude.JunkMail] Re: PDF spam detection

 

Here's a piece of RegEx code that should work for blank bodies with a PDF and this particular spammer so long as he is forging Thunderbird:


-+[0-9]+\r\n(?:[a-zA-Z\-]+: [^\r]+\r\n)+(?:\r\n){1,}-+[0-9]+\r\n(?:[a-zA-Z\-]+: [^\r]+\r\n)*Content-Type: application/pdf;


Note that I have not tested this, but the code is in fact fairly simple and it should work.
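
Since the pattern above is untested, here is a rough Python sketch for trying it out. It uses Python's `re` module rather than the Declude PCRE engine, so behavior may differ there. The boundary string, the `build_message` helper, and the sample messages are all made up for illustration and only approximate the dashes-plus-digits boundary style the pattern expects.

```python
import re

# The pattern from the message above, copied verbatim.
PATTERN = re.compile(
    r"-+[0-9]+\r\n(?:[a-zA-Z\-]+: [^\r]+\r\n)+(?:\r\n){1,}"
    r"-+[0-9]+\r\n(?:[a-zA-Z\-]+: [^\r]+\r\n)*Content-Type: application/pdf;"
)

# Hypothetical boundary in the dashes-plus-digits style (an assumption).
BOUNDARY = "------------090807060504030201000908"


def build_message(body: str) -> str:
    """Build a synthetic two-part MIME body with CRLF line endings."""
    return (
        f"--{BOUNDARY}\r\n"
        "Content-Type: text/plain; charset=ISO-8859-1; format=flowed\r\n"
        "Content-Transfer-Encoding: 7bit\r\n"
        "\r\n"
        f"{body}\r\n"
        f"--{BOUNDARY}\r\n"
        "Content-Type: application/pdf;\r\n"
        ' name="sample.pdf"\r\n'
        "Content-Transfer-Encoding: base64\r\n"
        "\r\n"
        "JVBERi0xLjMg\r\n"
        f"--{BOUNDARY}--\r\n"
    )


# An empty text part should match; a text part with content should not.
blank_hit = PATTERN.search(build_message("")) is not None
text_hit = PATTERN.search(build_message("Hello there")) is not None
print(blank_hit, text_hit)
```

Note that the pattern only tolerates a completely empty text part. A body containing spaces or tabs would not match, because `(?:\r\n){1,}` accepts line breaks only.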


Matt





Darin Cox wrote: 

So far all that I've seen have a blank body with the pdf attachment.

 

Anyone have any ideas as to how to test for a blank body, or one with only whitespace characters?  The new PCRE function can do it, but we're still on 2.0.6 at the moment, waiting until IMail 2006.21 comes out and passes testing.

 

I'm thinking a blank body test combined with PDF attachment detection should result in very few FPs.  FPs are still possible, but hopefully this is enough to hold us over until a better detection method can be found.
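
As one way to picture the combined test, here is a minimal Python sketch using the standard `email` package. It is not a Declude test and would have to run as an external program. The function name `is_blank_body_with_pdf`, the `make_sample` helper, and the sample message are hypothetical.

```python
from email import message_from_string


def is_blank_body_with_pdf(raw: str) -> bool:
    """True if every text part is empty or whitespace and a PDF is attached."""
    msg = message_from_string(raw)
    has_pdf = False
    text_is_blank = True
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no body of their own
        ctype = part.get_content_type()
        if ctype == "application/pdf":
            has_pdf = True
        elif ctype.startswith("text/"):
            payload = part.get_payload(decode=True) or b""
            if payload.strip():
                text_is_blank = False
    return has_pdf and text_is_blank


def make_sample(body: str) -> str:
    """Build a small synthetic message with one text part and one PDF part."""
    return (
        "From: sender@example.com\n"
        "MIME-Version: 1.0\n"
        'Content-Type: multipart/mixed; boundary="XYZ"\n'
        "\n"
        "--XYZ\n"
        "Content-Type: text/plain\n"
        "\n"
        f"{body}\n"
        "--XYZ\n"
        'Content-Type: application/pdf; name="sample.pdf"\n'
        "Content-Transfer-Encoding: base64\n"
        "\n"
        "JVBERi0xLjMg\n"
        "--XYZ--\n"
    )


print(is_blank_body_with_pdf(make_sample("  \t ")))
print(is_blank_body_with_pdf(make_sample("Quarterly report attached.")))
```

Parsing the MIME structure instead of matching raw text means whitespace-only bodies are caught as well, at the cost of running an external process per message.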


Darin.

 




---

This E-mail came from the Declude.JunkMail mailing list. To

unsubscribe, just send an E-mail to [EMAIL PROTECTED], and

type "unsubscribe Declude.JunkMail". The archives can be found

at http://www.mail-archive.com


