Hi,

On a different note concerning images: what about having the email filter log
the possibility that an image contains hidden data (i.e. a steganography test)?

I already log possible text (I count alphanumeric chars in the OCR output):

+header         SPAMPIC_ALPHA_1         OCR-Output =~ /OCRTEXT: more than alpha1 chars found/
+describe       SPAMPIC_ALPHA_1         Image contains many alphanumeric chars
+score          SPAMPIC_ALPHA_1         0.500
+
+header         SPAMPIC_ALPHA_2         OCR-Output =~ /OCRTEXT: more than alpha2 chars found/
+describe       SPAMPIC_ALPHA_2         Image contains many alphanumeric chars
+score          SPAMPIC_ALPHA_2         1.000
+
+header         SPAMPIC_ALPHA_3         OCR-Output =~ /OCRTEXT: more than alpha3 chars found/
+describe       SPAMPIC_ALPHA_3         Image contains many alphanumeric chars
+score          SPAMPIC_ALPHA_3         1.500
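
For illustration only, here is a rough Python sketch of how the counting side
that feeds those rules might look. The threshold values and the function names
are made up for this example (the message does not say what alpha1, alpha2 and
alpha3 stand for); only the "OCRTEXT: more than ... chars found" wording is
taken from the rules above.

```python
# Hypothetical thresholds: the real values behind alpha1..alpha3 are not
# given in the message, so these numbers are placeholders.
THRESHOLDS = {"alpha1": 50, "alpha2": 150, "alpha3": 300}


def count_alnum(ocr_text):
    """Count the alphanumeric characters in the OCR output."""
    return sum(1 for ch in ocr_text if ch.isalnum())


def ocr_marker_lines(ocr_text):
    """Return one marker line for every threshold the count exceeds."""
    n = count_alnum(ocr_text)
    return [
        "OCRTEXT: more than %s chars found" % name
        for name, limit in THRESHOLDS.items()
        if n > limit
    ]
```

Each returned line would then be written into the OCR-Output header so that
the matching SPAMPIC_ALPHA_n rule can fire.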

You could now run a statistical analysis to check whether the character
frequencies match those typical of a specific language, to see if it's really text.
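
A minimal sketch of such a check in Python, assuming English as the target
language. The frequency table is the commonly quoted approximate distribution
of English letters (in percent), and cosine similarity is just one possible
measure that I picked for the example; the function name is hypothetical.

```python
import math
from collections import Counter

# Approximate relative frequencies of letters in English text (percent).
ENGLISH_FREQ = {
    "e": 12.7, "t": 9.1, "a": 8.2, "o": 7.5, "i": 7.0, "n": 6.7,
    "s": 6.3, "h": 6.1, "r": 6.0, "d": 4.3, "l": 4.0, "c": 2.8,
    "u": 2.8, "m": 2.4, "w": 2.4, "f": 2.2, "g": 2.0, "y": 2.0,
    "p": 1.9, "b": 1.5, "v": 1.0, "k": 0.8, "j": 0.15, "x": 0.15,
    "q": 0.10, "z": 0.07,
}


def english_similarity(text):
    """Cosine similarity between the letter counts of `text` and the
    English reference table. Returns 0.0 if there are no a-z letters."""
    counts = Counter(ch for ch in text.lower() if ch in ENGLISH_FREQ)
    if not counts:
        return 0.0
    dot = sum(counts[ch] * ENGLISH_FREQ[ch] for ch in counts)
    norm_obs = math.sqrt(sum(c * c for c in counts.values()))
    norm_ref = math.sqrt(sum(f * f for f in ENGLISH_FREQ.values()))
    return dot / (norm_obs * norm_ref)
```

A value close to 1 suggests the OCR output has an English-like letter
distribution, while OCR noise tends to score much lower. Very short strings
give unreliable results, and other languages would need their own tables.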

Martin
