I had a similar, less expensive thought: check the global color table
(GCT) in the header of every GIF image in a particular message.  I
tested a couple of spam samples, and the GCTs were identical in every
one of my admittedly few test cases.
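
For what it's worth, here is a rough Python sketch of that check, not an
SA plugin.  It assumes the GIF attachments have already been extracted
from the message as raw bytes; the function names are made up for
illustration.  It only reads the 13-byte header (signature plus logical
screen descriptor) and the table that follows it, so nothing needs to be
decompressed.

```python
def global_color_table(data: bytes):
    """Return the raw global color table of a GIF, or None if there is none."""
    # 6-byte signature + 7-byte logical screen descriptor = 13 bytes.
    if len(data) < 13 or data[:6] not in (b"GIF87a", b"GIF89a"):
        return None
    packed = data[10]              # packed fields byte of the descriptor
    if not packed & 0x80:          # bit 7: global color table flag
        return None
    # Bits 0-2 hold N; the table has 2**(N+1) entries of 3 bytes (RGB) each.
    size = 3 * (2 ** ((packed & 0x07) + 1))
    table = data[13:13 + size]
    return table if len(table) == size else None


def share_color_table(gifs):
    """True if at least two GIFs carry a GCT and all those GCTs are identical."""
    tables = [global_color_table(g) for g in gifs]
    tables = [t for t in tables if t is not None]
    return len(tables) >= 2 and len(set(tables)) == 1
```

Whether an identical GCT across several images is a strong enough signal
on its own is untested beyond my few samples; legitimate mail with
several GIFs from one tool might match too.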


Logan Shaw wrote:
> Looks like people have started to get a grip on the image
> spams that are so popular lately, but here's an additional
> idea I thought I'd toss out.  (I'm not familiar enough with
> SA to easily figure out how to make a plugin.)
> 
> Basically, these spams all have a bunch of images which are
> tiles of a larger image.  The tiling thing is, presumably, done
> to avoid checksumming.  Now, here's the thing with tiling: the
> left edge of one image will be extremely similar to the right
> edge of the one next to it.  And same with top and bottom edges.
> 
> So it seems like a useful rule could decompress each of the
> images, take the left and right columns and top and bottom rows
> of each image, and compare those columns and rows to columns
> and rows of other images of similar dimensions.  If they correlate
> closely (determined easily enough by subtracting one set of
> pixels from the next), that's a strong indicator they were
> expected to abut, which in turn is a strong indicator of spam.
> 
> Of course, this requires decoding the entire image, but the
> analysis after that point should be fairly cheap (compared to
> OCR, for example).
> 
>   - Logan
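
For comparison, the edge test Logan describes could be sketched roughly
like this in Python.  It assumes each image has already been decoded
into a list of rows of grayscale pixel values (0-255); the function
names and the threshold of 8.0 are arbitrary placeholders, not tuned
values.

```python
def edge_difference(a, b):
    """Mean absolute difference between two equal-length pixel sequences."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def likely_abut(left, right, threshold=8.0):
    """True if the right column of `left` closely matches the left column of `right`."""
    # Horizontal neighbours must be non-empty and have the same height.
    if not left or not right or len(left) != len(right):
        return False
    right_col = [row[-1] for row in left]
    left_col = [row[0] for row in right]
    return edge_difference(right_col, left_col) <= threshold


def likely_stacked(top, bottom, threshold=8.0):
    """True if the bottom row of `top` closely matches the top row of `bottom`."""
    # Vertical neighbours must be non-empty and have the same width.
    if not top or not bottom or len(top[-1]) != len(bottom[0]):
        return False
    return edge_difference(top[-1], bottom[0]) <= threshold
```

A rule would run these over every pair of images with matching height or
width and count how many pairs pass; where to set the cutoff would have
to come from testing against real mail.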
