    Kenny> Could we extract a list of text tokens from each frame
    Kenny> separately, and then choose the token list that has the most
    Kenny> tokens in it?

In theory, yes, though that would mean running ocrad on every frame, each
of which might be only part of the full image (that could get expensive),
and it would require some code restructuring.
At the moment, the images come in one of three forms:

    * a single non-blinking image

    * a set of images, non-blinking, which, when assembled, make a single
      larger image

    * a single blinking image

Right now, I assume there might be multiple parts to the image, so I convert
each part from its source format to PIL's internal format, concatenate the
parts, then run ocrad on the assembled image.
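
As a rough illustration of that convert/concatenate/OCR flow, here is a
minimal sketch. It is not the actual SpamBayes code: the function names
stack_layout and ocr_assembled are made up for this example, the parts are
assumed to stack top to bottom (the real assembly order may differ), and it
assumes Pillow is installed and an ocrad binary is on the PATH.

```python
import subprocess
import tempfile


def stack_layout(sizes):
    """Return (canvas_size, offsets) for stacking parts top to bottom.

    sizes is a list of (width, height) tuples, one per image part.
    Vertical stacking is an assumption made for this sketch.
    """
    width = max(w for w, _ in sizes)
    offsets = []
    y = 0
    for _, h in sizes:
        offsets.append((0, y))
        y += h
    return (width, y), offsets


def ocr_assembled(images):
    """Paste PIL images onto one canvas and return ocrad's text output."""
    # Imported here so stack_layout can be used without Pillow installed.
    from PIL import Image

    canvas_size, offsets = stack_layout([im.size for im in images])
    canvas = Image.new("RGB", canvas_size, "white")
    for im, offset in zip(images, offsets):
        canvas.paste(im.convert("RGB"), offset)

    # ocrad reads PNM files, so write the canvas out as PPM first.
    # Reopening a NamedTemporaryFile by name works on Unix-like systems.
    with tempfile.NamedTemporaryFile(suffix=".ppm") as tmp:
        canvas.save(tmp.name, "PPM")
        result = subprocess.run(
            ["ocrad", tmp.name],
            capture_output=True,
            text=True,
            check=True,
        )
    return result.stdout
```

The point of this design is that ocrad runs once per message image, however
many parts it arrived in.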

I imagine it's not going to be long before the spammers start splitting up
their blinking images into parts.
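
For reference, the per-frame selection Kenny suggests could be sketched
roughly as below. The names tokens_from_text and best_token_list are
hypothetical, the whitespace tokenizer is a stand-in for whatever tokenizing
is really done, and the ocr argument is any callable that turns one frame
into text (for example, a wrapper around ocrad). Note that it calls ocr once
per frame, which is where the extra cost comes from, and it does nothing for
a blinking image that has also been split into parts.

```python
def tokens_from_text(text):
    """Split OCR output into lowercase whitespace-separated tokens."""
    return text.lower().split()


def best_token_list(frames, ocr):
    """Run ocr on each frame and keep the longest token list.

    frames: any iterable of frame objects.
    ocr:    callable taking one frame and returning its text (assumed).
    """
    best = []
    for frame in frames:
        tokens = tokens_from_text(ocr(frame))
        # Keep the first list seen on ties; only a strictly longer one wins.
        if len(tokens) > len(best):
            best = tokens
    return best
```

With Pillow, the frames of an animated GIF could be supplied via
ImageSequence.Iterator(image).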

Skip
_______________________________________________
spambayes-dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/spambayes-dev
