Tesseract is an open source OCR program. It can already
produce searchable PDFs and will soon support streaming.
It would be fun to support something like this:

   scanimage --batch | tesseract - - pdf > searchable.pdf

To make this work nicely, scanimage would need to
print the name of each file to stdout after it is written.
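
A rough sketch of that behaviour, written as a shell function rather
than as a change to scanimage itself. Everything here is an assumption
for illustration: emit_pages is a hypothetical stand-in for the scanner,
the page<N>.pnm naming is made up, and the 1x1 PBM images are
placeholders for real scans. The point is only the ordering: finish
writing the file first, then print its name.

```shell
#!/bin/sh
# Hypothetical stand-in for a batch scanner (NOT part of scanimage).
# Usage: emit_pages OUTDIR COUNT
# Writes COUNT placeholder pages into OUTDIR and prints each file name
# on stdout only after that file is completely written.
emit_pages() {
    outdir=$1
    count=$2
    i=1
    while [ "$i" -le "$count" ]; do
        path="$outdir/page$i.pnm"
        # Write under a temporary name, then rename, so the final name
        # never refers to a partially written file.
        printf 'P1\n1 1\n0\n' > "$path.tmp"
        mv "$path.tmp" "$path"
        # Announce the file only now that it is complete.
        printf '%s\n' "$path"
        i=$((i + 1))
    done
}
```

A consumer on the other end of the pipe can then read one name per
line and open each file as soon as its name arrives, without having to
guess whether the scanner is still writing to it.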

Thoughts?

Jeff
-- 
sane-devel mailing list: [email protected]
http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/sane-devel
