Tesseract is an open-source OCR program. It can already produce searchable PDFs and will soon support streaming. It would be fun to support something like this:

    scanimage --batch | tesseract - - pdf > searchable.pdf

To make this work nicely, scanimage would need to print the name of each file to stdout after it is written.

Thoughts?

Jeff
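
For illustration only, here is a rough sketch of what the consuming end of such a pipe might look like. It assumes scanimage printed one file name per line once each file was complete, which is the proposal and not something scanimage is stated to do today. The helper name for_each_page is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: read file names from stdin, one per line, and run
# the given command on each name as soon as its line arrives.
for_each_page() {
    while IFS= read -r page; do
        # Skip blank lines and names that do not refer to a regular file.
        [ -n "$page" ] && [ -f "$page" ] || continue
        # Redirect stdin so the command cannot swallow the remaining names.
        "$@" "$page" < /dev/null
    done
}
```

With a small wrapper such as ocr_page() { tesseract "$1" "${1%.*}" pdf; }, running scanimage --batch | for_each_page ocr_page would give one PDF per scanned page. Getting a single searchable.pdf out of the pipe would still depend on the streaming support mentioned above.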
--
sane-devel mailing list: [email protected]
http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/sane-devel
Unsubscribe: Send mail with subject "unsubscribe your_password" to [email protected]
