Cranky old OCR engine that apparently sucks less than most of the
others out there. A friend of mine asked for it after seeing
something on Groklaw where they used it with image-based PDFs and
xpdf (or something like it) to snarf the text out of them.

Without the stuff in ${SUPDISTFILES}, the user has to train the
OCR engine, which is reasonably well documented on their wiki but
also looks laborious and annoying if you don't otherwise need
that level of accuracy, hence grabbing the SUPDISTFILES stuff.
Apache License 2.0.
--
jared
tesseract-ocr.tgz
Description: application/tar-gz
