Re: [CODE4LIB] web-based ocr

chris fitzpatrick Wed, 13 Mar 2013 07:14:13 -0700

I recommend looking at pdfbeads. It's written in Ruby and the documentation is mostly in Russian (http://rubyforge.org/docman/view.php/9752/10692/pdfbeads.ru.html), but it provides both a library and an easy-to-use executable to build PDFs out of hOCR files and images. You literally just point it at a directory with page images and hOCR files and it spits out a PDF. Very handy.
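
For anyone who wants to script that step, here is a rough Python sketch of the "point it at a directory" workflow. It rests on a few assumptions that are mine rather than anything confirmed in this thread: that each page's hOCR file sits next to the image with the same base name and an .html extension, that pdfbeads writes the PDF to standard output when given a list of image files, and that the executable is on your PATH. The function names find_pages and build_pdf are made up for illustration.

```python
import shutil
import subprocess
from pathlib import Path

# Image extensions treated as page scans (an assumption; adjust to taste).
IMAGE_EXTS = {".tif", ".tiff", ".png", ".jpg", ".jpeg"}


def find_pages(directory):
    """Return (sorted page images, names of pages with no same-named .html hOCR file)."""
    pages = sorted(
        p for p in Path(directory).iterdir() if p.suffix.lower() in IMAGE_EXTS
    )
    missing = [p.name for p in pages if not p.with_suffix(".html").exists()]
    return pages, missing


def build_pdf(directory, output_pdf):
    """Run pdfbeads over the page images in `directory`, capturing stdout as the PDF.

    Returns False without running anything if there are no pages or if
    pdfbeads is not installed.
    """
    pages, missing = find_pages(directory)
    if missing:
        print("no hOCR found for:", ", ".join(missing))
    if not pages or shutil.which("pdfbeads") is None:
        return False
    with open(output_pdf, "wb") as out:
        # Assumed invocation: pdfbeads <images...> > out.pdf, run inside the directory.
        subprocess.run(
            ["pdfbeads", *[p.name for p in pages]],
            cwd=directory,
            stdout=out,
            check=True,
        )
    return True
```

Calling build_pdf("scans", "book.pdf") would then be the whole job; pages that lack an hOCR file are only reported, not skipped, so check the printed list if part of the PDF turns out not to be searchable.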

Also, the DIY Book Scanner forum (diybookscanner.org) is a great resource if you're into these sorts of things...

Eric Lease Morgan wrote:

On Mar 13, 2013, at 8:07 AM, Ben Brumfield <[email protected]> wrote:
https://github.com/idigbio-aocr/RESTAPI/tree/master/doc
Interesting. Printed for future reference. Thank you.
BTW, I did finally get Image::OCR::Tesseract to make, make test, and make install correctly. I did not have the correct/proper libraries installed for Tesseract's supporting Leptonica library. Now I need to find a PDF library similar to libtiff and libpng.
--
Eric Morgan
