Thanks, Arno! That looks like a very well designed script.

--Sven

On Thu, Jan 5, 2012 at 10:03 AM, Arno Teigseth <[email protected]> wrote:
> On 04/01/12 20:14, Robert T wrote:
>> I'm developing an open source Android app that uses Tesseract 3.01
>> for OCR by passing Tesseract images captured by a phone or tablet
>> camera.
>>
>> The OCR is working adequately for small segments of text--like a
>> few words--but uneven illumination seems to lower the recognition
>> quality with larger text input. Because the input comes from the
>> device camera, there's a lot of shadows and glare.
>
> Hi I had the same problem - took pictures of all the pages in a book,
> and had trouble with this.
>
> Tried out
>
> http://www.fmwconcepts.com/imagemagick/textcleaner/index.php
>
> and that worked well.
>
> just ran
> "./textcleaner badimg.jpg goodimg.jpg"
>
> and that was it :D
>
> for auto-recognition, deskewing etc I ran unpaper
>
> http://unpaper.berlios.de/
>
> This made it possible to OCR 1500 pages basically without much trouble :D
>
> my 2 cents :)
>
> best
> Arno
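
For anyone wanting to batch the clean-up step quoted above, here is a minimal Python sketch, not a tested pipeline. It assumes the textcleaner script sits in the current directory (as in Arno's command) and that a `tesseract` binary is on the PATH. The function names, the `_clean` suffix, and the working directory layout are made up for illustration. The unpaper step is left out because the thread does not give its invocation.

```python
import subprocess
from pathlib import Path


def build_commands(src, work_dir, textcleaner="./textcleaner"):
    """Return the two command lines for one page: clean-up, then OCR.

    The names and paths here are illustrative assumptions, not part of
    either tool.
    """
    src = Path(src)
    work_dir = Path(work_dir)
    # Cleaned image goes next to the OCR output, e.g. out/badimg_clean.jpg
    cleaned = work_dir / (src.stem + "_clean" + src.suffix)
    # tesseract takes an output base name and appends .txt itself
    out_base = work_dir / src.stem
    return [
        [textcleaner, str(src), str(cleaned)],
        ["tesseract", str(cleaned), str(out_base)],
    ]


def ocr_page(src, work_dir):
    """Run both steps for one page, stopping at the first failure."""
    Path(work_dir).mkdir(parents=True, exist_ok=True)
    for cmd in build_commands(src, work_dir):
        subprocess.run(cmd, check=True)  # raises CalledProcessError on error
    return Path(work_dir) / (Path(src).stem + ".txt")
```

Calling `ocr_page("badimg.jpg", "out")` for each photographed page would run the same two commands one after the other. Keeping the command construction separate from the execution makes it easy to print the commands first and check them before touching 1500 pages.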

