On 04/01/12 20:14, Robert T wrote:
> I'm developing an open source Android app that uses Tesseract 3.01
> for OCR by passing Tesseract images captured by a phone or tablet
> camera.
>
> The OCR is working adequately for small segments of text--like a
> few words--but uneven illumination seems to lower the recognition
> quality with larger text input. Because the input comes from the
> device camera, there's a lot of shadows and glare.

Hi,

I had the same problem: I took pictures of all the pages in a book and had trouble with this. I tried out textcleaner (http://www.fmwconcepts.com/imagemagick/textcleaner/index.php) and that worked well. I just ran "./textcleaner badimg.jpg goodimg.jpg" and that was it :D

For auto-recognition, deskewing etc. I ran unpaper (http://unpaper.berlios.de/).

This made it possible to OCR 1500 pages basically without much trouble :D

My 2 cents :)

Best,
Arno
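
For a whole book, the three steps above (textcleaner, then unpaper, then Tesseract) could be chained per page. The following is only a rough sketch, not the exact script used for the 1500 pages. The helper names build_commands and process_page and the "-clean" / "-unpaper" file suffixes are made up for illustration. It assumes textcleaner sits in the current directory, that unpaper and tesseract are on the PATH, and that the installed unpaper can read the image format being passed along (some unpaper releases accept only PNM files, in which case a conversion step would be needed in between). No extra options are passed to any tool.

```python
from pathlib import Path
import subprocess


def build_commands(src, workdir, textcleaner="./textcleaner"):
    """Return the three command lines for one page, in the order they run.

    File naming is a hypothetical convention chosen for this sketch.
    """
    src = Path(src)
    workdir = Path(workdir)
    cleaned = workdir / f"{src.stem}-clean{src.suffix}"
    deskewed = workdir / f"{src.stem}-unpaper{src.suffix}"
    text_base = workdir / src.stem  # tesseract appends .txt to this base name
    return [
        # Even out illumination, as in: ./textcleaner badimg.jpg goodimg.jpg
        [textcleaner, str(src), str(cleaned)],
        # Deskew and tidy the page with unpaper's default settings
        ["unpaper", str(cleaned), str(deskewed)],
        # OCR the result: tesseract <image> <output base>
        ["tesseract", str(deskewed), str(text_base)],
    ]


def process_page(src, workdir):
    """Run the commands for one page, stopping at the first failure."""
    Path(workdir).mkdir(parents=True, exist_ok=True)
    for argv in build_commands(src, workdir):
        subprocess.run(argv, check=True)
```

Calling process_page("scans/page001.jpg", "out") for each photographed page would leave the recognized text in out/page001.txt, assuming all three tools succeed on that file.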

