Hi, since I found Tesseract to be probably the best free OCR engine available, I decided to write "gui-tesseract", a small graphical frontend in Python.
Planned features:

1) Batch processing of all *.jpg and *.jpeg images in a selected directory.
2) Optional cropping, rotation, normalization (etc.?) of the images using ImageMagick. (The crop region can be chosen visually on the image preview, since gui-tesseract expects the images to have been scanned or photographed so that the text regions are in the same position.)
3) Simple interface: with well-prepared images, it takes only one click to process them.
4) Automatic numbering of output pages (all/even/odd).
5) A tool to merge multiple *.txt pages into one (yes, using cat).

I expect the code not to exceed 20 kB.

Would it be possible to distribute this script with Tesseract? It could provide an effective OCR function for many Linux users (not only newbies).
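To make the planned pipeline concrete, here is a rough sketch of how the batch step might look. It is not the gui-tesseract code, only an illustration under assumptions: all function names are hypothetical, ImageMagick is assumed to be reachable as `convert` (newer installs may use `magick` instead), Tesseract is assumed to be called as `tesseract <image> <output base>`, the prepared images are written as TIFF, crop is applied before rotate, and the merge is done in Python rather than with cat.

```python
import subprocess
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg"}


def find_images(directory):
    """Return the *.jpg / *.jpeg files in a directory, sorted by name."""
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def page_numbers(count, mode="all"):
    """Page numbers for `count` images: all (1,2,3..), even (2,4,6..) or odd (1,3,5..)."""
    if mode == "all":
        return list(range(1, count + 1))
    if mode == "even":
        return list(range(2, 2 * count + 1, 2))
    if mode == "odd":
        return list(range(1, 2 * count, 2))
    raise ValueError("mode must be 'all', 'even' or 'odd'")


def convert_command(src, dst, crop=None, rotate=None, normalize=False):
    """Build an ImageMagick command line; crop is (width, height, x, y) in pixels."""
    cmd = ["convert", str(src)]
    if crop is not None:
        width, height, x, y = crop
        # +repage resets the canvas offset left over after cropping.
        cmd += ["-crop", f"{width}x{height}+{x}+{y}", "+repage"]
    if rotate is not None:
        cmd += ["-rotate", str(rotate)]
    if normalize:
        cmd.append("-normalize")
    cmd.append(str(dst))
    return cmd


def tesseract_command(image, output_base):
    """Build a Tesseract command line; Tesseract appends .txt to output_base."""
    return ["tesseract", str(image), str(output_base)]


def merge_pages(txt_files, merged):
    """Concatenate the page files into one text file (what cat would do)."""
    with open(merged, "w", encoding="utf-8") as out:
        for name in txt_files:
            out.write(Path(name).read_text(encoding="utf-8"))


def process_directory(directory, out_dir, mode="all", **prep):
    """Prepare and OCR every image in a directory, then merge the pages."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    images = find_images(directory)
    pages = []
    for image, number in zip(images, page_numbers(len(images), mode)):
        prepared = out_dir / f"page_{number:04d}.tif"
        base = out_dir / f"page_{number:04d}"
        subprocess.run(convert_command(image, prepared, **prep), check=True)
        subprocess.run(tesseract_command(prepared, base), check=True)
        pages.append(base.with_suffix(".txt"))
    merge_pages(pages, out_dir / "merged.txt")
    return pages
```

A call such as `process_directory("scans", "out", mode="odd", crop=(1200, 1800, 100, 150), normalize=True)` would correspond to the one-click case: the same crop region is applied to every image, and the output pages are numbered 1, 3, 5, and so on. The directory names and crop values here are made up for the example.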

