Hi, since I found Tesseract to be probably the best free OCR engine
available, I decided to write "gui-tesseract" - a small graphical
frontend in Python.

Planned features:
1) Batch processing of all *.jpg and *.jpeg images in a selected
directory.
2) Optional cropping, rotating, normalizing (etc.?) of the images using
ImageMagick. (The crop region can be chosen visually on the image
preview, since gui-tesseract expects that the images were scanned or
photographed so that the text regions are in the same position.)
3) Simple interface - in case of well-prepared images, it only takes
one click to process them.
4) Automatic numbering of output pages (all/even/odd), plus a tool to
merge multiple *.txt pages into one (yes, using cat).
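
To make the plan more concrete, here is a rough sketch of how the batch
part could look. It is only an illustration under a few assumptions, not
the actual gui-tesseract code: all function names and the page_NNNN
naming are made up, "all/even/odd" is read as numbering the pages
1,2,3 / 2,4,6 / 1,3,5, ImageMagick is assumed to be callable as
"convert", and the installed tesseract is assumed to accept JPEG input
in the form "tesseract image outputbase".

```python
import subprocess
from pathlib import Path


def find_images(directory):
    """Return the *.jpg and *.jpeg files in a directory, sorted by name."""
    exts = {".jpg", ".jpeg"}
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() in exts
    )


def number_pages(images, mode="all"):
    """Pair each image with a page number (assumed meaning of all/even/odd)."""
    start, step = {"all": (1, 1), "odd": (1, 2), "even": (2, 2)}[mode]
    return [(start + i * step, image) for i, image in enumerate(images)]


def build_commands(image, out_base, crop=None):
    """Build the command lines for one page without running them.

    crop is an optional (width, height, x, y) tuple, the same for every page.
    """
    commands = []
    source = str(image)
    if crop is not None:
        width, height, x, y = crop
        cropped = f"{out_base}_crop.jpg"  # hypothetical temporary file name
        commands.append([
            "convert", source,
            "-crop", f"{width}x{height}+{x}+{y}", "+repage",
            cropped,
        ])
        source = cropped
    # tesseract writes its result to <out_base>.txt
    commands.append(["tesseract", source, str(out_base)])
    return commands


def merge_pages(txt_files, merged_path):
    """Concatenate the page files in the given order, like cat would."""
    with open(merged_path, "w", encoding="utf-8") as out:
        for txt in txt_files:
            out.write(Path(txt).read_text(encoding="utf-8"))


def process_directory(directory, mode="all", crop=None):
    """Run the whole batch: crop (optional), OCR, then merge the pages."""
    directory = Path(directory)
    outputs = []
    for number, image in number_pages(find_images(directory), mode):
        out_base = directory / f"page_{number:04d}"
        for argv in build_commands(image, out_base, crop):
            subprocess.run(argv, check=True)
        outputs.append(directory / f"page_{number:04d}.txt")
    merge_pages(outputs, directory / "merged.txt")
    return outputs
```

With well-prepared images, the "one click" would then boil down to a
single call such as process_directory("scans", mode="odd"), where
"scans" is just a placeholder directory name.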

I expect the code not to exceed 20 kB. Would it be possible to
distribute this script with Tesseract? It could provide an effective
OCR tool for many Linux users (not only newbies).