Hi, I have written a small GUI for Tesseract in Python. It is
available at
http://ocropus.googlegroups.com/web/guitesseract.py?gsc=Z8kTHBYAAACg2FwGPC9XOsx4lRLfkDCCq9K8Kz9yQIr4tC0O5ImEZA.

Implemented features:
1) Batch processing with Tesseract of all *.jpg and *.jpeg images in
the selected directory.
2) Optional cropping, rotation and normalization of each image using
ImageMagick. (The crop region can be chosen visually on the image
preview, since the program assumes the images were scanned or
photographed so that the text regions are in the same position.)
3) Simple interface: with well-prepared images, it takes only a few
clicks to process them.
4) Automatic numbering of output pages (all/even/odd).
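
For anyone who wants to script the same steps without the GUI, here is
a minimal sketch of the pipeline described above. It is not the code
from guitesseract.py. The function names, the crop-then-rotate order
and the reading of "all/even/odd" as page-number parity are my
assumptions for illustration. The functions only build the command
lines for ImageMagick's convert and for tesseract, they do not run
them.

```python
import glob
import os


def find_images(directory):
    """Return the *.jpg and *.jpeg files in `directory`, sorted by name."""
    paths = []
    for pattern in ("*.jpg", "*.jpeg"):
        paths.extend(glob.glob(os.path.join(directory, pattern)))
    return sorted(paths)


def page_numbers(count, mode="all", start=1):
    """Number `count` pages consecutively, or with even/odd numbers only.

    Assumption: "even"/"odd" means every second page number, e.g. for a
    book whose left and right pages were scanned in separate passes.
    """
    if mode == "all":
        step = 1
    elif mode in ("even", "odd"):
        step = 2
        # Move `start` forward to the first number of the requested parity.
        if (start % 2 == 0) != (mode == "even"):
            start += 1
    else:
        raise ValueError("mode must be 'all', 'even' or 'odd'")
    return [start + i * step for i in range(count)]


def convert_command(src, dst, crop=None, rotate=None, normalize=False):
    """Build an ImageMagick `convert` call; crop is (width, height, x, y)."""
    cmd = ["convert", src]
    if crop is not None:
        cmd += ["-crop", "%dx%d+%d+%d" % crop]
    if rotate is not None:
        cmd += ["-rotate", str(rotate)]
    if normalize:
        cmd.append("-normalize")
    return cmd + [dst]


def tesseract_command(image, output_base, lang=None):
    """Build a `tesseract image outputbase [-l lang]` call."""
    cmd = ["tesseract", image, output_base]
    if lang:
        cmd += ["-l", lang]
    return cmd
```

Each returned list can be passed to subprocess.run(cmd, check=True),
once per image, pairing the sorted result of find_images() with the
numbers from page_numbers() to name the output files.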

Planned features:
1) Interaction with Ocropus, not only Tesseract.
2) Tree view of pages->regions->region_options, allowing the user to
visually select the region geometry or to check that the layout
analysis returned correct results.
3) A user-friendly tool to train Tesseract/Ocropus to recognize new
languages and fonts. (I haven't worked this idea out yet.)
4) Options for assembling a complete document as plain text, ODT,
HTML or LaTeX. Each format needs a different approach to outputting
images (ODT embeds them, HTML needs PNG, LaTeX needs EPS/PDF).
5) A simple wizard for beginners.
6) I18n, packaging, distribution with Ocropus.