Hi, I've trained Tesseract for a new language, and I'm impressed with the quality :)
While preparing the originals I badly wanted an easy chopping tool to select only the interesting region(s) of a scanned page. I couldn't find one that was really easy to use, and it takes "ages" to start GIMP etc., so I took the source code of mirage (Python) and rewrote it a bit. This is my first day looking at Python code, so please bear with my bad coding.

The result: you can run "mirage infile.jpg" and the crop dialog pops up immediately. Selecting a region and pressing the "Crop to new" button copies the selected region to a new image. Then you can select another region.

Now, are there any other good tools for this kind of pre-OCR image conditioning?

Anyway, the source is available at http://arno.homelinux.org/files/tesseract (not the world's most stable server).
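
For anyone who would rather script the same "select a region, copy it to a new image" step, here is a minimal sketch of the idea. It is not the code from the modified mirage. It assumes the Pillow library is installed, and the function names and the `_cropN` output naming are hypothetical, made up for this example.

```python
import os


def selection_to_box(x0, y0, x1, y1, width, height):
    """Turn two drag corners into a (left, upper, right, lower) box inside the image."""
    left, right = sorted((x0, x1))
    upper, lower = sorted((y0, y1))
    # Clamp to the image so a drag that ends outside the page still gives a valid box.
    left, right = max(0, left), min(width, right)
    upper, lower = max(0, upper), min(height, lower)
    if left >= right or upper >= lower:
        raise ValueError("empty selection")
    return (left, upper, right, lower)


def crop_name(path, index):
    """Hypothetical naming scheme: page.jpg -> page_crop1.jpg, page_crop2.jpg, ..."""
    root, ext = os.path.splitext(path)
    return f"{root}_crop{index}{ext}"


def crop_to_new(path, selections):
    """Save each selected region of the image at `path` as a new file."""
    # Pillow is imported here so the two helpers above work without it.
    from PIL import Image

    written = []
    with Image.open(path) as page:
        for index, corners in enumerate(selections, start=1):
            box = selection_to_box(*corners, page.width, page.height)
            out = crop_name(path, index)
            page.crop(box).save(out)
            written.append(out)
    return written
```

Calling `crop_to_new("infile.jpg", [(40, 60, 900, 700), (40, 720, 900, 1300)])` would write one new file per region, leaving the original scan untouched. The selection is given as two corners in any order, which matches how a rubber-band selection is usually reported by a GUI.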

