On Monday, 24 August 2015 22:18:13 UTC+2, [email protected] wrote:
>
> I'm using tesseract on forms and I need to call setRectangle to retrieve
> specific fields. Unfortunately the forms are sometimes translated, and
> sometimes a bit rotated or subject to other transformations. Is there a
> good primer I can read on how to calculate the translation, rotation, etc.?
> I'm assuming it would involve taking a recognizable object that occurs on
> all of the forms, such as a line, calculating its position, and then
> calling setRectangle with coordinates calculated from the position of this
> "landmark" object (the line).
>
You need the coordinates of the four corners of the form, which you can pass as parameters to an unperspective (perspective-correction) method of the image library of your choice. Here is an open-source example whose details you can dig into; scroll down to the OCR-like examples:

http://www.fmwconcepts.com/imagemagick/unperspective/index.php

Leptonica also has distortion examples:

http://www.leptonica.com/dewarping.html
http://www.leptonica.com/affine.html
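
For the simpler case described in the question (a form that is only shifted, slightly rotated, or uniformly scaled, with no perspective distortion), the landmark idea can be sketched in a few lines of plain Python. This is only an illustration under assumptions, not a method taken from tesseract, ImageMagick, or Leptonica: the function names are made up, and it assumes you have already located the two endpoints of the same landmark line on both the blank template and the scanned form. The resulting box is axis-aligned, so on a rotated scan it will include some margin around the field.

```python
import math


def similarity_from_landmarks(p0, p1, q0, q1):
    """Estimate rotation + uniform scale + translation from two point pairs.

    p0, p1: landmark endpoints (x, y) on the template form.
    q0, q1: the same endpoints (x, y) found on the scanned form.
    Returns complex numbers (a, b) such that scan = a * template + b.
    Hypothetical helper for illustration only.
    """
    p0, p1, q0, q1 = (complex(*pt) for pt in (p0, p1, q0, q1))
    if p0 == p1:
        raise ValueError("landmark points must be distinct")
    a = (q1 - q0) / (p1 - p0)  # encodes rotation angle and scale factor
    b = q0 - a * p0            # encodes the translation
    return a, b


def map_rectangle(rect, a, b):
    """Map a template rectangle (left, top, width, height) onto the scan.

    All four corners are transformed and the axis-aligned bounding box of
    the result is returned as integers (left, top, width, height).
    """
    left, top, width, height = rect
    corners = [complex(left + dx, top + dy)
               for dx in (0, width) for dy in (0, height)]
    mapped = [a * c + b for c in corners]
    xs = [m.real for m in mapped]
    ys = [m.imag for m in mapped]
    x0, y0 = math.floor(min(xs)), math.floor(min(ys))
    x1, y1 = math.ceil(max(xs)), math.ceil(max(ys))
    return x0, y0, x1 - x0, y1 - y0


# Example: the landmark line moved 10 px right and 20 px down on the scan.
a, b = similarity_from_landmarks((100, 50), (500, 50), (110, 70), (510, 70))
field_on_scan = map_rectangle((120, 80, 200, 30), a, b)
print(field_on_scan)
```

The four numbers returned are in the left, top, width, height order that setRectangle expects. Two points cannot capture perspective or non-uniform stretching; for those cases the four-corner approach above is the better fit.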

