Hi there, I am working on an Android application where a user can scan the packaging of medicaments. With the help of Tesseract OCR, the name, the dose, and the type of medicament (whether it's pills, capsules, and so on) should be read.
So I implemented a feature that lets the user take a picture with the device camera. This picture is downscaled to a width of 800 pixels, converted to grayscale, and binarized with Sauvola binarization. Because the OCR results are not perfect (for example, it often fails to read a dose such as 10mg or 20ml), I thought it would be great if the user could specify the areas of the image that contain the important information. If the user marks an area as the dose, the OCR could read just this subimage.

The problem is this: if I use the complete picture, it reads the text fine (the name of the product, for example). But if I use the subimage containing the name, it can't read anything. The size of the text in the subimage is the same as in the original image. You can see it in these pictures:

<https://lh3.googleusercontent.com/-GUBE4-T5V14/UoCfMRhd3DI/AAAAAAAAAAM/C329ElkAMWI/s1600/Screenshot_2013-11-11-10-06-02.png>
Here you can see the original binarized image. At the top you see the OCR result. As you can see, Lasix was read as LasiX, which is more than okay!

<https://lh5.googleusercontent.com/-1ImgVm-iKM4/UoCfYYPzF9I/AAAAAAAAAAU/ZmQ64_xHVJk/s1600/Screenshot_2013-11-11-10-06-24.png>
But here you can see the Lasix text as a subimage of the other one, and nothing was read. Why?

By the way, one other question. I saw the setVariable method for tess, and I saw that TessBaseAPI.VAR_ACCURACYVSPEED exists. But I couldn't find out which values are accepted for this variable. Does anyone know?

Thank you all in advance.
Best regards

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [email protected]
For more options, visit this group at http://groups.google.com/group/tesseract-ocr?hl=en
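For reference, the subimage step described above can be sketched roughly as follows. This is only a plain-Java illustration, not the actual app code: it uses java.awt.image.BufferedImage in place of the Android Bitmap so that it runs anywhere, and the names SubImageCrop, clampToImage and crop are hypothetical. It assumes the user's selection is given in the coordinates of the already downscaled, binarized image.

```java
import java.awt.Rectangle;
import java.awt.image.BufferedImage;

// Hypothetical helper: cuts the user-marked area out of the binarized image.
final class SubImageCrop {

    /** Clamps the user's selection so it lies fully inside a width x height image. */
    static Rectangle clampToImage(Rectangle selection, int width, int height) {
        Rectangle clamped = selection.intersection(new Rectangle(0, 0, width, height));
        if (clamped.isEmpty()) {
            throw new IllegalArgumentException("Selection lies outside the image");
        }
        return clamped;
    }

    /** Copies the selected area into a new image of exactly that size. */
    static BufferedImage crop(BufferedImage source, Rectangle selection) {
        Rectangle r = clampToImage(selection, source.getWidth(), source.getHeight());
        BufferedImage sub = new BufferedImage(r.width, r.height, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < r.height; y++) {
            for (int x = 0; x < r.width; x++) {
                // Pixel (x, y) of the subimage is pixel (r.x + x, r.y + y) of the source.
                sub.setRGB(x, y, source.getRGB(r.x + x, r.y + y));
            }
        }
        return sub;
    }
}
```

On Android, the same copy would presumably be done with Bitmap.createBitmap(source, x, y, width, height), and the resulting bitmap is what gets passed to the OCR instead of the full picture.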

