When you say that the binarized image looked perfect but the accuracy was 
poor, my best guess is that the font used on those tickets is the 
culprit. You could try to create training data specifically for this 
font.
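
To tell whether the font or the preprocessing is hurting you, it helps to 
measure accuracy the same way for the raw and the binarized images. Below is a 
minimal sketch, assuming you have a hand-typed ground-truth string for each 
ticket. The function names `edit_distance` and `char_accuracy` are 
hypothetical helpers, not part of tesseract, and character-level accuracy is 
only one possible metric.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # delete a character
                           cur[j - 1] + 1,             # insert a character
                           prev[j - 1] + (ca != cb)))  # substitute (free if equal)
        prev = cur
    return prev[-1]


def char_accuracy(truth, ocr_text):
    """Fraction of ground-truth characters recovered, clamped to [0, 1]."""
    if not truth:
        return 1.0 if not ocr_text else 0.0
    return max(0.0, 1.0 - edit_distance(truth, ocr_text) / len(truth))
```

If both variants score low on the same characters, that would point toward 
the font rather than the binarization.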

On Wednesday, January 1, 2014 at 13:22:31 UTC+1, Muhammad Muaz wrote:
>
> Hello, I am trying to recognize characters from images taken with a 
> *mobile camera* at *72dpi* resolution, within 2-2.5 secs including all 
> processing. The images can be found at the following link:
>
>    Tickets for OCR: <https://picasaweb.google.com/107072433218124342258/TicketsForOCR?authuser=0&feat=directlink>
>
> The tickets have
>
>    - slightly poor lighting
>    - non-text areas
>    - low resolution
>    
> I tried feeding the image directly to the tesseract API, and it gives me 70% 
> good results in 1 sec on average. But I want to increase the accuracy while 
> keeping the time factor in mind.
> So far I have tried:
>
>    1. Detecting the edges of the image
>    2. Blob analysis
>    3. Binarizing the ticket using adaptive thresholding
>
> Then I tried feeding those binarized images to tesseract, but the accuracy 
> dropped below 50-60%, even though the binarized images look perfect. I 
> also looked into a few research papers:
>
>    - http://www.vincent-net.com/luc/papers/10wiley_morpho_DIAapps.pdf
>    - http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.193.6347&rep=rep1&type=pdf
>    - http://iit.demokritos.gr/~bgat/PatRec2006.pdf
>    - http://psych.stanford.edu/~jlm/pdfs/Sternberg67.pdf
>
> but no luck. Kindly help me with this, and sorry if my question is too basic. 
> Also, I would rather not use a command-line solution; I would prefer 
> *Leptonica* and *OpenCV*.
>
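
For reference, the adaptive thresholding step mentioned in the quoted message 
can be sketched as follows. This is a plain-Python mean adaptive threshold 
built on an integral image, not the poster's actual code. The parameter names 
`block` and `c` and their default values are illustrative assumptions. In 
practice OpenCV offers the same idea as `cv2.adaptiveThreshold`, which will be 
far faster.

```python
def adaptive_threshold(gray, block=15, c=10):
    """Mean adaptive threshold on a 2D list of 0-255 grey values.

    A pixel becomes white (255) when it is brighter than the mean of its
    block x block neighbourhood minus c, otherwise black (0).
    """
    h, w = len(gray), len(gray[0])
    # Integral image with a zero border:
    # integral[y][x] holds the sum of all pixels above and left of (y, x).
    integral = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += gray[y][x]
            integral[y + 1][x + 1] = integral[y][x + 1] + row_sum

    r = block // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        y0, y1 = max(0, y - r), min(h, y + r + 1)
        for x in range(w):
            x0, x1 = max(0, x - r), min(w, x + r + 1)
            # Sum over the (clipped) window in constant time.
            total = (integral[y1][x1] - integral[y0][x1]
                     - integral[y1][x0] + integral[y0][x0])
            area = (y1 - y0) * (x1 - x0)
            # pixel > mean - c, multiplied through by area to stay in integers.
            out[y][x] = 255 if gray[y][x] * area > total - c * area else 0
    return out


# Tiny example: a bright 9x9 background with one dark "ink" pixel in the centre.
image = [[200] * 9 for _ in range(9)]
image[4][4] = 50
binary = adaptive_threshold(image, block=5, c=10)
```

The window size and offset usually need tuning per image set, so an image that 
looks clean to the eye may still differ from what the OCR engine expects.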

-- 
You received this message because you are subscribed to the Google
Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/tesseract-ocr?hl=en