Sorry for bumping the post like this, but I really need some help ASAP!
Any guesses? At least something about the parameters?

Thanks a lot!

- Romeo

On Friday, February 15, 2013 at 10:07:40 AM UTC-8, Romeo Jihara wrote:
>
> Hi all,
>
> I am trying to detect text that is overlaid on top of images. A common 
> example is memes like the ones here: http://www.quickmeme.com/memes/
> The goal is to produce a high quality bounding box prediction and, if 
> possible, generate OCR. Please note that I'm much more interested in the 
> former!
> I am trying to use Tesseract for that.
>
> What makes the problem challenging is that the background can be anything. 
> In addition the text can have a stroke and a fill of arbitrary color.
> My questions are:
> 1) Tesseract has tons of different parameters. What is a set of important 
> parameters to tune for this case and what are good values for them?
> 2) How do I preprocess the image? I was a bit surprised to find out that 
> converting the image to grayscale before passing it to Tesseract results in 
> different (and generally better) accuracy. Why? Also, inverting the image 
> works better for some text. What is the set of important transformations 
> to play with?
> 3) I noticed that Tesseract is often able to detect sequences of words but 
> not combine them together. What parameter affects the probability of 
> combining adjacent words together?
> 4) Is it worth doing morphological transformations, such as trying to get 
> rid of the text stroke, or does Tesseract handle text strokes?
> 5) When I call getRegions does it also perform OCR to give me better 
> confidence predictions of the text boxes?
> 6) Does Tesseract use the OCR output in determining the confidence of a 
> region being true text? Looking at the results I get, it seems like it is 
> possible to improve the text confidence by building an n-gram model. Also, 
> some characters (like punctuation marks) are highly indicative of false 
> positive text regions. Is there such built-in functionality or should I 
> build one?
> Similarly the size and relative locations of text can also be used to 
> refine the confidence. It appears from my tests that often small and 
> disjoint text areas (and ones that are not horizontally aligned with 
> others) are false positives. Again, is there such a built-in heuristic or 
> should I build one? 
>
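
To make the size and alignment heuristic above concrete, here is a minimal
sketch of the kind of post-filter I have in mind. It does not call Tesseract
at all: the Box type, the filter_boxes and shares_row helpers, and the
min_area and min_overlap values are all hypothetical names and numbers chosen
for illustration. The only value taken from my tests is the confidence
cut-off of 75.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical container for one detected region (pixels, top-left origin)."""
    x: int
    y: int
    w: int
    h: int
    conf: float  # region confidence on a 0-100 scale


def shares_row(a, b, min_overlap=0.5):
    """True if the vertical extents of a and b overlap by at least
    min_overlap of the shorter box's height (assumed threshold)."""
    top = max(a.y, b.y)
    bottom = min(a.y + a.h, b.y + b.h)
    overlap = max(0, bottom - top)
    return overlap >= min_overlap * min(a.h, b.h)


def filter_boxes(boxes, min_conf=75, min_area=400):
    """Drop low-confidence boxes, and drop small boxes that sit alone
    on their row. min_area is an assumed value, not a tuned one."""
    kept = []
    for i, box in enumerate(boxes):
        if box.conf <= min_conf:
            continue
        is_small = box.w * box.h < min_area
        # Any other detected box counts as a row-mate, whatever its confidence.
        has_row_mate = any(
            shares_row(box, other)
            for j, other in enumerate(boxes)
            if j != i
        )
        if is_small and not has_row_mate:
            continue
        kept.append(box)
    return kept


boxes = [
    Box(10, 10, 100, 30, 90),   # large, on a row with the next box
    Box(120, 12, 80, 30, 88),   # large, same row
    Box(300, 200, 10, 10, 80),  # small and isolated
    Box(50, 100, 200, 40, 60),  # confidence at or below the cut-off
]
kept = filter_boxes(boxes)
```

With these sample boxes only the first two survive. Whether a filter like
this duplicates something Tesseract already does internally is exactly what I
am asking.
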
> I am attaching a couple of examples that show the text localization 
> results with different preprocessing applied to the image. The number 
> inside each box is the confidence for that region; blue boxes mean 
> confidence > 75 and red boxes mean confidence <= 75. I'm also sending the 
> parameters used in all these detections.
>
> Thanks for your time and for building such an awesome free OCR engine!
>
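
For reference, the two preprocessing steps from question 2 (grayscale
conversion and inversion) can be sketched in plain Python as below. This is
only an illustration on nested lists of pixel values, assuming 8-bit RGB
input and the common BT.601 luma weights. The function names are made up for
this example, and the grayscale formula is not necessarily the one my image
library or Tesseract applies.

```python
def to_grayscale(rgb_rows):
    """Convert rows of (r, g, b) tuples to rows of 0-255 gray values
    using integer BT.601 luma weights (an assumed choice of weights)."""
    return [
        [(299 * r + 587 * g + 114 * b + 500) // 1000 for (r, g, b) in row]
        for row in rgb_rows
    ]


def invert(gray_rows):
    """Invert an 8-bit grayscale image so light text on a dark
    background becomes dark text on a light background."""
    return [[255 - v for v in row] for row in gray_rows]


# A tiny 1x3 "image": white, black, and pure red pixels.
image = [[(255, 255, 255), (0, 0, 0), (255, 0, 0)]]
gray = to_grayscale(image)
inverted = invert(gray)
```

Each variant (original, grayscale, inverted grayscale) would then be passed to
Tesseract separately and the resulting boxes compared.
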

-- 
You received this message because you are subscribed to the Google
Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/tesseract-ocr?hl=en
