On Friday, July 4, 2014 2:48:41 AM UTC+5:30, Nick White wrote:
>
> On Wed, Jul 02, 2014 at 10:26:16PM -0700, Meenal Goyal wrote: 
> > The post about "question about training tesseract" only suggests some 
> > pre-processing steps which include binarisation and I have already 
> > tried them. 
> > I wanted to know if anything can be done to improve output at later 
> > stage, something like adding the words to the dictionary used by 
> > tesseract. 
>
> OK, I see. The reason I recommended binarisation is that I suspect 
> you'll have a lot more luck with that than anything else, for your 
> problems. 
>
I have tried binarisation, and it certainly helped improve the output.
 

> > I have tried listing words in eng.user-words but it wasn't much useful. 
> > Can you suggest anything of this sort which can train tesseract over the 
> > time and help improve the output. 
>
> If you're sure that all the words you will encounter will be in the 
> dictionary this should help somewhat: 
>
> https://code.google.com/p/tesseract-ocr/wiki/FAQ#How_to_increase_the_trust_in/strength_of_the_dictionary?
>
The words won't always be in the dictionary, so I tried adding them to the 
eng.user-words file, but I am confused about how much weight this file is 
given relative to the built-in dictionaries.
I had also read that FAQ entry about strengthening the dictionary earlier and 
tried modifying some variables in the configuration file. But then it starts 
recognizing wrong words; maybe it is a case of over-correcting.
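
For anyone following along: as far as I know, the two variables that FAQ entry refers to are language_model_penalty_non_freq_dict_word and language_model_penalty_non_dict_word. Since pushing them too far seems to cause the over-correction described above, one option is to try several values in small steps instead of one large jump. Below is a minimal sketch that only generates the config-file text for a grid of values (one "name value" pair per line). The helper names and the specific numbers are my own assumptions for illustration, not recommended settings, and the script does not run Tesseract itself.

```python
# Sketch only: the penalty values below are illustrative, not recommendations.
from itertools import product

# Variable names as I understand them from the Tesseract FAQ entry on
# dictionary strength (treat as an assumption if your version differs).
NON_FREQ = "language_model_penalty_non_freq_dict_word"
NON_DICT = "language_model_penalty_non_dict_word"


def render_config(variables):
    """Return config-file text: one 'name value' pair per line."""
    return "".join(f"{name} {value}\n" for name, value in variables.items())


def penalty_grid(non_freq_values, non_dict_values):
    """Yield (label, config text) for every combination of the two penalties."""
    for nf, nd in product(non_freq_values, non_dict_values):
        label = f"nf{nf}_nd{nd}"  # hypothetical naming scheme for each trial
        yield label, render_config({NON_FREQ: nf, NON_DICT: nd})


if __name__ == "__main__":
    for label, text in penalty_grid([0.1, 0.3], [0.15, 0.5]):
        print(f"--- {label}")
        print(text, end="")
```

Each generated text could be saved as its own config file and passed to a separate OCR run on the same sample pages, so the outputs can be compared to see at which setting correct out-of-dictionary words start being replaced.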
 

> Nick 
>
