Hi Nick,

Thanks for your reply. It helped a lot! I now have a better idea of which
variables can be set, based on the list at
http://stackoverflow.com/questions/13087252/where-i-can-find-the-list-of-available-property-name-for-tesseract-setvariable

I would like to ask a quick follow-up question.
Is there a way to give TesseractEngine a hint about the expected text format? For
example, can I set a format like 00XXX00 XX-000, where 0 represents a digit
and X represents a letter?
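
To make the format concrete, here is a rough Python sketch of the pattern I
mean, written as a regular expression that could check OCR output after the
fact. The name matches_expected_format is hypothetical, and I am assuming the
letters are uppercase, as in my test set. This is only a post-processing
check on the recognized text, not a Tesseract feature.

```python
import re

# 00XXX00 XX-000, where 0 = digit and X = uppercase letter
# (uppercase is an assumption based on the test set described in this thread).
EXPECTED = re.compile(r"[0-9]{2}[A-Z]{3}[0-9]{2} [A-Z]{2}-[0-9]{3}")


def matches_expected_format(text):
    """Return True if the whole string, ignoring surrounding whitespace, fits the format."""
    return EXPECTED.fullmatch(text.strip()) is not None
```

A result that fails the check could then be flagged for manual review or
re-run with different settings.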


On Tuesday, February 23, 2016 at 1:00:22 AM UTC-8, Nick White wrote:
>
> Hi Devon, 
>
> On Mon, Feb 22, 2016 at 10:43:33AM -0800, Devon Yoo wrote: 
> > I have test set that only has "uppercase English alphabets" and "numbers". But
> > the provided eng.traineddata returns symbols and lower case alphabets
> > sometimes. Is there a way to modify the existing traineddata file so that it
> > only reads upper case alphabets and numbers?
>
> Use the 'tessedit_char_whitelist' config variable. You can create a 
> config file like the 'digits' one; 
>
> https://github.com/tesseract-ocr/tesseract/wiki/FAQ#how-do-i-recognize-only-digits
>  
> or just use 'tesseract -c tessedit_char_whitelist=ABC...123...' on 
> the command line. 
>
> Nick 
>
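
For reference, the whitelist value suggested above could be put together
roughly like this for my case. This is a minimal sketch assuming the
character set is exactly A-Z and 0-9; the helper name whitelist_args is
hypothetical and only builds the '-c' option pair, it does not run tesseract.

```python
import string
from typing import List

# Uppercase A-Z plus digits 0-9 (assumed from the test set described above).
WHITELIST = string.ascii_uppercase + string.digits


def whitelist_args(chars: str = WHITELIST) -> List[str]:
    """Build the '-c tessedit_char_whitelist=...' option pair for the command line."""
    return ["-c", "tessedit_char_whitelist=" + chars]
```

The returned list can be appended to the usual image and output arguments
when invoking the tesseract command.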
