Hi all,

I'm working on training Tesseract for Georgian script. I was wondering what 
type of input I should use to make the punc-dawg and number-dawg dictionaries. 
Right now, I have a word dictionary, which has eliminated some errors, but it 
has also caused Tesseract to ignore punctuation in some cases. I am hoping that 
providing a punc-dawg is the solution, but I haven't been able to find a good 
resource for this, either in the list archives or in the source files.

Can anyone tell me what type of file I should use to create the punc-dawg and 
number-dawg files?

Thanks!

Derek

