Hi Shree, what do you mean by "script/Latin traineddata"? I am new to Tesseract and use version 4.0 via Docker. Most pages on the internet are about Tesseract 3.0.x.
I am unsure where to start. Would it be better to use 3.0.x?

Regards,
Thomas

On Thursday, May 24, 2018 at 13:41:30 UTC+2, shree wrote:
>
> Please try with script/Latin traineddata to see if you get better results.
>
> I have added your comment to issue at
> https://github.com/tesseract-ocr/langdata/pull/54
>
> On Thursday, May 24, 2018 at 5:05:55 PM UTC+5:30, Thomas Güttler wrote:
>>
>> I use tesseract 4.0 via docker (tesseractshadow/tesseract4re)
>>
>> Very often tesseract detects "StraBe" instead of "Straße".
>>
>> Yes, I use -l=deu
>>
>> The word "Straße" is very common in German. It means "street".
>>
>> Since "StraBe" makes no sense I would like to improve this.
>>
>> What do you suggest?

