Latin.traineddata <https://github.com/tesseract-ocr/tessdata_fast/blob/master/script/Latin.traineddata> can be found in the script folder of the tessdata_fast repository.
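In case it helps, here is a rough Python sketch of one way to try the script model. It is only an illustration under a few assumptions: the download URL is the raw form of the link above, the helper names (`model_path`, `fetch_latin_model`, `build_command`, `ocr`) and any directory or image names are made up for this example, and the `tesseract` 4.0 binary is assumed to be on the PATH. The file is placed in a `script` subfolder of the tessdata directory so that `-l script/Latin` can resolve it.

```python
import subprocess
import urllib.request
from pathlib import Path

# Assumption: raw-download form of the Latin.traineddata link in this message.
LATIN_URL = (
    "https://github.com/tesseract-ocr/tessdata_fast/"
    "raw/master/script/Latin.traineddata"
)


def model_path(tessdata_dir):
    """Location the model needs so that `-l script/Latin` can find it."""
    return Path(tessdata_dir) / "script" / "Latin.traineddata"


def fetch_latin_model(tessdata_dir):
    """Download Latin.traineddata into <tessdata_dir>/script/ if it is missing."""
    target = model_path(tessdata_dir)
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        urllib.request.urlretrieve(LATIN_URL, str(target))
    return target


def build_command(image, tessdata_dir, lang="script/Latin"):
    """Build the tesseract call; `stdout` as the output base prints the text."""
    return ["tesseract", str(image), "stdout",
            "--tessdata-dir", str(tessdata_dir), "-l", lang]


def ocr(image, tessdata_dir, lang="script/Latin"):
    """Run tesseract (must be installed) and return the recognized text."""
    result = subprocess.run(build_command(image, tessdata_dir, lang),
                            stdout=subprocess.PIPE, check=True)
    return result.stdout.decode("utf-8")
```

For example, `fetch_latin_model("tessdata")` followed by `print(ocr("scan.png", "tessdata"))` would run the script model on a hypothetical `scan.png`. Running `ocr` again with `lang="deu"` (after placing deu.traineddata in the same tessdata directory) gives a side-by-side check of whether "Straße" is recognized any better.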
https://github.com/tesseract-ocr/tessdata_fast

On Friday, May 25, 2018 at 5:02:29 AM UTC-5, Thomas Güttler wrote:
>
> Hi Shree,
>
> what do you mean by "script/Latin traineddata"? I am new to tesseract
> and use version 4.0 via docker.
> Most internet pages are about tesseract 3.0.x.
>
> I am unsure where to start.
>
> Maybe it is better to use 3.0.x?
>
> Regards,
> Thomas
>
> On Thursday, May 24, 2018 at 13:41:30 UTC+2, shree wrote:
>>
>> Please try with script/Latin traineddata to see if you get better results.
>>
>> I have added your comment to the issue at
>> https://github.com/tesseract-ocr/langdata/pull/54
>>
>> On Thursday, May 24, 2018 at 5:05:55 PM UTC+5:30, Thomas Güttler wrote:
>>>
>>> I use tesseract 4.0 via docker (tesseractshadow/tesseract4re)
>>>
>>> Very often tesseract detects "StraBe" instead of "Straße".
>>>
>>> Yes, I use -l=deu
>>>
>>> The word "Straße" is very common in German. It means "street".
>>>
>>> Since "StraBe" makes no sense, I would like to improve this.
>>>
>>> What do you suggest?