https://github.com/tesseract-ocr/langdata_lstm has the files used.
On Fri, Oct 18, 2019 at 9:39 AM Shree Devi Kumar <[email protected]> wrote:

> See
> https://github.com/tesseract-ocr/tesseract/issues/654#issuecomment-274574951
>
> On Fri, Oct 18, 2019 at 9:10 AM 'abram stern' via tesseract-ocr <[email protected]> wrote:
>
>> Hi tesseract community,
>>
>> I'm working on a research project about OCR and I'm wondering where the
>> included data models (eg 'fast', 'best') come from -- or put another way,
>> what source material is used for training them? I haven't been able to
>> find this documented anywhere and am interested to know if it involves
>> public domain corpora, data obtained through book scanning, or other
>> sources.
>>
>> Best regards,
>> Abram

--
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/CAG2NduVj7UL8hMRD5JgR-Zn6UvhUJeSpxjzFQUg%3D-XW_vV05hg%40mail.gmail.com.

