AFAIK Ray is involved in other projects at Google. Unlikely to get a reply from him.
See https://github.com/tesseract-ocr/tesstrain/wiki for training done by @stweil on a similar scale for Fraktur. The pages list the hardware requirements, time taken, etc. Please check that you have enough resources before trying to replicate the LSTM training.

On Wed, Mar 25, 2020 at 11:41 AM Essam Zaky <[email protected]> wrote:

> Thanks @shreeshrii
>
> Would you answer the questions based on your experience?
>
> Also, is it possible to get help from Ray?
>
> On Tuesday, March 24, 2020 at 10:05:03 PM UTC+2, Essam Zaky wrote:
>>
>> Hi all,
>>
>> I would like to build *.traineddata from scratch, specifically for
>> English and Arabic.
>>
>> Let's take English as an example.
>> My question: how do I prepare the fonts folder?
>>
>> I read the
>> https://github.com/tesseract-ocr/tesseract/blob/master/src/training/language-specific.sh
>> file and found that it contains only about 32 font names.
>> Should I add the other Latin fonts installed on the training machine to
>> "language-specific.sh"?
>>
>> I used a font manager tool and found about 147 fonts installed on the
>> training machine.
>> I opened
>> https://github.com/tesseract-ocr/langdata_lstm/blob/master/eng/okfonts.txt
>> and it contains 4567 font names.
>> Should I search for, download, and install all the missing fonts on the
>> training machine?
>>
>> Should I collect all font files from the training machine, create a new
>> fonts folder "HOME/.fonts", and paste all the fonts into that folder?
>>
>> I see that fonts have different extensions ("*.ttf", "*.otf", "*.afm", ...).
>> Do all font types work in training, or do I need a specific type?
>>
>> I will write another question about the required text data.
>>
>> Thanks for the help.
>>
>> Regards
>> Essam
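On the okfonts.txt question above, it may help to first see which of the listed fonts are missing from the training machine. The following is only a rough sketch, not part of the official training scripts. It assumes you have saved the available fonts to a text file, for example with `text2image --list_available_fonts --fonts_dir <your fonts folder>`, and that each line may start with an index such as `0: `. The file names and helper names here are hypothetical.

```python
import re
import sys
from typing import Iterable, List


def normalize(name: str) -> str:
    """Lower-case a font name and collapse whitespace for comparison."""
    return " ".join(name.split()).lower()


def parse_font_listing(lines: Iterable[str]) -> List[str]:
    """Extract font names from a listing with one font per line.

    Assumption: a line may carry a leading index such as '  0: Arial Bold';
    that prefix is dropped when present. Blank lines are skipped.
    """
    names = []
    for line in lines:
        line = re.sub(r"^\s*\d+:\s*", "", line).strip()
        if line:
            names.append(line)
    return names


def missing_fonts(wanted: Iterable[str], installed: Iterable[str]) -> List[str]:
    """Return the wanted font names that are not in the installed list."""
    have = {normalize(n) for n in installed}
    return [n for n in wanted if normalize(n) not in have]


# Hypothetical usage: python check_fonts.py okfonts.txt available_fonts.txt
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1], encoding="utf-8") as f:
        wanted_names = parse_font_listing(f)
    with open(sys.argv[2], encoding="utf-8") as f:
        installed_names = parse_font_listing(f)
    for font in missing_fonts(wanted_names, installed_names):
        print(font)
```

The comparison is a plain case-insensitive name match, so fonts whose names are spelled differently on your system will be reported as missing even if they are installed.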

