English does not have a config file. The config file is optional and is only used for some languages.
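
For anyone scripting this step, a small pre-flight check can show which missing file actually matters before combine_lang_model is run. This is only a sketch: `check_lang_inputs` is a hypothetical helper, not part of Tesseract, and the paths it looks at are taken from the error output quoted below. It assumes, as that output suggests, that a missing radical-stroke.txt is what stops the run, while a missing LANG.config is harmless.

```shell
# Hypothetical helper (not part of Tesseract): report which of the files
# named in the quoted error output are present under the script directory.
# Usage: check_lang_inputs SCRIPT_DIR LANG_CODE
check_lang_inputs() {
    script_dir="$1"
    lang="$2"
    status=0

    # Assumption: radical-stroke.txt is expected directly under SCRIPT_DIR,
    # as in the "Error reading radical code table" message.
    if [ ! -f "${script_dir}/radical-stroke.txt" ]; then
        echo "missing (required): ${script_dir}/radical-stroke.txt" >&2
        status=1
    fi

    # The per-language config is optional, so only note its absence.
    if [ ! -f "${script_dir}/${lang}/${lang}.config" ]; then
        echo "absent (optional): ${script_dir}/${lang}/${lang}.config"
    fi

    return "$status"
}
```

In a training script it could be called as `check_lang_inputs "${TESSDATA_PREFIX}" "${LANG_CODE}"` just before the combine_lang_model step, so the script stops early if radical-stroke.txt has not been copied from langdata or langdata_lstm.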

On Fri, Sep 20, 2019, 16:43 J Adam Funk <[email protected]> wrote:

> Hi again,
>
> I've tried using combine_tessdata -u to unpack the contents of the
> standard eng.traineddata to use as a "starter" for all the required files,
> but the combine_lang_model is still failing with "Failed to read data from:
> /home/adam/sandboxes/TEST/tessdata/eng/eng.config" error. (I have the
> setting TESSDATA_PREFIX="/home/adam/sandboxes/TEST/tessdata".) Where do I
> get that file?
>
> Thanks,
> Adam
>
>
> On Tuesday, 17 September 2019 16:38:19 UTC+1, shree wrote:
>>
>> config files are there for some languages. They will be in langdata or
>> langdata_lstm repos. radical-stroke.txt is also there.
>>
>> You can also look at training instructions in wiki or in
>> shreeshrii/tess4training
>>
>>
>> On Tue, Sep 17, 2019, 20:24 Adam Funk <[email protected]> wrote:
>>
>>> Hi again,
>>>
>>> Using the instructions at
>>> <https://www.endpoint.com/blog/2018/07/09/training-tesseract-models-from-scratch>,
>>> I'm getting a bit further, but when my script gets to this part:
>>>
>>>     combine_lang_model \
>>>       --input_unicharset "${UNICHARSET_FILE}" \
>>>       --script_dir "${TESSDATA_PREFIX}" \
>>>       --output_dir "${OUTPUT_DIR}" \
>>>       --pass_through_recoder \
>>>       --lang "${LANG_CODE}"
>>>
>>> it fails with this error:
>>>
>>>     Config file is optional, continuing...
>>>     Failed to read data from: /home/adam/sandboxes/TEST/tessdata/mem/mem.config
>>>     Failed to read data from: /home/adam/sandboxes/TEST/tessdata/radical-stroke.txt
>>>     Error reading radical code table /home/adam/sandboxes/TEST/tessdata/radical-stroke.txt
>>>
>>> I can't figure out from these instructions or the tesseract
>>> documentation on github where the mem.config and radical-stroke.txt
>>> files are supposed to come from. Any help would be greatly appreciated!
>>>
>>> Also, the previous tesseract command is creating the *.lstmf files in
>>> the same directory as the *.box and *.tif files --- are they supposed to
>>> be in the TESSDATA_PREFIX directory instead?
>>>
>>> Thanks,
>>> Adam
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "tesseract-ocr" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/tesseract-ocr/b685cfec-0144-fc06-b90f-e9ba54771316%40sheffield.ac.uk

