The "Failed to read data from" message is a generic warning from the common file-read routine (as far as I know).
https://github.com/Shreeshrii/tess4training/blob/master/1-makedata.log shows:

=== Constructing LSTM training data ===
Creating new directory ../tesstutorial/engtrain
[Mon Apr 1 08:36:38 UTC 2019] /usr/local/bin/combine_lang_model --input_unicharset /tmp/eng-2019-04-01.Q4Z/eng.unicharset --script_dir ../langdata --words ../langdata/eng/eng.wordlist --numbers ../langdata/eng/eng.numbers --puncs ../langdata/eng/eng.punc --output_dir ../tesstutorial/engtrain --lang eng
Loaded unicharset of size 111 from file /tmp/eng-2019-04-01.Q4Z/eng.unicharset
Setting unichar properties
Other case É of é is not in unicharset
Setting script properties
Warning: properties incomplete for index 25 = ~
Config file is optional, continuing...
Failed to read data from: ../langdata/eng/eng.config
Null char=2
Reducing Trie to SquishedDawg
Reducing Trie to SquishedDawg
Reducing Trie to SquishedDawg

On Fri, Sep 20, 2019 at 6:42 PM J Adam Funk <a.f...@sheffield.ac.uk> wrote:

> OK, so that "Failed..." is just a warning.
> Thanks!
>
> On Tuesday, 17 September 2019 16:38:19 UTC+1, shree wrote:
>>
>> config files are there some languages. They will be in langdata or
>> langdata_lstm repos. radical_stroke.txt is also there.
>>
>> You can also look at training instructions in wiki or in
>> shreeshrii/tess4training
>>
>> On Tue, Sep 17, 2019, 20:24 Adam Funk <a....@sheffield.ac.uk> wrote:
>>
>>> Hi again,
>>>
>>> Using the instructions at
>>> <https://www.endpoint.com/blog/2018/07/09/training-tesseract-models-from-scratch>,
>>> I'm getting a bit further, but when my script gets to this part:
>>>
>>> combine_lang_model \
>>>   --input_unicharset "${UNICHARSET_FILE}" \
>>>   --script_dir "${TESSDATA_PREFIX}" \
>>>   --output_dir "${OUTPUT_DIR}" \
>>>   --pass_through_recoder \
>>>   --lang "${LANG_CODE}"
>>>
>>> it fails with this error:
>>>
>>> Config file is optional, continuing...
>>> Failed to read data from:
>>> /home/adam/sandboxes/TEST/tessdata/mem/mem.config
>>> Failed to read data from:
>>> /home/adam/sandboxes/TEST/tessdata/radical-stroke.txt
>>> Error reading radical code table
>>> /home/adam/sandboxes/TEST/tessdata/radical-stroke.txt
>>>
>>> I can't figure out from these instructions or the tesseract
>>> documentation on github where the mem.config and radical-stroke.txt
>>> files are supposed to come from. Any help would be greatly appreciated!
>>>
>>> Also, the previous tesseract command is creating the *.lstmf files in
>>> the same directory as the *.box and *.tif files --- are they supposed to
>>> be in the TESSDATA_PREFIX directory instead?
>>>
>>> Thanks,
>>> Adam
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "tesseract-ocr" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to tesser...@googlegroups.com.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/tesseract-ocr/b685cfec-0144-fc06-b90f-e9ba54771316%40sheffield.ac.uk.

--
____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-ocr+unsubscr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/CAG2NduXctNB5bWpYhDo1RvMppJVysKErh4%2Ba67M%3DHT-zfE%2BNww%40mail.gmail.com.
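
The two outputs quoted in this thread suggest that a missing <lang>.config only produces a warning, while a missing radical-stroke.txt stops combine_lang_model. A minimal sketch of a check to run before the command, assuming both files are looked up under the directory passed as --script_dir at the relative paths seen in the output above. The function name preflight and its return shape are hypothetical and not part of Tesseract:

```python
from pathlib import Path


def preflight(script_dir, lang):
    """Check the two files combine_lang_model tried to read in this thread.

    Hypothetical helper, not part of Tesseract. The path layout is an
    assumption taken from the quoted output:
      <script_dir>/<lang>/<lang>.config   (reported as optional)
      <script_dir>/radical-stroke.txt     (missing file stopped the run)
    Returns a (warnings, errors) pair of message lists.
    """
    script_dir = Path(script_dir)
    warnings, errors = [], []

    config = script_dir / lang / f"{lang}.config"
    if not config.is_file():
        # Matches "Config file is optional, continuing..." in the log.
        warnings.append(f"optional config file not found: {config}")

    radical = script_dir / "radical-stroke.txt"
    if not radical.is_file():
        # Matches "Error reading radical code table" in the log.
        errors.append(f"radical code table not found: {radical}")

    return warnings, errors
```

For example, preflight("../langdata", "eng") checks ../langdata/eng/eng.config and ../langdata/radical-stroke.txt. An empty errors list only means the radical code table is present; it does not guarantee that the rest of the training run will succeed.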