[tesseract-ocr] Next problem with training (tesseract 4.0)

Adam Funk Tue, 17 Sep 2019 07:54:19 -0700

Hi again,

Using the instructions at
<https://www.endpoint.com/blog/2018/07/09/training-tesseract-models-from-scratch>,
I'm getting a bit further, but when my script gets to this part:


combine_lang_model \
  --input_unicharset "${UNICHARSET_FILE}" \
  --script_dir "${TESSDATA_PREFIX}" \
  --output_dir "${OUTPUT_DIR}" \
  --pass_through_recoder \
  --lang "${LANG_CODE}"

it fails with this error:

Config file is optional, continuing...
Failed to read data from: /home/adam/sandboxes/TEST/tessdata/mem/mem.config
Failed to read data from: /home/adam/sandboxes/TEST/tessdata/radical-stroke.txt
Error reading radical code table /home/adam/sandboxes/TEST/tessdata/radical-stroke.txt


I can't figure out from these instructions or the Tesseract
documentation on GitHub where the mem.config and radical-stroke.txt
files are supposed to come from.  Any help would be greatly appreciated!
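
For reference, a quick pre-flight check along these lines would confirm
which of the two files are actually absent before combine_lang_model
runs.  It is only a sketch: missing_files is a made-up helper name, not
a Tesseract tool, and it assumes the files are looked up under the
directory passed as --script_dir, as the paths in the error output
suggest.  The "mem" and "." defaults are placeholders taken from that
output.

```shell
#!/bin/sh
# Hypothetical helper (not part of Tesseract): print the path of every
# named file that does not exist under the given directory.
missing_files() {
  dir=$1
  shift
  for name in "$@"; do
    [ -f "${dir}/${name}" ] || printf '%s\n' "${dir}/${name}"
  done
}

# Check the two paths named in the error output.  TESSDATA_PREFIX and
# LANG_CODE are the same variables used in the combine_lang_model call;
# the fallbacks are placeholders so the sketch runs when they are unset.
missing_files "${TESSDATA_PREFIX:-.}" \
  radical-stroke.txt \
  "${LANG_CODE:-mem}/${LANG_CODE:-mem}.config"
```

Any path it prints is one that combine_lang_model will also fail to
read; no output means both files are in place.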

Also, the previous tesseract command is creating the *.lstmf files in
the same directory as the *.box and *.tif files.  Are they supposed to
be in the TESSDATA_PREFIX directory instead?

Thanks,
Adam
