English does not have a config file. The config file is optional and is
only used for some languages.
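
A minimal sketch for checking which files combine_lang_model will look for
before you run it. The function name check_script_dir is hypothetical, and
the layout it checks (radical-stroke.txt at the top of --script_dir, and
<lang>/<lang>.config below it) is inferred only from the paths in the error
messages quoted below, so treat it as an assumption rather than a
description of the tool's internals.

```shell
#!/bin/sh
# Hypothetical helper (not part of Tesseract): report which of the files
# named in the quoted error messages exist under a given --script_dir.
# Usage: check_script_dir SCRIPT_DIR LANG_CODE
check_script_dir() {
  script_dir="$1"
  lang="$2"
  status=0

  # The quoted log ends with an error for this file, so a missing copy is
  # treated as a failure here.
  if [ -f "${script_dir}/radical-stroke.txt" ]; then
    echo "found: ${script_dir}/radical-stroke.txt"
  else
    echo "MISSING (required): ${script_dir}/radical-stroke.txt"
    status=1
  fi

  # The config file is optional and only exists for some languages,
  # so its absence is reported but does not count as a failure.
  if [ -f "${script_dir}/${lang}/${lang}.config" ]; then
    echo "found: ${script_dir}/${lang}/${lang}.config"
  else
    echo "absent (optional): ${script_dir}/${lang}/${lang}.config"
  fi

  return "$status"
}
```

Calling check_script_dir "${TESSDATA_PREFIX}" eng should flag
radical-stroke.txt if it has not yet been copied from the langdata or
langdata_lstm repo, and should report the missing eng.config as optional.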

On Fri, Sep 20, 2019, 16:43 J Adam Funk <[email protected]> wrote:

> Hi again,
>
> I've tried using combine_tessdata -u to unpack the contents of the
> standard eng.traineddata to use as a "starter" for all the required files,
> but combine_lang_model is still failing with the error "Failed to read
> data from: /home/adam/sandboxes/TEST/tessdata/eng/eng.config". (I have
> set TESSDATA_PREFIX="/home/adam/sandboxes/TEST/tessdata".)  Where do I
> get that file?
>
> Thanks,
> Adam
>
>
> On Tuesday, 17 September 2019 16:38:19 UTC+1, shree wrote:
>>
>> Config files exist for some languages. They are in the langdata or
>> langdata_lstm repos. radical-stroke.txt is also there.
>>
>> You can also look at the training instructions in the wiki or in
>> shreeshrii/tess4training
>>
>>
>> On Tue, Sep 17, 2019, 20:24 Adam Funk <[email protected]> wrote:
>>
>>> Hi again,
>>>
>>> Using the instructions at
>>> <https://www.endpoint.com/blog/2018/07/09/training-tesseract-models-from-scratch>,
>>> I'm getting a bit further, but when my script gets to this part:
>>>
>>> combine_lang_model \
>>>   --input_unicharset "${UNICHARSET_FILE}" \
>>>   --script_dir "${TESSDATA_PREFIX}" \
>>>   --output_dir "${OUTPUT_DIR}" \
>>>   --pass_through_recoder \
>>>   --lang "${LANG_CODE}"
>>>
>>> it fails with this error:
>>>
>>> Config file is optional, continuing...
>>> Failed to read data from:
>>> /home/adam/sandboxes/TEST/tessdata/mem/mem.config
>>> Failed to read data from:
>>> /home/adam/sandboxes/TEST/tessdata/radical-stroke.txt
>>> Error reading radical code table
>>> /home/adam/sandboxes/TEST/tessdata/radical-stroke.txt
>>>
>>>
>>> I can't figure out from these instructions or the tesseract
>>> documentation on github where the mem.config and radical-stroke.txt
>>> files are supposed to come from.  Any help would be greatly appreciated!
>>>
>>> Also, the previous tesseract command is creating the *.lstmf files in
>>> the same directory as the *.box and *.tif files --- are they supposed to
>>> be in the TESSDATA_PREFIX directory instead?
>>>
>>> Thanks,
>>> Adam
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "tesseract-ocr" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/tesseract-ocr/b685cfec-0144-fc06-b90f-e9ba54771316%40sheffield.ac.uk
>>> .
>>>

