The output of

src/training/tesstrain.sh --fontlist "Times New Roman" --lang eng \
  --linedata_only --noextract_font_properties \
  --langdata_dir /home/sw/repo/langdata \
  --tessdata_dir /home/sw/repo/tessdata \
  --output_dir ~/tesstutorial/trainplusminus

is

....
....

[Tue Jun 18 17:19:46 EET 2019] /usr/local/bin/combine_lang_model --input_unicharset /tmp/eng-2019-06-18.baG/eng.unicharset --script_dir /home/sw/repo/langdata --words /home/sw/repo/langdata/eng/eng.wordlist --numbers /home/sw/repo/langdata/eng/eng.numbers --puncs /home/sw/repo/langdata/eng/eng.punc --output_dir /home/sw/tesstutorial/trainplusminus --lang eng
Loaded unicharset of size 111 from file /tmp/eng-2019-06-18.baG/eng.unicharset
Setting unichar properties
Other case É of é is not in unicharset
Setting script properties
Warning: properties incomplete for index 95 = ~
Config file is optional, continuing...
Failed to read data from: /home/sw/repo/langdata/eng/eng.config
Null char=2
Reducing Trie to SquishedDawg
Error during conversion of wordlists to DAWGs!!
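
A quick sanity check that may help narrow this down. This is only a sketch: check_files is a hypothetical helper, not a tesseract tool, and the paths are copied from the combine_lang_model log above and from the --traineddata argument in the script quoted below. It reports whether the wordlist, numbers and punc files passed to combine_lang_model exist and are non-empty, and whether the eng.traineddata that lstmtraining expects was actually produced. eng.config is left out because the log itself says it is optional.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of tesseract): print one status line per
# file and return non-zero if any file is missing or empty.
check_files() {
  local status=0 f
  for f in "$@"; do
    if [ ! -e "$f" ]; then
      echo "MISSING: $f"
      status=1
    elif [ ! -s "$f" ]; then
      echo "EMPTY:   $f"
      status=1
    else
      echo "OK:      $f"
    fi
  done
  return "$status"
}

# Inputs named on the combine_lang_model command line in the log.
LANGDATA=/home/sw/repo/langdata
check_files \
  "$LANGDATA/eng/eng.wordlist" \
  "$LANGDATA/eng/eng.numbers" \
  "$LANGDATA/eng/eng.punc" \
  || echo "some combine_lang_model inputs are missing or empty"

# Starter traineddata that lstmtraining is given via --traineddata.
check_files "$HOME/tesstutorial/trainplusminus/eng/eng.traineddata" \
  || echo "tesstrain.sh did not produce eng.traineddata"
```

If every input reports OK and eng.traineddata is still missing, the cause lies elsewhere and the full tesstrain.sh output is the next thing to look at.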

On Tue, Jun 18, 2019 at 5:18 PM Shree Devi Kumar <[email protected]>
wrote:

> That means
>
> src/training/tesstrain.sh  --fontlist "Times New Roman" --lang eng
> --linedata_only   --noextract_font_properties --langdata_dir
> /home/sw/repo/langdata   --tessdata_dir /home/sw/repo/tessdata --output_dir
> ~/tesstutorial/trainplusminus
>
> did not complete correctly.
>
> On Tue, Jun 18, 2019 at 8:46 PM fady taher <[email protected]> wrote:
>
>> No, that file doesn't exist yet.
>> The directory only contains:
>>
>> eng.charset_size=110.txt
>> eng.unicharset
>>
>>
>> On Tue, Jun 18, 2019 at 4:46 PM Shree Devi Kumar <[email protected]>
>> wrote:
>>
>>> Check ~/tesstutorial/trainplusminus
>>> Did your earlier training complete correctly? Does
>>> ~/tesstutorial/trainplusminus/eng/eng.traineddata exist?
>>>
>>> On Tue, Jun 18, 2019 at 8:11 PM fady taher <[email protected]>
>>> wrote:
>>>
>>>> I am trying to fine-tune Tesseract,
>>>>
>>>> but I keep getting the error
>>>> mgr_.Init(traineddata_path.c_str()):Error:Assert
>>>> failed:in file ../../src/lstm/lstmtrainer.h, line 110
>>>> on the training statement.
>>>>
>>>> My script looks as follows:
>>>>
>>>> cd /home/sw/repo/tesseract-ocr
>>>>
>>>> mkdir -p ~/tesstutorial/
>>>> mkdir -p ~/tesstutorial/trainplusminus
>>>> mkdir -p ~/tesstutorial/evalplusminus
>>>>
>>>>
>>>> src/training/tesstrain.sh --fontlist "Times New Roman" --lang eng \
>>>>   --linedata_only --noextract_font_properties \
>>>>   --langdata_dir /home/sw/repo/langdata \
>>>>   --tessdata_dir /home/sw/repo/tessdata \
>>>>   --output_dir ~/tesstutorial/trainplusminus
>>>>
>>>> src/training/tesstrain.sh --fontlist "Times New Roman" --lang eng \
>>>>   --linedata_only --noextract_font_properties \
>>>>   --langdata_dir /home/sw/repo/langdata/eng \
>>>>   --tessdata_dir /home/sw/repo/tessdata \
>>>>   --output_dir ~/tesstutorial/evalplusminus
>>>>
>>>>
>>>> # eng.lstm file gets extracted correctly
>>>> src/training/combine_tessdata -e \
>>>>   /home/sw/repo/tessdata/eng.traineddata \
>>>>   ~/tesstutorial/trainplusminus/eng.lstm
>>>>
>>>> # this command fails and throws the error
>>>> src/training/lstmtraining \
>>>>   --model_output ~/tesstutorial/trainplusminus/plusminus \
>>>>   --continue_from ~/tesstutorial/trainplusminus/eng.lstm \
>>>>   --traineddata ~/tesstutorial/trainplusminus/eng/eng.traineddata \
>>>>   --old_traineddata /home/sw/repo/tessdata/eng.traineddata \
>>>>   --train_listfile ~/tesstutorial/trainplusminus/eng.training_files.txt \
>>>>   --max_iterations 400
>>>>
>>>>
>>>> src/training/lstmtraining --stop_training \
>>>>   --continue_from ~/tesstutorial/trainplusminus/plusminus_checkpoint \
>>>>   --traineddata ~/tesstutorial/trainplusminus/eng/eng.traineddata \
>>>>   --model_output ~/tesstutorial/eng_final.traineddata
>>>>
>>>> cp ~/tesstutorial/eng_final.traineddata \
>>>>   /usr/share/tesseract/4/tessdata/eng.traineddata
>>>>
>>>>
>>>> I downloaded eng.traineddata from the "best" repo, though. Can anyone
>>>> help?
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "tesseract-ocr" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to [email protected].
>>>> To post to this group, send email to [email protected].
>>>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>>>> To view this discussion on the web visit
>>>> https://groups.google.com/d/msgid/tesseract-ocr/00310d99-1fc9-402f-b0fa-d048486d77b2%40googlegroups.com
>>>> <https://groups.google.com/d/msgid/tesseract-ocr/00310d99-1fc9-402f-b0fa-d048486d77b2%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>> .
>>>> For more options, visit https://groups.google.com/d/optout.
>>>>
>>>
>>>
>>> --
>>>
>>> ____________________________________________________________
>>> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>>>
>
>
