I have not tried with English.

Please create an eng.config file in your langdata/eng directory and then try again.

You can put the following 2 lines in it.

# Use LSTM
tessedit_ocr_engine_mode 1
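
As a rough sketch, something like the following should create that file. It assumes your langdata is at training/langdata, which is the path shown in the "Failed to read data from" message you posted; LANGDATA_DIR is just a variable name I made up for the example, so adjust it to your setup.

```shell
# Assumption: langdata lives at training/langdata (taken from the error
# message in this thread). LANGDATA_DIR is a hypothetical helper variable.
LANGDATA_DIR=training/langdata

# Make sure the eng directory exists, then write the two config lines.
mkdir -p "$LANGDATA_DIR/eng"
printf '%s\n' '# Use LSTM' 'tessedit_ocr_engine_mode 1' > "$LANGDATA_DIR/eng/eng.config"

# Show the result so you can check it before rerunning tesstrain.sh.
cat "$LANGDATA_DIR/eng/eng.config"
```

After that, rerun the same tesstrain.sh command and see whether the "Failed to read data from" line goes away.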


ShreeDevi
____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

On Sat, Aug 5, 2017 at 10:56 PM, Ava Nimaee <beigy.zoh...@gmail.com> wrote:

> Thanks for your attention.
> I removed everything, reinstalled the latest versions of tesseract and
> leptonica, and used this command:
> training/tesstrain.sh --fonts_dir /usr/share/fonts --lang eng \
>   --training_text training/langdata/eng/eng.training_text \
>   --linedata_only \
>   --noextract_font_properties --langdata_dir training/langdata \
>   --tessdata_dir ./tessdata \
>   --fontlist "Times New Roman," --output_dir ~/tesstutorial/engtrian
>
> But I got a new error. Everything runs fine, but at the end I get this:
>
> Setting unichar properties
> Other case É of é is not in unicharset
> Setting script properties
> Failed to read data from: training/langdata/eng/eng.config
> Null char=2
> Invalid format in radical table at line 4: 3400 1.4
> Creation of encoded unicharset failed!!
> Error writing recoder!!
> Reducing Trie to SquishedDawg
> Reducing Trie to SquishedDawg
> Reducing Trie to SquishedDawg
> Moving /tmp/tmp.GW5DOJr0rG/eng/eng.Times_New_Roman.exp0.lstmf to
> /home/zohreh/tesstutorial/engtrian
>
> Completed training for language 'eng'
>
> I don't have an eng.config in my langdata. I cloned langdata from
> tesseract's Git repository.
>
>
> On Saturday, August 5, 2017 at 5:50:59 PM UTC+4:30, shree wrote:
>>
>> tesseract -v
>> tesseract 4.00.00dev-594-g044e06e-2085
>>  leptonica-1.74.4
>>   libjpeg 8d (libjpeg-turbo 1.3.0) : libpng 1.2.50 : libtiff 4.0.3 : zlib 1.2.8
>>
>>  Found AVX
>>  Found SSE
>>
>>
>> The above version is working OK on Linux.
>>
>>  nice lstmtraining \
>>    --old_traineddata ../tessdata/best/san.traineddata \
>>    --continue_from ../tessdata/best/san.lstm \
>>    --traineddata ../tesstutorial/vedic/san/san.traineddata \
>>    --train_listfile ../tesstutorial/vedic/san.training_files.txt \
>>    --eval_listfile ../tesstutorial/vedic/san.eval_files.txt \
>>    --model_output ../tesstutorial/vedic/santune \
>>    --max_iterations 200 \
>>    --debug_interval 0
>>
>> Loaded file ../tessdata/best/san.lstm, unpacking...
>> Warning: LSTMTrainer deserialized an LSTMRecognizer!
>> Code range changed from 145 to 2308!!
>> Num (Extended) outputs,weights in Series:
>>   1,36,0,1:1, 0
>> Num (Extended) outputs,weights in Series:
>>   C3,3:9, 0
>>   Ft16:16, 160
>> Total weights = 160
>>   [C3,3Ft16]:16, 160
>>   Mp3,3:16, 0
>>   Lfys48:48, 12480
>>   Lfx96:96, 55680
>>   Lrx96:96, 74112
>>   Lfx192:192, 221952
>>   Fc2308:2308, 445444
>> Total weights = 809828
>> Previous null char=2 mapped to 2
>> Continuing from ../tessdata/best/san.lstm
>> Loaded 138/138 pages (1-138) of document ../tesstutorial/vedic/san.AA_NAGARI_SHREE_L3.exp0.lstmf
>> Loaded 138/138 pages (1-138) of document ../tesstutorial/vedic/san.AA_NAGARI_SHREE_L3.exp-1.lstmf
>> Loaded 138/138 pages (1-138) of document ../tesstutorial/vedic/san.Adobe_Devanagari.exp-2.lstmf
>> Loaded 138/138 pages (1-138) of document ../tesstutorial/vedic/san.Adobe_Devanagari.exp1.lstmf
>>
>> On Sat, Aug 5, 2017 at 6:43 PM, ShreeDevi Kumar <shree...@gmail.com>
>> wrote:
>>
>>> Did you build the training tools again?
>>>
>>> On Sat, Aug 5, 2017 at 6:37 PM, Ava Nimaee <beigy....@gmail.com> wrote:
>>>
>>>> Yes. As you suggested, I cloned the latest tesseract master, installed
>>>> it and leptonica again, made the tiff and box files and the unicharset,
>>>> and then used these commands:
>>>> training/tesstrain.sh \
>>>>   --fonts_dir /usr/share/fonts \
>>>>   --lang eng  \
>>>>   --training_text langdata/eng/eng.training_text \
>>>>   --linedata_only \
>>>>   --noextract_font_properties  --langdata_dir langdata \
>>>>   --tessdata_dir ./tessdata \
>>>>   --fontlist "Times New Roman," \
>>>>   --output_dir tesstutorial/engtrian
>>>> ------------------------------------------------------------
>>>> training/tesstrain.sh \
>>>>   --fonts_dir /usr/share/fonts \
>>>>   --lang eng  \
>>>>   --training_text langdata/eng/eng.training_text \
>>>>   --linedata_only \
>>>>   --noextract_font_properties  --langdata_dir langdata \
>>>>   --tessdata_dir ./tessdata \
>>>>   --output_dir tesstutorial/engeval
>>>> Finally, I ran the last command I mentioned, the one that gave the error.
>>>> For that last command I put langdata/eng in the engtrian folder.
>>>>
>>>>
>>>> On Saturday, August 5, 2017 at 5:28:48 PM UTC+4:30, shree wrote:
>>>>>
>>>>> Are you using the latest source from GitHub to build tesseract?
>>>>>
>>>>> On Sat, Aug 5, 2017 at 6:21 PM, Ava Nimaee <beigy....@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi,
>>>>>> I used this command:
>>>>>>
>>>>>> training/lstmtraining --debug_interval 100 \
>>>>>>   --traineddata ~/tesstutorial/engtrain/eng/eng.traineddata \
>>>>>>   --net_spec '[1,36,0,1 Ct3,3,16 Mp3,3 Lfys48 Lfx96 Lrx96 Lfx256 O1c111]' \
>>>>>>   --model_output ~/tesstutorial/engoutput/base --learning_rate 20e-4 \
>>>>>>   --train_listfile ~/tesstutorial/engtrain/eng.training_files.txt \
>>>>>>   --eval_listfile ~/tesstutorial/engeval/eng.training_files.txt \
>>>>>>   --max_iterations 5000 &>~/tesstutorial/engoutput/basetrain.log
>>>>>>
>>>>>> I put eng.traineddata in the right path, but I get an error:
>>>>>>
>>>>>> ERROR: Non-existent flag --traineddata
>>>>>>
>>>>>> Can you help me?
>>>>>>
>>>>>> --
>>>>>> You received this message because you are subscribed to the Google
>>>>>> Groups "tesseract-ocr" group.
>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>> send an email to tesseract-oc...@googlegroups.com.
>>>>>> To post to this group, send email to tesser...@googlegroups.com.
>>>>>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>>>>>> To view this discussion on the web visit
>>>>>> https://groups.google.com/d/msgid/tesseract-ocr/30f1bf28-ea1
>>>>>> 5-4999-b9ca-bccfed2be66f%40googlegroups.com
>>>>>> <https://groups.google.com/d/msgid/tesseract-ocr/30f1bf28-ea15-4999-b9ca-bccfed2be66f%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>> .
>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>