Re: [tesseract-ocr] ERROR: Non-existent flag --traineddata

Ava Nimaee Sat, 05 Aug 2017 10:27:17 -0700

Thanks for your attention.
I removed everything, reinstalled the latest versions of Tesseract and
Leptonica, and used this command:
training/tesstrain.sh --fonts_dir /usr/share/fonts --lang eng \
  --training_text training/langdata/eng/eng.training_text \
  --linedata_only \
  --noextract_font_properties --langdata_dir training/langdata \
  --tessdata_dir ./tessdata \
  --fontlist "Times New Roman," --output_dir ~/tesstutorial/engtrian


But I got a new error. Everything runs fine, but at the end I get this:

Setting unichar properties
Other case É of é is not in unicharset
Setting script properties
Failed to read data from: training/langdata/eng/eng.config
Null char=2
Invalid format in radical table at line 4: 3400 1.4
Creation of encoded unicharset failed!!
Error writing recoder!!
Reducing Trie to SquishedDawg
Reducing Trie to SquishedDawg
Reducing Trie to SquishedDawg
Moving /tmp/tmp.GW5DOJr0rG/eng/eng.Times_New_Roman.exp0.lstmf to /home/zohreh/tesstutorial/engtrian

Completed training for language 'eng'
Also, I don't have eng.config in my langdata. I cloned langdata from
Tesseract's Git repository.
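
To confirm which files are actually missing before rerunning tesstrain.sh, a
small check along these lines might help. It is only a sketch: check_files is
a hypothetical helper, not part of Tesseract, and the list of files to check
is an assumption taken from the paths in the command and log above.

```shell
#!/bin/sh
# Hypothetical helper: report which of the named files exist under a directory.
# Usage: check_files DIR FILE...
# Returns 0 only if every file is present, 1 otherwise.
check_files() {
  dir=$1
  shift
  missing=0
  for name in "$@"; do
    if [ -f "$dir/$name" ]; then
      echo "found:   $dir/$name"
    else
      echo "MISSING: $dir/$name"
      missing=1
    fi
  done
  return "$missing"
}

# Example call (paths taken from the command and log above):
# check_files training/langdata eng/eng.training_text eng/eng.config
```

The helper only reports whether the files exist. It does not say whether
training needs them.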


On Saturday, August 5, 2017 at 5:50:59 PM UTC+4:30, shree wrote:
>
> tesseract -v
> tesseract 4.00.00dev-594-g044e06e-2085
>  leptonica-1.74.4
>   libjpeg 8d (libjpeg-turbo 1.3.0) : libpng 1.2.50 : libtiff 4.0.3 : zlib 1.2.8
>
>  Found AVX
>  Found SSE
>
>
> The above version is working ok on linux
>
>  nice lstmtraining \
>    --old_traineddata ../tessdata/best/san.traineddata \
>   --continue_from ../tessdata/best/san.lstm \
>    --traineddata ../tesstutorial/vedic/san/san.traineddata  \
>    --train_listfile ../tesstutorial/vedic/san.training_files.txt \
>    --eval_listfile ../tesstutorial/vedic/san.eval_files.txt \
>   --model_output ../tesstutorial/vedic/santune \
>   --max_iterations 200 \
>    --debug_interval 0
>
> Loaded file ../tessdata/best/san.lstm, unpacking...
> Warning: LSTMTrainer deserialized an LSTMRecognizer!
> Code range changed from 145 to 2308!!
> Num (Extended) outputs,weights in Series:
>   1,36,0,1:1, 0
> Num (Extended) outputs,weights in Series:
>   C3,3:9, 0
>   Ft16:16, 160
> Total weights = 160
>   [C3,3Ft16]:16, 160
>   Mp3,3:16, 0
>   Lfys48:48, 12480
>   Lfx96:96, 55680
>   Lrx96:96, 74112
>   Lfx192:192, 221952
>   Fc2308:2308, 445444
> Total weights = 809828
> Previous null char=2 mapped to 2
> Continuing from ../tessdata/best/san.lstm
> Loaded 138/138 pages (1-138) of document ../tesstutorial/vedic/san.AA_NAGARI_SHREE_L3.exp0.lstmf
> Loaded 138/138 pages (1-138) of document ../tesstutorial/vedic/san.AA_NAGARI_SHREE_L3.exp-1.lstmf
> Loaded 138/138 pages (1-138) of document ../tesstutorial/vedic/san.Adobe_Devanagari.exp-2.lstmf
> Loaded 138/138 pages (1-138) of document ../tesstutorial/vedic/san.Adobe_Devanagari.exp1.lstmf
>
>
> ShreeDevi
> ____________________________________________________________
> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>
> On Sat, Aug 5, 2017 at 6:43 PM, ShreeDevi Kumar <[email protected]> wrote:
>
>> did you build the training tools again?
>>
>>
>> ShreeDevi
>> ____________________________________________________________
>> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>>
>> On Sat, Aug 5, 2017 at 6:37 PM, Ava Nimaee <[email protected]> wrote:
>>
>>> Yes. As you suggested, I cloned the latest tesseract master and installed
>>> it, along with Leptonica, again. Then I made the tiff and box files and the
>>> unicharset, and used these commands:
>>> training/tesstrain.sh \
>>>   --fonts_dir /usr/share/fonts \
>>>   --lang eng  \
>>>   --training_text langdata/eng/eng.training_text \
>>>   --linedata_only \
>>>   --noextract_font_properties  --langdata_dir langdata \
>>>   --tessdata_dir ./tessdata \
>>>   --fontlist "Times New Roman," \
>>>   --output_dir tesstutorial/engtrian
>>> ------------------------------------------------------------
>>> training/tesstrain.sh \
>>>   --fonts_dir /usr/share/fonts \
>>>   --lang eng  \
>>>   --training_text langdata/eng/eng.training_text \
>>>   --linedata_only \
>>>   --noextract_font_properties  --langdata_dir langdata \
>>>   --tessdata_dir ./tessdata \
>>>   --output_dir tesstutorial/engeval
>>> Finally, I ran the last command, the one that I said gave the error.
>>> For that last command, I put langdata/eng in the engtrian folder.
>>>
>>>
>>> On Saturday, August 5, 2017 at 5:28:48 PM UTC+4:30, shree wrote:
>>>>
>>>> Are you using the latest source of programs from github for building 
>>>> tesseract?
>>>>
>>>> ShreeDevi
>>>> ____________________________________________________________
>>>> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>>>>
>>>> On Sat, Aug 5, 2017 at 6:21 PM, Ava Nimaee <[email protected]> wrote:
>>>>
>>>>> Hi,
>>>>> I used this command:
>>>>>
>>>>> training/lstmtraining --debug_interval 100 \
>>>>>   --traineddata ~/tesstutorial/engtrain/eng/eng.traineddata \
>>>>>   --net_spec '[1,36,0,1 Ct3,3,16 Mp3,3 Lfys48 Lfx96 Lrx96 Lfx256 O1c111]' \
>>>>>   --model_output ~/tesstutorial/engoutput/base --learning_rate 20e-4 \
>>>>>   --train_listfile ~/tesstutorial/engtrain/eng.training_files.txt \
>>>>>   --eval_listfile ~/tesstutorial/engeval/eng.training_files.txt \
>>>>>   --max_iterations 5000 &>~/tesstutorial/engoutput/basetrain.log
>>>>>
>>>>> I put eng.traineddata in the right path, but I get an error:
>>>>>
>>>>> ERROR: Non-existent flag --traineddata
>>>>>
>>>>> Can you help me?
>>>>>
>>>>> -- 
>>>>> You received this message because you are subscribed to the Google 
>>>>> Groups "tesseract-ocr" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>>> an email to [email protected].
>>>>> To post to this group, send email to [email protected].
>>>>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>>>>> To view this discussion on the web visit 
>>>>> https://groups.google.com/d/msgid/tesseract-ocr/30f1bf28-ea15-4999-b9ca-bccfed2be66f%40googlegroups.com
>>>>>  
>>>>> <https://groups.google.com/d/msgid/tesseract-ocr/30f1bf28-ea15-4999-b9ca-bccfed2be66f%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>> .
>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>
>>>>
>>
>>
>

