<https://stackoverflow.com/posts/79803449/timeline>

I need to continue training the default eng model so that it can also 
recognize new characters. I created the box files and .lstmf files, and I am 
running this command:
lstmtraining \
  --model_output output/eng_latin \
  --continue_from "/c/Program Files/Tesseract-OCR/tessdata/eng.lstm" \
  --append_index 5 \
  --net_spec "[Lfx192 O1c129]" \
  --traineddata "/c/Program Files/Tesseract-OCR/tessdata/eng.traineddata" \
  --train_listfile training/training_files.txt \
  --max_iterations 400 

I get this error:
Loaded file C:/Program Files/Tesseract-OCR/tessdata/eng.lstm, unpacking...
Warning: LSTMTrainer deserialized an LSTMRecognizer!
Continuing from C:/Program Files/Tesseract-OCR/tessdata/eng.lstm
Appending a new network to an old one!!
Warning: given outputs 129 not equal to unicharset of 111.
Num outputs,weights in Series:
  Lfx192:192, 221952
  Fc111:111, 21423
Total weights = 243375
Built network:[1,36,0,1[C3,3Ft16]Mp3,3TxyLfys64Lfx96RxLrx96Lfx192Fc111] from request [Lfx192 O1c129]
Training parameters:
  Debug interval = 0, weights = 0.1, learning rate = 0.001, momentum=0.5
null char=110
Deserialize header failed: 1.lstmf
Deserialize header failed: 2.lstmf
Deserialize header failed: 3.lstmf
Deserialize header failed: 4.lstmf
Deserialize header failed: 5.lstmf
Load of page 0 failed!
Load of images failed!!
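
As a sanity check on the weight counts in the log, here is a minimal sketch that reproduces them. The two formulas are my assumption about how the counts are derived (four gates per LSTM layer, one bias per unit), not something taken from the Tesseract documentation, and the input width of 96 is read off the layer that precedes Lfx192 in the built network string. The helper names are made up for illustration.

```python
def lstm_weights(n_inputs, n_states, n_gates=4):
    # Assumed layout: each gate sees the layer input, the previous
    # state, and one bias term.
    return n_gates * n_states * (n_inputs + n_states + 1)


def fc_weights(n_inputs, n_outputs):
    # Assumed layout: one weight per input plus one bias, per output.
    return n_outputs * (n_inputs + 1)


lstm = lstm_weights(n_inputs=96, n_states=192)   # Lfx192 on top of a 96-wide layer
fc = fc_weights(n_inputs=192, n_outputs=111)     # Fc111, sized from the unicharset
print(lstm, fc, lstm + fc)                       # 221952 21423 243375
```

The numbers match the log, so the appended output layer was built with 111 outputs (the size of the unicharset in eng.traineddata), not the 129 requested in --net_spec.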

Training files: https://wormhole.app/X6mPda#lT3aG2Jm9u2QquNRyIruMA
Note: I am on Windows.
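
Since the log prints bare names like 1.lstmf, one thing worth checking is whether the entries in training/training_files.txt resolve to real files from the directory where lstmtraining is started. Below is a minimal sketch of such a check. It assumes the list file holds one .lstmf path per line and that relative entries are resolved against the current working directory; check_listfile is a hypothetical helper, not part of Tesseract, and it does not validate the contents of the files.

```python
from pathlib import Path


def check_listfile(listfile, base_dir="."):
    """Return (entry, problem) pairs for list entries that look unusable."""
    problems = []
    base = Path(base_dir)
    for raw in Path(listfile).read_text(encoding="utf-8").splitlines():
        entry = raw.strip()
        if not entry:
            continue  # skip blank lines
        path = Path(entry)
        if not path.is_absolute():
            # Assumption: relative entries are relative to the directory
            # the training command is run from.
            path = base / path
        if not path.is_file():
            problems.append((entry, "not found"))
        elif path.stat().st_size == 0:
            problems.append((entry, "empty"))
    return problems


if __name__ == "__main__":
    listfile = Path("training/training_files.txt")
    if listfile.exists():
        for entry, problem in check_listfile(listfile):
            print(f"{entry}: {problem}")
```

If this reports "not found" for every entry, the list file probably needs the full path (or the path relative to the working directory) of each .lstmf file rather than the bare file name.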


<<attachment: training.zip>>
