Hello Shree,

I tried that. The command was

    lstmtraining \
      --traineddata data/akk/akk.traineddata \
      --old_traineddata /usr/share/tesseract-ocr/4.00/tessdata/akk-1m.traineddata \
      --continue_from data/akk-1m/akk.lstm \
      --model_output data/akk/checkpoints/akk \
      --train_listfile data/akk/list.train \
      --eval_listfile data/akk/list.eval \
      --max_iterations 1000 \
      --debug_level -1

and the output started with

    Loaded file data/akk/checkpoints/akk_checkpoint, unpacking...
    Successfully restored trainer from data/akk/checkpoints/akk_checkpoint
    Loaded 1/1 pages (1-1) of document data/akk-ground-truth/P336598.000347.CuneiformComposite.exp0.lstmf
    Loaded 1/1 pages (1-1) of document data/akk-ground-truth/P238121.000012.CuneiformNAOutline_Medium.exp0.lstmf

and ended with

    Loaded 1/1 pages (1-1) of document data/akk-ground-truth/Q005388.000005.Segoe_UI_Historic.exp0.lstmf
    At iteration 4716762/4760600/4760600, Mean rms=1.436%, delta=8.366%, char train=105.86%, word train=86.31%, skip ratio=0%, wrote checkpoint.
    Finished! Error rate = 88.246

Do I have to retrain completely from scratch, that is, without loading the previous checkpoint? Maybe I should try another of your approaches and train with one font excluded, so that the LSTM converges.

Another thought: I tried training Akkadian with Tesseract 4 once before, but with ground truth consisting of short text files containing multiple lines of text, not one-liners. Obviously I used PSM 6, not PSM 11. Is there anything wrong with this approach?

On Monday, 17 February 2020 at 08:23:38 UTC+1, shree wrote:
>
> Try lstmtraining again for 1000 iterations with --debug_level -1
>
> On Mon, Feb 17, 2020, 01:46 Wincent Balin <[email protected]> wrote:
>
>> Hello all,
>>
>> after preparing ground truth files for the Akkadian language, I started the
>> training using the *tesstrain* Makefile, but over 4000000 iterations
>> later, the output looks like the following:
>>
>> At iteration 4437804/4478900/4478900, Mean rms=1.453%, delta=9.455%, char
>> train=121.423%, word train=87.461%, skip ratio=0%, wrote checkpoint.
>>
>> Does char train=121% mean a CER of 121%? What could be the cause of such
>> high values even after more than 10 days of training?
>>
>> Yours truly,
>>
>> Wincent
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/tesseract-ocr/79acb8ca-cb51-4e23-8853-ca4b3405a718%40googlegroups.com.
>>
>

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/c5ccc3c8-f18f-4540-93e8-b55ffb37c3ac%40googlegroups.com.
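
In case it helps anyone comparing checkpoints in this thread: below is a minimal Python sketch for pulling the numbers out of an lstmtraining progress line, so that char train and word train can be tracked across runs. The regular expression is inferred only from the two progress lines quoted in this thread, not from the Tesseract sources, so it may not match other versions or debug levels. The function name `parse_progress` and the dictionary keys are my own, purely for illustration.

```python
import re

# Pattern inferred from the progress lines quoted in this thread
# (an assumption; other Tesseract versions may print a different format).
PROGRESS_RE = re.compile(
    r"At iteration (\d+)/(\d+)/(\d+),\s+"
    r"Mean rms=([\d.]+)%,\s+delta=([\d.]+)%,\s+"
    r"char\s+train=([\d.]+)%,\s+word\s+train=([\d.]+)%,\s+"
    r"skip ratio=([\d.]+)%"
)


def parse_progress(line):
    """Return the metrics of one progress line as a dict, or None if no match.

    Hypothetical helper: the key names are illustrative only.
    """
    match = PROGRESS_RE.search(line)
    if match is None:
        return None
    groups = match.groups()
    return {
        # The three slash-separated counters, kept as printed.
        "iterations": tuple(int(g) for g in groups[:3]),
        "mean_rms": float(groups[3]),
        "delta": float(groups[4]),
        "char_train": float(groups[5]),
        "word_train": float(groups[6]),
        "skip_ratio": float(groups[7]),
    }


if __name__ == "__main__":
    sample = (
        "At iteration 4716762/4760600/4760600, Mean rms=1.436%, "
        "delta=8.366%, char train=105.86%, word train=86.31%, "
        "skip ratio=0%, wrote checkpoint."
    )
    metrics = parse_progress(sample)
    print(metrics)
    if metrics and metrics["char_train"] > 100.0:
        print("char train is above 100%")
```

Running it over a saved training log line by line and skipping the `None` results would give a quick view of whether char train is moving at all between checkpoints.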

