Hello everyone, I'm experimenting with handwriting recognition using Tesseract 4.0. More concretely, I want to train Tesseract to recognize one particular person's Russian handwriting. To do that, I wanted to add a "new font" (based on a set of TIFF images taken from a scanned archive, plus their box files) to the existing rus.traineddata by fine-tuning. I prepared the tiff/box pairs and then tried this command:
    training/lstmtraining --model_output /.../rus_new/ \
      --continue_from /.../rus.lstm \
      --train_listfile /.../list_of_files.txt \
      --eval_listfile /.../list_of_files.txt \
      --max_iterations 5000

where "list_of_files.txt" looked like:

    /.../rus.Eskal_Font4You.exp0.tif
    /.../rus.Eskal_Font4You.exp0.box

...and it ended up with this error:

    First document cannot be empty!!
    num_pages_per_doc_ > 0:Error:Assert failed:in file imagedata.cpp, line 655

What am I missing?

Thanks in advance.
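For reference, here is a minimal sketch of a sanity check that could be run on the list file before starting training. It rests on an assumption that is not confirmed anywhere in this post: that lstmtraining expects the files named in --train_listfile to be .lstmf training files, as I understand the Tesseract 4 training documentation, rather than raw tif/box files. The helper name check_listfile and its expected_ext parameter are hypothetical and are not part of Tesseract.

```python
import os


def check_listfile(list_path, expected_ext=".lstmf"):
    """Return (entry, problem) tuples for suspicious entries in a list file.

    Assumption: every non-blank line of the list file is a path to one
    training file, and that file should carry the `expected_ext` extension.
    """
    problems = []
    with open(list_path, encoding="utf-8") as fh:
        # Ignore blank lines and surrounding whitespace.
        entries = [line.strip() for line in fh if line.strip()]

    if not entries:
        problems.append((list_path, "list file has no entries"))

    for entry in entries:
        if not entry.lower().endswith(expected_ext):
            problems.append((entry, "unexpected extension"))
        elif not os.path.isfile(entry):
            problems.append((entry, "file not found"))
        elif os.path.getsize(entry) == 0:
            problems.append((entry, "file is empty"))
    return problems
```

Calling check_listfile("list_of_files.txt") on a list like the one above would flag both the .tif and the .box entry as "unexpected extension", which at least narrows down where to look.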

