I've been reading the wiki, and in its explanation of the training process (the "Using tesstrain.sh" section:
<https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00#using-tesstrainsh>)
it says:

"For making a general-purpose LSTM-based OCR engine, it is woefully inadequate, but makes a good tutorial demo."

So my two questions are: (1) in what ways is this "woefully inadequate", and (2) how are the tessdata_best models made? Are they simply trained in the same way for many more iterations, perhaps with more fonts, or is there more to it than that?
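For context, the process I mean is the tutorial invocation from that wiki section; roughly a sketch like this (the directory paths and font name here are placeholders, not my actual setup):

```sh
# Sketch of the tutorial training-data invocation from the wiki
# (placeholder paths/fonts -- adjust for your installation):
src/training/tesstrain.sh \
    --fonts_dir /usr/share/fonts \
    --fontlist "DejaVu Sans" \
    --lang eng \
    --linedata_only \
    --noextract_font_properties \
    --langdata_dir ../langdata \
    --tessdata_dir ./tessdata \
    --output_dir ~/tesstutorial/engtrain
```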

Thanks in advance!
