It seems that training only takes a training_text as input, which is then rendered in different fonts. Tesseract creates the images itself during training, so we don't have to supply our own images. Does this mean retraining only helps with fonts and not with page layout? In other words, is there no way to affect how Tesseract does the segmentation? (I understand that you can use --psm.) I'm just wondering whether training can help get better results for a special layout, such as a tabular image, with ordinary fonts.
On the other hand, it seems we can also create our own .box files and do the training. I guess I had the above idea only because I was drawing conclusions from fine-tuning with a few characters.

