Dear all,

I'm trying to train Tesseract to recognize a dotted (dot-matrix) font like the one in this image:
<https://lh3.googleusercontent.com/-k1_neF5ZyQw/V2JkBs_4HMI/AAAAAAAAAyI/r_fpKJTN4TwQjPcxyNkg6rts4bAwHGriACLcB/s1600/eng_dotmatrix.dot-matrix.exp0.bmp>

Here is my tif/box file pair, generated with jTessBoxEditor:

eng_dotmatrix.dot-matrix.exp0.tif <https://drive.google.com/open?id=0B2tu51tmJ0FvdGt2dW93cnR5d00>
eng_dotmatrix.dot-matrix.exp0.box <https://drive.google.com/open?id=0B2tu51tmJ0FvenJvR3RqWElqaHM>

(I want to train Tesseract on this font as a new language, for uppercase letters and digits only.)

Then I ran:

    tesseract eng_dotmatrix.dot-matrix.exp0.tif eng_dotmatrix.dot-matrix.exp0 box.train

The only output was:

    Tesseract Open Source OCR Engine v3.02 with Leptonica

and Tesseract did not generate a .tr file.

Is it not possible to train Tesseract on fonts where each character is made up of many small blobs? I think I could get good blobs by eroding the image, but I would rather not manipulate the image.

Do you have any suggestions?

O/S: Windows 7
Tesseract version: 3.02.02

Regards,
Lee.
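
For reference, the erosion idea mentioned above can be sketched roughly as follows. This is only a toy illustration in plain Python, not something Tesseract or jTessBoxEditor does; the function name `erode_gray`, the `radius` parameter, and the sample pixel grid are all made up for the example. It assumes dark ink (0) on a white background (255), so taking the local minimum grows the dark dots until neighbouring ones touch.

```python
def erode_gray(image, radius=1):
    """Grayscale erosion (minimum filter) over a square window.

    `image` is a list of rows of 0-255 ints (0 = black ink, 255 = white).
    The window is (2 * radius + 1) pixels wide and is clipped at the borders.
    Taking the local minimum enlarges dark regions, so nearby dots merge.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    result = []
    for y in range(height):
        y0, y1 = max(0, y - radius), min(height, y + radius + 1)
        row = []
        for x in range(width):
            x0, x1 = max(0, x - radius), min(width, x + radius + 1)
            # Darkest pixel inside the clipped window.
            row.append(min(image[yy][xx]
                           for yy in range(y0, y1)
                           for xx in range(x0, x1)))
        result.append(row)
    return result


# Hypothetical toy input: two isolated dots separated by one white pixel.
W, B = 255, 0
sample = [
    [W, W, W, W, W, W, W],
    [W, W, B, W, B, W, W],
    [W, W, W, W, W, W, W],
]
merged = erode_gray(sample, radius=1)
for row in merged:
    print("".join("#" if value == 0 else "." for value in row))
```

With a radius of 1 the two dots grow into a single connected dark region, which is the effect hoped for on a real dot-matrix scan. The right radius would depend on the dot spacing in the actual image, and too large a value would also fuse adjacent characters.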

