It automates the process outlined in the Tesseract Training wiki <https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract>. Once you have read through the wiki, using the tool is straightforward. You can practice with the sample training source files included.
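For orientation, the command sequence the wiki walks through can be sketched roughly as below. This is a minimal sketch assuming the Tesseract 3.x command-line training tools. The class name `TrainingSteps`, the language code `eng`, and the font name `myfont` are made up for illustration, and the code only builds the command lines without running anything. The steps jTessBoxEditor actually runs may differ in detail (for example, it may add shape clustering or dictionary steps).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of jTessBoxEditor or Tess4J): lists the main
// Tesseract 3.x training commands for one training image. Nothing is executed.
final class TrainingSteps {

    static List<List<String>> commands(String lang, String font, int exp) {
        // Training files follow the pattern [lang].[fontname].exp[num]
        String base = lang + "." + font + ".exp" + exp;
        List<List<String>> steps = new ArrayList<>();

        // 1. Produce a .tr feature file from the image and its corrected .box file.
        steps.add(List.of("tesseract", base + ".tif", base, "box.train"));

        // 2. Extract the character set from the box file (writes "unicharset").
        steps.add(List.of("unicharset_extractor", base + ".box"));

        // 3. Feature training; expects a font_properties file in the working directory.
        steps.add(List.of("mftraining", "-F", "font_properties", "-U", "unicharset",
                "-O", lang + ".unicharset", base + ".tr"));

        // 4. Character normalization training (writes "normproto").
        steps.add(List.of("cntraining", base + ".tr"));

        // 5. After renaming inttemp, normproto, pffmtable and shapetable with the
        //    "<lang>." prefix, bundle everything into <lang>.traineddata.
        steps.add(List.of("combine_tessdata", lang + "."));

        return steps;
    }

    public static void main(String[] args) {
        for (List<String> cmd : commands("eng", "myfont", 0)) {
            System.out.println(String.join(" ", cmd));
        }
    }
}
```

The resulting `eng.traineddata` is the file that goes into the tessdata folder. Keeping each command as a list of arguments makes it straightforward to pass to `ProcessBuilder` later if you want to run the steps yourself.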
On Tuesday, June 14, 2016 at 2:14:53 AM UTC-5, Rafał Błaczkowski wrote:
>
> Thank you for your answer.
> But actually I don't know how to use jTessBoxEditor to train my OCR and
> to obtain a .traineddata file...
> Could you tell me how to use it? Or do you know where I can find a
> tutorial for it? I couldn't find any...
>
>
> On Tuesday, June 14, 2016 at 04:32:02 UTC+2, Quan Nguyen wrote:
>>
>> Images that appear readable to human eyes may not be so to computers.
>> Therefore, image processing is most likely required prior to the OCR step.
>>
>> Sure, you can use jTessBoxEditor to train for your language. The
>> generated .traineddata will be placed in a tessdata folder and you can
>> use the *Validate* function to verify the resultant data.
>>
>> On Thursday, June 9, 2016 at 4:23:07 AM UTC-5, Rafał Błaczkowski wrote:
>>>
>>> Hello All!!
>>>
>>> I have a big problem with tesseract-ocr.
>>> I downloaded the usage example from the official page
>>> (net.sourceforge.tess4j.example) just to test how it works.
>>> I also downloaded almost all of the tessdata files (I don't know what
>>> the difference between these files is) and ran the Java example (using
>>> net.sourceforge.tess4j).
>>> I used a very simple TIFF file for the test, and the results were not
>>> very good. Some words were recognized correctly, but others were
>>> misrecognized: BEST instead of DEST, DEF instead of DEP, etc.
>>>
>>> I understand that I should train the OCR to recognize my picture
>>> (font, size, etc.). But I don't know how to go about it! Is there any
>>> documentation about this problem?
>>> I know that some files should be put in the tessdata directory, but how
>>> do I create them?
>>>
>>> I also downloaded jTessBoxEditor, loaded a demo image with my text, and
>>> ran some training in the Trainer tab, but after training nothing
>>> happened...
>>>
>>> Can somebody help me or tell me how to solve my problems?
>>>
>>> Many thanks for considering my request!

