Hi, I am confused about training image files. I am trying to create more accurate Fraktur German language data files for Tesseract, but first I need help understanding what the software requires (this does not seem to be properly addressed in the notes, at least not to my satisfaction).
Does an image training file have to be paired with a complementary text file set in the font that most closely matches the script in the image? This is where I get stumped. I don't understand what the software requires, and I notice that some people report better results when they do certain things, such as beginning a line with a capital letter, which adds to my frustration at not knowing what the software expects. Any help clearing up misconceptions I may have would be much appreciated. I would then like to write a better help file for training Tesseract.

Cheers,
Richard

On Tuesday, May 1, 2012 9:22:06 PM UTC+10, Rufus wrote:
> I've trained a new language consisting of digits and a few special
> characters in the font swiss921bt. I've been wondering whether my training
> text is good enough, or should I use the standard training text
> (http://michaeljaylissner.com/files/standard-training-text.txt)?
>
> My training text in swiss921bt font is in the attachment.

