Hi, I'll give this another shot: when I move from tesseract 3.01 to tesseract 3.02, should I retrain my fonts with the 3.02 training tools, or does this not matter? Best regards, Marcus
On Thursday, September 20, 2012 4:31:50 PM UTC+2, Speedy wrote:
> Hi there,
>
> we are currently using tesseract 3.01 as OCR engine and have trained a number of fonts with it. Things work quite well, but we would like to move to version 3.02 for two reasons:
>
> - It is possible to combine fonts
> - The character recognition is supposed to be significantly improved
>
> In our tests we found that the character recognition has changed, but the results are mixed. In particular, quite a few characters that previously had few confusions now have none (which is good), but then there are characters that are much worse, making the overall result worse. For example, in one dataset the number of confusions from H to M has increased from 7 to 52 and the number of confusions from O to D has increased from 15 to 37.
>
> Is there a difference in the font files between tesseract 3.01 and 3.02? Does it matter to tesseract 3.02 whether a font was trained with 3.01 training? Would it help to retrain the fonts with tesseract 3.02 training tools or should this not matter?
>
> In what way was character recognition improved in tesseract 3.02?
>
> Thanks in advance for any help you can provide!
>
> Best regards,
> Marcus

