After that I run

    tesseract med.draft.tif med.arial nobatch box.train

for every tif/box pair. This creates a .tr file for each pair. I attached the .tr file for my specific font. Then I run

    unicharset_extractor med.arial.box med.draft.box …

with each box file as an argument. This creates the unicharset file I attached. After that I run

    mftraining -F font_properties -U unicharset -O med.unicharset med.arial.tr med.draft.tr …

I attached the font_properties file and the mftraining output. After that I run

    cntraining med.arial.tr med.draft.tr …

I also attached the cntraining output files. Since I currently don't need any dictionary data, I don't create any. Then I run

    combine_tessdata med.

which generates the med.traineddata file. Unfortunately, the file is only 2 KB in size, and every attempt to recognize text with it fails completely.

Could somebody please point out where in my process I'm making a mistake? Any help would be greatly appreciated!

Thanks,
Erik

Attachments: http://dl.dropbox.com/u/686228/training_tesseract_for_draft.zip
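One thing that may be worth checking before the combine_tessdata step is whether every training output actually carries the "med." prefix, since combine_tessdata is given only that prefix to work from. The sketch below is a hypothetical helper, not part of Tesseract: the function name check_components is made up, and the list of component names (unicharset, inttemp, pffmtable, normproto) is an assumption based on the default output names of mftraining and cntraining, which may differ between Tesseract versions.

```shell
#!/bin/sh
# Sketch only: report which training outputs are missing (or empty) for a
# given language prefix. check_components is a hypothetical helper, and the
# component list is an assumption, not an authoritative Tesseract list.
check_components() {
    lang=$1          # language prefix, e.g. "med"
    dir=${2:-.}      # directory holding the training outputs
    missing=0
    for part in unicharset inttemp pffmtable normproto; do
        # -s is true only if the file exists and is not empty
        if [ ! -s "$dir/$lang.$part" ]; then
            echo "missing: $lang.$part"
            missing=1
        fi
    done
    return $missing
}
```

Calling `check_components med .` in the training directory prints one line per absent file and returns a non-zero status if anything is missing. If files such as inttemp, pffmtable, or normproto exist only without the prefix, that could be one explanation for a very small .traineddata file.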

