Thanks Dmitri! Stupid mistake by me. You made my day!

On Jun 16, 5:13 pm, Dmitri Silaev <[email protected]> wrote:
> When a command like
>
> combine_tessdata lang.
>
> is issued, the "combine_tessdata" utility simply searches for files
> whose names start with "lang." and concatenates them into a
> single ".traineddata" file. Hence the small size in your case.
> Therefore you need to prefix the names of all your intermediate files
> with "med." and then run "combine_tessdata" again. The resulting file
> should then be a reasonable size. Mine was about 901K.
>
> Warm regards,
> Dmitri Silaev
> www.CustomOCR.com
>
> On Thu, Jun 16, 2011 at 10:27 AM, Erik Reisig <[email protected]> wrote:
> > After that I run
> >
> > tesseract med.draft.tif med.arial nobatch box.train
> >
> > for every tif/box pair. This creates a .tr file for each pair.
> > I attached the .tr file for my specific font.
> >
> > Then I run
> >
> > unicharset_extractor med.arial.box med.draft.box …..
> >
> > with each box file as an argument.
> > This creates the unicharset file I attached.
> >
> > After that I run
> >
> > mftraining -F font_properties -U unicharset -O med.unicharset
> > med.arial.tr med.draft.tr …
> >
> > I attached the font_properties file and the mftraining output.
> >
> > After that I run
> >
> > cntraining med.arial.tr med.draft.tr ….
> >
> > I also attached the cntraining output files.
> >
> > Since I currently don't need any dictionary data, I don't create any.
> >
> > Then I run
> >
> > combine_tessdata med.
> >
> > which generates the med.traineddata file.
> >
> > Unfortunately the file is only 2 KB, and every attempt to
> > recognize text using it fails completely.
> >
> > Could somebody please point out where in my process I'm making
> > a mistake?
> >
> > Any help would be greatly appreciated!
> >
> > Thanks,
> > Erik
> >
> > Attachments:
> > http://dl.dropbox.com/u/686228/training_tesseract_for_draft.zip
> >
> > --
> > You received this message because you are subscribed to the Google
> > Groups "tesseract-ocr" group.
> > To post to this group, send email to [email protected]
> > To unsubscribe from this group, send email to
> > [email protected]
> > For more options, visit this group at
> > http://groups.google.com/group/tesseract-ocr?hl=en
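
For anyone who finds this thread later: the renaming step Dmitri describes can be scripted. Below is a minimal Python sketch, not part of Tesseract itself. The function name prefix_training_files is made up for illustration, and the default list of file names (inttemp, pffmtable, normproto, Microfeat) is only an assumption about what mftraining and cntraining leave behind; check your own working directory and pass the names you actually see.

```python
from pathlib import Path

# ASSUMPTION: typical unprefixed outputs of mftraining/cntraining.
# Adjust this list to whatever files your training run actually produced.
DEFAULT_NAMES = ("inttemp", "pffmtable", "normproto", "Microfeat")


def prefix_training_files(directory, lang, names=DEFAULT_NAMES):
    """Rename each existing file `name` in `directory` to `<lang>.<name>`.

    Files that are missing, or whose prefixed target already exists,
    are skipped. Returns the list of new paths.
    """
    directory = Path(directory)
    renamed = []
    for name in names:
        src = directory / name
        dst = directory / f"{lang}.{name}"
        if src.is_file() and not dst.exists():
            src.rename(dst)
            renamed.append(dst)
    return renamed
```

Calling prefix_training_files(".", "med") in the training directory and then running "combine_tessdata med." again should let the utility pick up the prefixed files, since it only looks at names starting with "med.".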

