Thanks Dmitri! Stupid mistake by me. You made my day!
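
For anyone who hits the same problem: the fix Dmitri describes below comes down to renaming the training outputs before combining them. Here is a minimal sketch, not a definitive recipe. The function name prefix_training_files is made up for illustration, and the list of file names (inttemp, pffmtable, normproto) is an assumption about what mftraining and cntraining leave unprefixed in the working directory, so adjust it to whatever your run actually produced.

```shell
# Sketch: add the language prefix to unprefixed training outputs so that
# combine_tessdata can find them. Run it in the training directory.
# Usage: prefix_training_files med
prefix_training_files() {
    lang=$1
    # Assumed output names; change this list to match your own run.
    for f in inttemp pffmtable normproto; do
        # Rename only files that exist and do not already have a prefixed copy.
        if [ -f "$f" ] && [ ! -e "$lang.$f" ]; then
            mv "$f" "$lang.$f"
        fi
    done
}
```

After calling prefix_training_files med, running "combine_tessdata med." again should pick up the renamed files along with med.unicharset.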

On Jun 16, 5:13 pm, Dmitri Silaev <[email protected]> wrote:
> When a command like
>
> combine_tessdata lang.
>
> is issued, the "combine_tessdata" utility simply searches for files
> whose names start with "lang." and concatenates them into a
> single ".traineddata" file. Hence the small size in your case.
> Therefore you need to prefix the names of all your intermediate files
> with "med." and then run "combine_tessdata" again. The resulting file
> should then be a reasonable size. Mine was about 901K.
>
> Warm regards,
> Dmitri Silaev
> www.CustomOCR.com
>
> On Thu, Jun 16, 2011 at 10:27 AM, Erik Reisig <[email protected]> wrote:
> > After that I run
>
> > tesseract med.draft.tif med.arial nobatch box.train
>
> > For every tif/box pair. This creates a .tr for each pair.
>
> > I attached the .tr file for my specific font.
>
> > Then I run
>
> > unicharset_extractor med.arial.box med.draft.box …..
>
> > With each box file as an argument.
>
> > This creates the unicharset file I attached.
>
> > After that I run
>
> > mftraining -F font_properties -U unicharset -O med.unicharset
> > med.arial.tr med.draft.tr …
>
> > I attached the font_properties file and the mftraining output.
>
> > After that I run
>
> > cntraining med.arial.tr med.draft.tr ….
>
> > Also attached the cntraining output files.
>
> > Since I currently don’t need any dictionary data, I don’t create any.
>
> > Then I run
>
> > combine_tessdata med.
>
> > which generates the
>
> > med.traineddata file.
>
> > Unfortunately the file is only 2 KB in size, and every attempt to
> > recognize text using it fails completely.
>
> > Could somebody please point out at which point in my process I'm making
> > a mistake?
>
> > Any help would be greatly appreciated!
>
> > Thanks,
>
> > Erik
>
> > Attachments:
>
> > http://dl.dropbox.com/u/686228/training_tesseract_for_draft.zip
>
> > --
> > You received this message because you are subscribed to the Google
> > Groups "tesseract-ocr" group.
> > To post to this group, send email to [email protected]
> > To unsubscribe from this group, send email to
> > [email protected]
> > For more options, visit this group at
> > http://groups.google.com/group/tesseract-ocr?hl=en
