Re: Compiling under Linux

Thomas Breuel Wed, 12 Aug 2009 06:53:16 -0700

> On a slightly different topic: I'm trying to create some training
> data, but am unsure how to go about creating the 'cseg' files. Is that
> something ocropus will do for me (and how?), or is it done manually?
> If the latter, how are they created?


It depends on what you're training on.  For an entirely new script,
you need to create them by hand, but that's rare.  You can also
artificially generate training data from fonts and generate the cseg
information automatically along with that.

The usual thing is that you have some existing OCR results, probably
with character bounding boxes.  There are some OCRopus functions to
convert bounding boxes into cseg files.  Those can then be used for
training.
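
To make the bounding-box route concrete, here is a minimal sketch in
plain Python. It is not the OCRopus conversion code: the function names
are hypothetical, and the color encoding (white background, segment
number packed into 24-bit RGB) is an assumption about the cseg format
that should be checked against the OCRopus sources before relying on it.

```python
def boxes_to_labels(ink, boxes):
    """Label each ink pixel with the 1-based index of the box containing it.

    ink   : list of rows, each a list of 0/1 values (1 = ink pixel)
    boxes : list of (x0, y0, x1, y1) tuples, half-open, in reading order
    Returns a label image of the same size; 0 means background.
    """
    height = len(ink)
    width = len(ink[0]) if height else 0
    labels = [[0] * width for _ in range(height)]
    for index, (x0, y0, x1, y1) in enumerate(boxes, start=1):
        # Clip the box to the image so bad coordinates cannot index out of range.
        for y in range(max(y0, 0), min(y1, height)):
            for x in range(max(x0, 0), min(x1, width)):
                # Only ink pixels get a label; the first box wins on overlap.
                if ink[y][x] and labels[y][x] == 0:
                    labels[y][x] = index
    return labels


def label_to_rgb(label):
    """Assumed encoding: white background, label packed into 24-bit RGB."""
    if label == 0:
        return (255, 255, 255)
    return ((label >> 16) & 0xFF, (label >> 8) & 0xFF, label & 0xFF)


# Tiny text line with two "characters" separated by a blank column.
ink = [
    [1, 1, 0, 1, 1],
    [1, 0, 0, 0, 1],
]
boxes = [(0, 0, 2, 2), (3, 0, 5, 2)]
labels = boxes_to_labels(ink, boxes)
pixels = [[label_to_rgb(v) for v in row] for row in labels]
```

The resulting pixel grid would then be written out as a PNG next to the
line image and its transcription. Ink that falls outside every box stays
unlabeled here; a real converter would have to decide how to treat it.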

Once you have a trained OCRopus model, the usual way of creating cseg
files is with "ocropus align", which aligns text lines with their
corresponding transcriptions.

Cseg files are also generated during regular recognition and those can
be used for training as well (this is the usual way for book-adaptive
training).

Have a look at the "Training" wiki page.

Tom

