You can train on isolated characters using "ocropus trainseg".  It
requires the input images (of the form 0000/0001.png), the corresponding
character segmentation files (of the form 0000/0001.cseg.gt.png), and
the expected output, i.e. the ground truth text (of the form
0000/0001.gt.txt).  If you really have just one character per input,
the 0001.cseg.gt.png is just a binary version of the grayscale image,
and the ground truth file contains only a single character.  More
commonly, you'd have many characters per line.
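
As a rough illustration of the file layout described above, here is a
minimal Python sketch for the one-character-per-image case.  It is not
part of OCRopus: the helper names (training_paths, binarize,
write_ground_truth), the threshold of 128, the pixel polarity, and the
trailing newline in the ground truth file are all assumptions made for
the example, so check them against what trainseg actually expects.

```python
import os

def training_paths(root, book, line):
    """Return (image, cseg, gt_text) paths using the 0000/0001.* naming."""
    stem = os.path.join(root, "%04d" % book, "%04d" % line)
    return stem + ".png", stem + ".cseg.gt.png", stem + ".gt.txt"

def binarize(gray, threshold=128):
    """Map a grayscale image (rows of 0..255 ints) to pure 0/255 pixels.

    The threshold and the polarity are assumptions for illustration only.
    """
    return [[255 if p >= threshold else 0 for p in row] for row in gray]

def write_ground_truth(path, char):
    """Write a ground truth file holding exactly one character."""
    if len(char) != 1:
        raise ValueError("expected a single character, got %r" % (char,))
    directory = os.path.dirname(path)
    if directory:
        os.makedirs(directory, exist_ok=True)
    with open(path, "w", encoding="utf-8") as f:
        f.write(char + "\n")  # trailing newline is an assumption
```

Saving the binarized pixels as a PNG is left out on purpose, since that
needs an image library of your choice.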

Alternatively, if you really want full programmatic control, you can
use any classifier (interface: IModel) and train it.  You can train it
either on the raw bitmap, or you can extract features with the
built-in feature extractor (interface: IFeatureMap), or with your own
feature extractor.
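
To make the "raw bitmap versus extracted features" choice concrete,
here is a small Python sketch.  It does not use the OCRopus interfaces:
NearestNeighbor is a hypothetical stand-in for a classifier (not
IModel), and raw_bitmap and row_column_sums are hypothetical stand-ins
for a feature extractor (not IFeatureMap).  The only point is that the
classifier takes the extractor as a parameter, so you can swap one for
the other.

```python
def raw_bitmap(image):
    """Use the pixels themselves as the feature vector."""
    return [float(p) for row in image for p in row]

def row_column_sums(image):
    """A toy feature extractor: per-row sums followed by per-column sums."""
    rows = [float(sum(row)) for row in image]
    cols = [float(sum(col)) for col in zip(*image)]
    return rows + cols

class NearestNeighbor:
    """Hypothetical 1-nearest-neighbor classifier (squared Euclidean)."""

    def __init__(self, extract):
        self.extract = extract   # feature extractor callable
        self.samples = []        # list of (feature_vector, label)

    def add_sample(self, image, label):
        self.samples.append((self.extract(image), label))

    def classify(self, image):
        if not self.samples:
            raise ValueError("classifier has no training samples")
        vector = self.extract(image)

        def distance(sample):
            return sum((a - b) ** 2 for a, b in zip(sample[0], vector))

        return min(self.samples, key=distance)[1]
```

Usage would look like NearestNeighbor(raw_bitmap) or
NearestNeighbor(row_column_sums), followed by add_sample calls for each
labeled digit image and classify for new ones.  All images are assumed
to have the same size.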

For an example, look at the addTrainingLine and recognizeLine methods
in linerec.cc (although they contain a lot of segmentation-related code).

Tom


On Tue, May 26, 2009 at 20:38, Yaroslav Bulatov <[email protected]> wrote:
>
> I'd like to train ocropus to recognize isolated digits. Version 0.3
> had rec-bpnet-isolated Lua script, any suggestions where to start
> looking for similar functionality in 0.4?
>
