> I've had a chance to play around with the system and look through the
> documentation; it looks like a very nice system so far. At this point,
> what I'd really like to know is: is there any way to *output* a
> *lattice* of recognition hypotheses? Preferably with OCR match scores?
OCRopus always outputs lattices of recognition hypotheses; these are
then combined with language models to compute the final output.
If you run the ocropus-{binarize,pseg,linerec,langmod} sequence, the
recognition lattices for each text line are left in files like
dir/0001/010001.fst.
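As a rough sketch, the per-line lattice files can be collected with a few lines of Python. This assumes the layout suggested by the example path above (one subdirectory per page, one .fst file per line); the helper name find_line_lattices is made up for illustration and is not part of OCRopus. It only locates the files and does not parse them.

```python
from pathlib import Path


def find_line_lattices(book_dir):
    """Return the sorted paths of all .fst files one level below book_dir.

    Assumes a layout like book_dir/0001/010001.fst (page dir / line file).
    """
    return sorted(Path(book_dir).glob("*/*.fst"))


if __name__ == "__main__":
    # "dir" is the placeholder output directory from the example path.
    for path in find_line_lattices("dir"):
        print(path)
```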
If you use the ocropus-pages command, look at the Python source code
to see where the recognition lattice is computed. At the code level,
all line recognizers (IRecognizeLine) transform images of text lines
into recognition lattices.
> I'd be interested in either word, character, or segment lattices.
If you want a lattice of a word, you can either give IRecognizeLine an
image of a word, or take the lattice output by IRecognizeLine, segment
it into words using a transducer, and then pick out the individual
word lattices.
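To illustrate the second option, here is a toy sketch in plain Python. It does not use the OCRopus API or a real transducer. It assumes a simplified, strictly linear lattice: a list of positions, each holding alternative (label, cost) pairs, with lower cost meaning a better match. The line is cut wherever the cheapest label is a space. A real recognition lattice is a general FST with branching paths, so treat this only as a picture of the idea.

```python
def split_into_words(line_lattice, space=" "):
    """Split a linear lattice into word lattices at best-scoring spaces.

    line_lattice: list of positions; each position is a list of
    (label, cost) alternatives, lower cost being better (an assumption).
    """
    words, current = [], []
    for alternatives in line_lattice:
        best_label, _ = min(alternatives, key=lambda alt: alt[1])
        if best_label == space:
            # A space closes the current word, if there is one.
            if current:
                words.append(current)
                current = []
        else:
            # Keep all alternatives so each word is still a lattice.
            current.append(alternatives)
    if current:
        words.append(current)
    return words


# Hypothetical data for a line reading "to be", with competing characters.
line = [
    [("t", 0.1), ("f", 2.0)],
    [("o", 0.2), ("0", 0.9)],
    [(" ", 0.3)],
    [("b", 0.2), ("h", 1.5)],
    [("e", 0.1), ("c", 1.1)],
]
words = split_into_words(line)
```

Each entry of words keeps every alternative and its cost, so the match scores survive the split.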
Cheers,
Thomas