Re: Limited output

Thomas Breuel Fri, 29 May 2009 05:09:53 -0700

> I tried to build a FST for my recognition.
> I have still a lot of doubts how OCRopus use the FST.
>
> 1) If I have to recognize A,  should the input label be just A?


Yes, both the input and output labels should be Unicode codepoints,
and usually, they should be the same codepoint or (OpenFST) epsilon.
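
As a rough illustration of that labeling convention, the sketch below
writes a one-word acceptor in OpenFST's text format (source state,
destination state, input label, output label, weight), using the Unicode
codepoint for both labels. The helper name word_to_fst_lines is made up
for this example, and it assumes the text is compiled without symbol
tables, so labels are read as integers with 0 reserved for epsilon.

```python
def word_to_fst_lines(word, weight=0.0):
    """Return OpenFST text-format lines for a linear acceptor of `word`.

    Each arc uses the character's Unicode codepoint as both the input
    and the output label. Hypothetical helper, for illustration only.
    """
    lines = []
    for state, ch in enumerate(word):
        cp = ord(ch)  # e.g. "A" -> 65
        lines.append(f"{state} {state + 1} {cp} {cp} {weight}")
    # A line with only a state (and optional weight) marks a final state.
    lines.append(f"{len(word)} 0.0")
    return lines


if __name__ == "__main__":
    print("\n".join(word_to_fst_lines("A")))
```

For the single letter A this gives one arc from state 0 to state 1
labeled 65:65, followed by a line marking state 1 as final.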

> 2) the weight is defined in base a which principle?

The weight should be a negative log probability.  If you just want to
represent existence (the arc is allowed, with no preference), set the
weight to 0, which corresponds to probability 1.
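
To make the weight convention concrete, here is a small sketch that
converts probabilities to negative log weights and adds them along a
path. The function names are invented for the example, and the natural
logarithm is an assumption; the message above does not say which base is
expected.

```python
import math


def prob_to_weight(p):
    """Convert a probability in (0, 1] to a negative log weight."""
    if not 0.0 < p <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log(p)


def path_weight(probs):
    """Weights add along a path, which multiplies the probabilities."""
    return sum(prob_to_weight(p) for p in probs)


if __name__ == "__main__":
    print(prob_to_weight(1.0))       # an arc that merely "exists"
    print(path_weight([0.5, 0.5]))   # same as prob_to_weight(0.25)
```

Under this convention a smaller weight means a more likely path, and a
weight of 0 costs nothing.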

> Anyway I notice that all the final weights, in the default.fst file, are
> really big like 1.99999994e+38.

This means that there was no path.  (Ilya has been meaning to fix the
error message.)

> The results of other tests were: A empty string or a strange character
> (Little square with 4 litter numbers inside 2x2)

You probably didn't set the output symbol to anything meaningful;
OCRopus only ever produces output symbols from your language model.
If you get a weird Unicode character in the output (little square with
four numbers), that means that you put such a Unicode codepoint on
your output.
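
One way to catch this before running recognition is to scan the output
labels of your model for codepoints that are unlikely to be intended.
The sketch below is only a heuristic under stated assumptions: labels
are plain integers, 0 is epsilon, and control, unassigned, private-use,
and surrogate codepoints are treated as suspicious. The name
suspicious_labels is hypothetical.

```python
import unicodedata

# General categories that usually render as a "missing glyph" box.
SUSPECT_CATEGORIES = {"Cc", "Cn", "Co", "Cs"}


def suspicious_labels(labels):
    """Return the labels that are probably not meaningful characters."""
    bad = []
    for cp in labels:
        if cp == 0:  # epsilon, not a character
            continue
        if not 0 < cp <= 0x10FFFF:
            bad.append(cp)  # outside the Unicode range entirely
        elif unicodedata.category(chr(cp)) in SUSPECT_CATEGORIES:
            bad.append(cp)
    return bad


if __name__ == "__main__":
    print(suspicious_labels([0, 65, 1, 0xE000]))
```

Labels such as 1, 2, 3 (small state-like integers used by mistake as
output symbols) fall into the control category and would be reported.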

A good way of figuring out what's going on is to visualize your
language model with GraphViz.  I think OpenFST includes tools for
this.  You can visualize the recognition process itself with
"ocropus recognize1 line.png" (you must set some environment
variables for this to work right; see the doc you get when you type
"ocropus recognize1").  The source code for recognize1 also contains
visualization code that you can reuse.
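
If you already have the arcs of a small model as a list, a GraphViz
description can also be produced by hand. This is a minimal sketch, not
the visualization code from recognize1: the function names and the
(source, destination, input label, output label, weight) tuple layout
are assumptions made for the example.

```python
def label_text(label):
    """Render an integer label as a character, or <eps> for label 0."""
    if label == 0:
        return "<eps>"
    # Escape characters that are special inside a quoted DOT string.
    return chr(label).replace("\\", "\\\\").replace('"', '\\"')


def arcs_to_dot(arcs, finals):
    """Build a GraphViz DOT string from arcs and a set of final states."""
    lines = ["digraph fst {", "  rankdir=LR;"]
    for state in sorted(finals):
        lines.append(f"  {state} [shape=doublecircle];")
    for src, dst, ilabel, olabel, weight in arcs:
        label = f"{label_text(ilabel)}:{label_text(olabel)}/{weight}"
        lines.append(f'  {src} -> {dst} [label="{label}"];')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(arcs_to_dot([(0, 1, 65, 65, 0.0)], {1}))
```

Saving the output as fst.dot and running "dot -Tpng fst.dot -o fst.png"
renders the picture; an arc whose output label shows up as an odd
character is easy to spot this way.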

Tom
