Re: Ocropus dictionary and tesseract

Thomas Breuel Tue, 21 Oct 2008 09:02:37 -0700

OpenFST is not used yet for post-processing Tesseract output; it's only used
in conjunction with the neural network recognizer.


The reason is that Tesseract doesn't output probabilities, so something
special needs to be done in order to use language models with it.

We're trying to get something along those lines into the next release.
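
To illustrate why the missing probabilities matter, here is a minimal
Python sketch. It is not OCRopus or Tesseract code: the candidate table,
the word list, and the function names are all made up for illustration.
It assumes a recognizer that reports alternative characters with
probabilities at each position, which lets a dictionary overrule a
slightly preferred "c" in favour of "e":

```python
import math

# Hypothetical recognizer output: for each character position, the
# candidate characters and their probabilities (illustrative values only).
lattice = [
    {"t": 0.95, "f": 0.05},
    {"h": 0.90, "b": 0.10},
    {"c": 0.55, "e": 0.45},  # "c" is slightly preferred over "e"
]

# Toy word list standing in for a real dictionary / language model.
dictionary = {"the", "toe", "she"}


def score(word, lattice, floor=1e-6):
    """Log-probability of `word` under the per-position candidates.

    Characters the recognizer did not propose get a small floor
    probability instead of zero.
    """
    if len(word) != len(lattice):
        return -math.inf
    return sum(math.log(pos.get(ch, floor)) for ch, pos in zip(word, lattice))


def best_dictionary_word(lattice, dictionary):
    """Pick the dictionary word that best agrees with the recognizer."""
    return max(dictionary, key=lambda w: score(w, lattice))


# Taking only the single best character per position gives the raw reading.
top1 = "".join(max(pos, key=pos.get) for pos in lattice)

print(top1)
print(best_dictionary_word(lattice, dictionary))
```

With only the top-1 string and no alternatives or scores, there is
nothing comparable to `lattice` to rank dictionary words against, which
is the gap described above.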

Tom

On Mon, Oct 20, 2008 at 10:13, Dieselkopf <[EMAIL PROTECTED]> wrote:

>
> Hi there.
>
> I'm new to Ocropus and currently playing around with it. I thought it
> was using a dictionary and OpenFST as a language model. However,
> Ocropus sometimes reads a "c" for an "e". It recognises "thc" instead
> of "the" and "largc" instead of "large". Shouldn't the dictionary take
> care of such ambiguous characters? Also, if I run tesseract on the
> same text the character is being recognised correctly. Can someone
> explain to me what's going on here? Thanks.
>
> Chris
