Re: Language Files

georg Tue, 17 Sep 2013 18:06:26 -0700

Hi Tom,

Thanks for your reply!


Yes, we only read digits.

We use OpenCV for filtering, but not for the actual recognition. I was not 
aware of OpenCV's recognition abilities.

What do you mean by the upstream pipeline? Enlarging the images and sorting 
out the whole digits? Unfortunately, we are not able to do that.

Thanks

Georg

On Monday, 26 August 2013 13:43:43 UTC+2, georg wrote:
>
> Hello,
>
> I have a question regarding language files.
>
> We have a set of characters, some of which are sometimes cut off.
>
> It is my understanding that I cannot train very different-looking 
> characters in one set, because it confuses Tesseract.
>
> I would like to generate 2 TIFFs (one for complete characters and one for 
> cut-off ones) and then run mftraining.
>
> Is it true that mftraining assembles both TIFFs into one language file and 
> runs Tesseract twice, first with the TIFF for the whole characters and then 
> with the one for the cut-off characters?
>
> Does Tesseract keep the TIFFs separate even though they are in the same 
> language file?
>
> How would you approach this problem? I want to keep the training 
> process as simple as possible (it is already complicated enough).
>
> Thanks for your help!
>
> Take care,
>
> Georg
>
>
>

-- 
-- 
You received this message because you are subscribed to the Google
Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/tesseract-ocr?hl=en

