Re: Tess v3 not recognising accented Esperanto characters.

Sven Pedersen Mon, 10 Sep 2012 05:57:16 -0700

Hi Donaldo,
Yes, unless you can find the original training data, you will need to start
from scratch. But you don't need to print and scan anything: there are
training utilities listed on the site that help generate the images and box
files.
Sven


On Sunday, September 9, 2012, Donaldo wrote:

> Thank you, Sven.
>>
>>
>>  I used combine_tessdata -e to extract the component files out of
>> epo.traineddata. I found that epo.unicharset has none of the Esperanto
>> accented characters, although it has many other accented letters.
>> epo.unicharambigs has ĉ ŭ Ŝ ĵ Ĵ, but because they are not in
>> epo.unicharset, those characters won't be used. I don't have the box files
>> etc, and I don't even know what fonts were used. Does this mean that I
>> should start from the beginning? I have a text file that I used for version
>> 2 of Tesseract a couple of years ago, so I could print it in several fonts
>> and scan them, etc.
>>
>>
>>  Donaldo
>>
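
The check described above (looking for the accented letters in the extracted
epo.unicharset) can be sketched in a few lines of Python. This is only an
illustration, not part of Tesseract: the function names are made up, and it
assumes the usual unicharset layout, where the first line is the entry count
and every later line begins with the character followed by its properties.

```python
# Esperanto letters with diacritics, lower and upper case.
ESPERANTO_ACCENTED = "ĉĝĥĵŝŭĈĜĤĴŜŬ"


def unichars_in(unicharset_text):
    """Return the set of characters listed in unicharset-style text.

    Assumption: line 1 is the entry count; each following line starts
    with the character, then whitespace-separated properties.
    """
    lines = unicharset_text.splitlines()
    return {line.split()[0] for line in lines[1:] if line.strip()}


def missing_chars(unicharset_text, required=ESPERANTO_ACCENTED):
    """List the required characters that the unicharset does not contain."""
    present = unichars_in(unicharset_text)
    return [ch for ch in required if ch not in present]


def missing_in_file(path, required=ESPERANTO_ACCENTED):
    """Same check, reading the extracted unicharset file as UTF-8."""
    with open(path, encoding="utf-8") as f:
        return missing_chars(f.read(), required)
```

Calling missing_in_file("epo.unicharset") on the extracted file would return
the accented letters that are absent. An empty list would mean they are all
present and the problem lies elsewhere.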


-- 
“All that is gold does not glitter,
  not all those who wander are lost;
the old that is strong does not wither,
  deep roots are not reached by the frost.
From the ashes a fire shall be woken,
  a light from the shadows shall spring;
renewed shall be blade that was broken,
  the crownless again shall be king.”

-- 
You received this message because you are subscribed to the Google
Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/tesseract-ocr?hl=en
