I fixed the problem with the strange characters in deu.unicharset. It
was caused by a misconfigured locale setting. But the problem that no
umlauts are recognized is still not fixed.
On another system running Ubuntu 8.04 I don't have this problem. Both
the svn version and the packaged version recognize umlauts. Is
this a bug in the 2.03 version?
As the system with the umlaut problem is a production system, I would
like to be reasonably sure that a new installation or build will fix
the problem before I shut down the server.
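
For anyone who wants to compare the two systems, here is a minimal
Python sketch for checking whether a unicharset file decodes cleanly
and which entries are non-ASCII. It assumes the file is meant to be
UTF-8 encoded; the helper names non_ascii_entries and report are
hypothetical and not part of tesseract.

```python
import locale


def non_ascii_entries(data: bytes):
    """Return (line_number, text) for each line with non-ASCII characters.

    Assumption: the data is supposed to be UTF-8. A UnicodeDecodeError
    is raised if it is not, which hints at an encoding problem.
    """
    text = data.decode("utf-8")
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if any(ord(ch) > 127 for ch in line)
    ]


def report(path: str) -> None:
    """Print the locale's preferred encoding and the non-ASCII entries."""
    print("preferred encoding:", locale.getpreferredencoding(False))
    with open(path, "rb") as handle:
        raw = handle.read()
    try:
        for number, line in non_ascii_entries(raw):
            print(number, line)
    except UnicodeDecodeError as err:
        print("not valid UTF-8:", err)
```

Calling report("deu.unicharset") on both machines and comparing the
output should show whether the files or the locale settings differ.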

Thanks for your help.

On Mar 6, 1:44 pm, tobltobs <[email protected]> wrote:
> Hello
>
> I am running tesseract 2.03 on Debian 4 (compiled, not the packaged
> version) from the command line.
> It is a great piece of software and runs quite well.
> But I have problems with European characters like the German umlauts.
>
> The only non-ASCII character which is recognized is the é. All other
> special characters are not recognized. It doesn't make any difference
> which language I specify. The results with the eurotext test image are
> always the same.
>
> If I open deu.unicharset with nano I have a few lines which look
> strange, like
> ^ ^ 0> 0
>
> ö 3
> ¢ 0
> $ 0
> é 3
> ^ ^ 0
>
> The result of running tesseract on the eurotext image is:
>
> The (quick) [brown] {fox} jumps!
> Over the $43,456.78 <lazy> #90 dog
> & duck/goose, as 12.5% of E-mail
> from [email protected] is spam.
> Der ,.schnelle” braune Fuchs springt
> iiber den faulen Hund. Le renard brun
> <<rapide» saute par·dessus le chien
> paresseux. La volpe marrone rapida
> salta sopra il cane pigro. El zorro
> marrén répido salta sobre el perro
>
> Does anybody have an idea where the problem is?