Hello,

I hope this is the right place to provide some feedback about
Tesseract; if not, please redirect me.

I've tested the latest source code with two documents. The first one
was a letter in a seemingly odd font. Both the 2.x and the 3.x version
produced a lot of recognition errors, and the 3.x version produced more
of them. Unfortunately, I cannot share this document.
For my second test, I took a book by Jules Verne. The 2.x version
produced quite a lot of errors, while the latest source provides
pretty usable output.
My question is whether it would help you if I uploaded the image
somewhere and reported what exactly went wrong. The problem is that with
this font every "ü" was recognized as "ii", so it is a reproducible bug.
The image is at:
  http://humenda.users.sourceforge.net/scan.tif
===Text===
1 gesättigt hatten, vor Sonnenaufgang wieder verschwinden wiirden. Dann
2 neuen Angriff zu schiitzen.
3 zu zeigen, begniigte sich Godfrey damit, ihm die Waffe aus der Hand zu
===
The correct versions would be: würden (1); schützen (2); begnügte (3).

There are other mistakes as well, but this one is quite regular, and it
seems to me that fixing it might not be that difficult.
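
To illustrate how regular the "ü" to "ii" pattern is, here is a minimal
sketch in Python that flags such words in the OCR output. It is only a
post-processing illustration, not a fix inside Tesseract. The function
name suggest_umlaut_fixes and the tiny word list are my own assumptions;
a real check would need a full German dictionary.

```python
import re

# Hypothetical mini word list, for illustration only.
KNOWN_WORDS = {"würden", "schützen", "begnügte"}

def suggest_umlaut_fixes(text, known_words=KNOWN_WORDS):
    """Return (ocr_word, suggestion) pairs where replacing "ii" with "ü"
    yields a word found in known_words."""
    fixes = []
    for word in re.findall(r"\w+", text):
        if "ii" in word:
            candidate = word.replace("ii", "ü")
            # Only suggest a fix when the result is a known word, so that
            # legitimate "ii" spellings are left alone.
            if candidate in known_words:
                fixes.append((word, candidate))
    return fixes

sample = (
    "gesättigt hatten, vor Sonnenaufgang wieder verschwinden wiirden. Dann\n"
    "neuen Angriff zu schiitzen.\n"
    "zu zeigen, begniigte sich Godfrey damit"
)

for ocr_word, suggestion in suggest_umlaut_fixes(sample):
    print(ocr_word, "->", suggestion)
```

The dictionary lookup is there so that words which really contain "ii"
are not changed by accident.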

Shall I provide more information?

Thanks
Sebastian
-- 
Test the free Latin-German dictionary | Teste das freie 
Latein-Deutsch-Wörterbuch!
Online: http://freedict.org/dict?Form=dict3&Database=lat-deu
More languages | mehr Sprachen: http://www.freedict.org
