Hello,

I hope this is the right place to provide feedback about Tesseract; if not, please redirect me.

I've tested the latest source code with two documents. The first was a letter set in a seemingly odd font. It produced a lot of recognition errors in both the 2.x and the 3.x version, and the 3.x version turned out to produce more of them. Unfortunately, I cannot share this document.

For my second test, I took a book by Jules Verne. With 2.x there were quite a lot of errors, whereas the latest source produces pretty usable output. Would it help you if I uploaded the image somewhere and reported exactly what went wrong? The problem is that with this font every "ü" was recognized as "ii", so the bug is reproducible. The image is at:
http://humenda.users.sourceforge.net/scan.tif

===Text===
1 gesättigt hatten, vor Sonnenaufgang wieder verschwinden wiirden. Dann
2 neuen Angriff zu schiitzen.
3 zu zeigen, begniigte sich Godfrey damit, ihm die Waffe aus der Hand zu
===

The correct versions would be: würden (1); schützen (2); begnügte (3).

There are other mistakes as well, but this one is quite regular, and it seems to me that fixing it might not be that difficult. Shall I provide more information?

Thanks
Sebastian

