I'm using Tesseract 3.04 (953523b) with the Tess4J API in a Java application. Since upgrading to this version, I'm getting phantom characters with the current German "deu.traineddata".
For example, the word "Marineverband" is recognised in several different ways:
-> Marine_a_verband
-> Marine_e_verband
-> Marine_verband
-> Marine_ayerband

I've attached a screenshot with the source data. I get a lot of these phantom characters and will add more examples in the following days.

