I have a document that contains German words with umlauted characters such as ü (u with umlaut). If I OCR this document using the "deu" dictionary, it successfully OCRs those words; if I OCR it using the "eng" dictionary, it gets them wrong, as expected (Gefühl -> Gefiihl, Dörfer -> Derfer, schützen -> schiitzen).
So, as a test of the "user-words" facility, I created an eng.user-words file (attached) that contains a few German words. When I run the OCR, it still gets those words wrong. Is this proof that I'm creating the user-words file wrong?

Thanks,
Chris
eng.user-words
Description: Binary data
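
For reference, here is a minimal sketch of how such a file could be created and sanity-checked. It assumes the user-words file is meant to be plain UTF-8 text with one word per line, which is worth checking given that the attachment is listed as binary data. The helper names `write_user_words` and `read_user_words` are hypothetical and not part of Tesseract.

```python
import os
import tempfile


def write_user_words(path, words):
    """Write words to path as UTF-8 text, one word per line (assumed format)."""
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        for word in words:
            f.write(word.strip() + "\n")


def read_user_words(path):
    """Read the list back; raises UnicodeDecodeError if the file is not valid UTF-8."""
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f if line.strip()]


if __name__ == "__main__":
    # Write to a temporary directory so nothing in tessdata is touched.
    target = os.path.join(tempfile.mkdtemp(), "eng.user-words")
    write_user_words(target, ["Gefühl", "Dörfer", "schützen"])
    print(read_user_words(target))
```

This only checks the file format. Whether the "eng" data can output ü or ö at all is a separate question that a word list alone may not settle.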

