You say that both letters look the same (same height too?) and that it is not possible to handle this in postprocessing because both spellings can occur. How is Tesseract, or a human, supposed to tell them apart?
Can you please share a sample? Maybe using a smaller or bigger image is enough. Or maybe the image is very noisy or colored, or something else is making things more difficult.

As for the longer text, you could try repeating the same image twice, stacked vertically or placed side by side, and see if it helps.

Lorenzo

On Thu, 19 Sep 2019 at 09:51, 'Sandra M.' via tesseract-ocr <[email protected]> wrote:

> thanks for your responses
> @Timothy Snyder: I think I cannot do this in postprocessing, as it is
> possible that both spellings occur, but I have to differentiate them. Or
> what did you do exactly?
> @zdenop: Unfortunately it is not possible for me to send a longer text.
>
> anyone else any ideas?
>
> On Wednesday, 18 September 2019 at 17:19:22 UTC+2, Sandra M. wrote:
>>
>> I'm using Tesseract with Python. I have an image with 1-6 words in it and
>> need to read the text. Sometimes the character "C", which looks the same in
>> upper and lower case, is detected as lower case c instead of upper case C.
>> I see the problem, but in context to the following letters it should be
>> possible to detect the right notation. Is there any configuration or
>> something to improve this?
>>
>> I had a look at the configuration options of config='-psm x' with
>> different values for x, but nothing fits to my problem
>>

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/CAMgOLLwn6gbkqQwdZrbNeVgQLqeLBiiYN1wbOg%2B2f0_i3Upgnw%40mail.gmail.com.
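
A minimal sketch of the two experiments suggested above (resizing the image, and repeating it vertically or horizontally), assuming Pillow and pytesseract are installed. The helper names `scaled`, `repeated` and `ocr_variants`, the scale factors, the 10-pixel gap and the choice of `--psm 6` are illustrative assumptions, not something taken from this thread. Note that Tesseract 4 and later expects `--psm` rather than the older `-psm` spelling.

```python
from PIL import Image


def scaled(img, factor):
    """Return a copy of img resized by the given factor (e.g. 0.5 or 2.0)."""
    w, h = img.size
    new_size = (max(1, round(w * factor)), max(1, round(h * factor)))
    return img.resize(new_size, Image.LANCZOS)


def repeated(img, direction="vertical", gap=10):
    """Return a white canvas holding two copies of img, stacked or side by side."""
    w, h = img.size
    if direction == "vertical":
        canvas = Image.new("RGB", (w, 2 * h + gap), "white")
        second = (0, h + gap)
    else:
        canvas = Image.new("RGB", (2 * w + gap, h), "white")
        second = (w + gap, 0)
    canvas.paste(img, (0, 0))
    canvas.paste(img, second)
    return canvas


def ocr_variants(path, psm=6):
    """Run Tesseract on several variants of one image and return the raw text of each."""
    import pytesseract  # imported here so the image helpers work without it

    img = Image.open(path).convert("RGB")
    variants = {
        "original": img,
        "half_size": scaled(img, 0.5),
        "double_size": scaled(img, 2.0),
        "twice_vertical": repeated(img, "vertical"),
        "twice_horizontal": repeated(img, "horizontal"),
    }
    config = f"--psm {psm}"
    return {name: pytesseract.image_to_string(v, config=config)
            for name, v in variants.items()}
```

Calling `ocr_variants("sample.png")` (a hypothetical file name) returns one string per variant, so the results can be compared to see whether any of them reads "C" with the expected case. For the repeated variants the recognized text should appear twice.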

