I have the following image:

<https://lh3.googleusercontent.com/-swcbZqlCZFY/WZR3d0zabzI/AAAAAAAALfc/nbhbpyQ60TggekPndQrO30bJmtG7RdSJACLcBGAs/s1600/tag.png>
For version 3.04 I get the correct result: "Declaração de Nascido Vivo".

For 4.0 I get "Declªrªç㺠de Nªscidº Vivº".

What I have tried so far:

   - everything in the "Improving the Quality" wiki article
   <https://github.com/tesseract-ocr/tesseract/wiki/ImproveQuality>
   - messing with `tessedit_char_whitelist` and `tessedit_char_blacklist`
   - custom user word and pattern files
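
To make the blacklist experiment easy to reproduce, here is a minimal sketch that only builds the command lines to compare. It rests on several assumptions that are not stated above: the image is saved locally as `tag.png`, the Portuguese model (`por`) is installed, the 4.0 traineddata includes the legacy engine (needed for `--oem 0`), and the helper name `build_tesseract_cmd` is hypothetical.

```python
import shlex


def build_tesseract_cmd(image, outputbase="stdout", lang="por",
                        oem=None, blacklist=None):
    """Build a tesseract CLI invocation as an argument list.

    Hypothetical helper for illustration; it runs nothing itself.
    """
    cmd = ["tesseract", image, outputbase, "-l", lang]
    if oem is not None:
        # In 4.0, --oem 0 selects the legacy engine and --oem 1 the LSTM engine.
        cmd += ["--oem", str(oem)]
    if blacklist:
        # -c sets a config variable; here the ordinal indicators are excluded.
        cmd += ["-c", f"tessedit_char_blacklist={blacklist}"]
    return cmd


if __name__ == "__main__":
    # Print one command per engine so the two outputs can be compared by hand.
    for oem in (0, 1):
        print(shlex.join(build_tesseract_cmd("tag.png", oem=oem, blacklist="ªº")))
```

Running the two printed commands and comparing their output should show whether the substitution is specific to one engine.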

Nothing made a difference, and I am starting to think this may be a bug.

I would appreciate advice on how to diagnose this further.

Thanks in advance,
--
Paulo
