[tesseract-ocr] Re: Funny results with vowels in Portuguese for Tesseract 4.0alpha

2017-11-29 Thread Quan Nguyen
Did you try the latest .traineddata versions -- fast or best?

https://github.com/tesseract-ocr/tesseract/wiki/Data-Files
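
A minimal sketch of how one might compare the two model sets, assuming the
"fast" and "best" por.traineddata files have been downloaded into separate
local directories. The directory names, the image file name, and the helper
function are hypothetical; the options used (`--tessdata-dir`, `-l`, `--oem`)
are standard Tesseract 4 command-line options.

```python
import shlex


def build_tesseract_command(image_path, tessdata_dir, lang="por"):
    """Build a Tesseract 4 command line that loads models from tessdata_dir."""
    return [
        "tesseract",
        image_path,
        "stdout",                        # print the recognized text
        "--tessdata-dir", tessdata_dir,  # directory holding <lang>.traineddata
        "-l", lang,
        "--oem", "1",                    # LSTM engine only
    ]


if __name__ == "__main__":
    # Hypothetical paths: one local copy each of the "fast" and "best" models.
    for model_dir in ("./tessdata_fast", "./tessdata_best"):
        cmd = build_tesseract_command("declaracao.png", model_dir)
        print(shlex.join(cmd))
```

Running each printed command on the same image and comparing the output
should show whether the substitution of ª and º for a and o depends on the
model file.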

On Friday, November 17, 2017 at 1:49:04 PM UTC-6, Marcello Galvão wrote:
>
> Hi, I have the same problem.
> Did you find a solution?
> Thank you!
>
> On Wednesday, August 16, 2017 at 2:10:49 PM UTC-3, Paulo Scardine wrote:
>>
>> I have the following image:
>>
>>
>> 
>> For version 3.04 I get the correct result: "Declaração de Nascido Vivo".
>>
>> For 4.0 I get "Declªrªçãº de Nªscidº Vivº".
>>
>> What I have tried so far:
>>
>>- everything on the Improving the Quality wiki article
>>- messing with `tessedit_char_whitelist` and `tessedit_char_blacklist`
>>- custom user word and pattern files
>>
>> Nothing made a difference; I am starting to think this may be a bug.
>>
>> I would appreciate advice on how to diagnose this further.
>>
>> Thanks in advance,
>> --
>> Paulo
>>
>

-- 
You received this message because you are subscribed to the Google Groups 
"tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to tesseract-ocr+unsubscr...@googlegroups.com.
To post to this group, send email to tesseract-ocr@googlegroups.com.
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/tesseract-ocr/7f011e85-29b0-4b38-8013-2d1592c2155c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


[tesseract-ocr] Re: Funny results with vowels in Portuguese for Tesseract 4.0alpha

2017-11-17 Thread Marcello Galvão
Hi, I have the same problem.
Did you find a solution?
Thank you!

On Wednesday, August 16, 2017 at 2:10:49 PM UTC-3, Paulo Scardine wrote:
>
> I have the following image:
>
>
> 
> For version 3.04 I get the correct result: "Declaração de Nascido Vivo".
>
> For 4.0 I get "Declªrªçãº de Nªscidº Vivº".
>
> What I have tried so far:
>
>- everything on the Improving the Quality wiki article
>- messing with `tessedit_char_whitelist` and `tessedit_char_blacklist`
>- custom user word and pattern files
>
> Nothing made a difference; I am starting to think this may be a bug.
>
> I would appreciate advice on how to diagnose this further.
>
> Thanks in advance,
> --
> Paulo
>
