The "C" is missing from the output because the image does not leave enough 
margin around the text for Tesseract to read it. 
Add a proper margin (border) around the text.
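
To illustrate what "proper margin" means, here is a minimal sketch in plain 
Python. It assumes the image is already loaded as a grid of grayscale values 
(0 = black, 255 = white). The function names and the 20-pixel default are my 
own choices for illustration, not anything Tesseract prescribes. In practice an 
image tool such as ImageMagick would do the same job on the real file.

```python
WHITE = 255  # assumed background value for a grayscale image


def trim(pixels, background=WHITE):
    """Crop a grid of pixel rows to the bounding box of non-background pixels."""
    rows = [y for y, row in enumerate(pixels)
            if any(p != background for p in row)]
    if not rows:
        return []  # the image holds nothing but background
    cols = [x for x in range(len(pixels[0]))
            if any(row[x] != background for row in pixels)]
    top, bottom = rows[0], rows[-1]
    left, right = cols[0], cols[-1]
    return [row[left:right + 1] for row in pixels[top:bottom + 1]]


def add_margin(pixels, margin=20, background=WHITE):
    """Surround the grid with `margin` background pixels on every side."""
    if not pixels:
        return []
    width = len(pixels[0]) + 2 * margin
    side = [background] * margin
    padded = [[background] * width for _ in range(margin)]
    padded += [side + list(row) + side for row in pixels]
    padded += [[background] * width for _ in range(margin)]
    return padded


def trim_and_pad(pixels, margin=20, background=WHITE):
    """Trim away uneven whitespace, then add a uniform margin back."""
    return add_margin(trim(pixels, background), margin, background)
```

Trimming first and padding afterwards leaves the text centered with the same 
margin on all four sides, regardless of how the original capture was cropped.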


On Friday, June 29, 2018 at 12:39:06 PM UTC-4, Dattatraya Tembare wrote:
>
> Hello Hari,
> I faced the same problem. 
>
> When an image contains two different types of fonts, Tesseract doesn't 
> recognize the text properly. It recognizes the first piece of text and 
> ignores the next one if its font size is bigger than that of the first.
> I resolved it by cropping the image into two pieces. I'm using 
> ImageMagick (Java API) to clean and crop the images.
>
> I also see that you made the command unnecessarily complicated (I have the 
> tesseract path set up):
>
> C:\EA>tesseract Capture.PNG Capture -l eng
> Tesseract Open Source OCR Engine v4.0.0-alpha.20180109 with Leptonica
>
> C:\EA>tesseract Capture1.PNG Capture1 -l eng
> Tesseract Open Source OCR Engine v4.0.0-alpha.20180109 with Leptonica
>
> Tesseract will return the proper text if the text is centered. I achieved 
> that by cropping, trimming, and then adding a border.
>
> Datta
>
> On Thu, Jun 28, 2018 at 3:33 PM Hari P <[email protected]> wrote:
>
>> I am using tesseract v4.0 beta 1 and trying to OCR a remittance file. There 
>> is one section which has CHECK NO, but tesseract doesn't seem to recognize 
>> it at all.
>>
>> I have tried setting dictionary words and setting the penalties for 
>> non-dictionary words to 1, yet there is no change.
>>
>> tesseract capture.png captureoutput1 --user-words "C:\Program Files 
>> (x86)\Tesseract-OCR\tessdata\eng.user-words" -c load_system_dawg=0 -c 
>> load_freq_dawg=0 -c language_model_penalty_non_dict_word=1 -c 
>> language_model_penalty_non_freq_dict_word=1
>>
>> These are the words I have in eng.user-words.
>>
>> CHECK NO.
>> CHECK
>> NO
>> check
>> no
>>
>> Any idea how to fix this?
>>
>> Thanks,
>> Hari
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected].
>> To post to this group, send email to [email protected].
>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/tesseract-ocr/01ef5e64-3332-4b0f-a0aa-8ab9488083f1%40googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>
> -- 
> Best Regards,
> Dattatraya Tembare
> +1 914 721 6311
>

