[tesseract-ocr] Re: How to improve ocr reader?

Teo Wed, 25 Mar 2020 02:40:36 -0700

I discovered that the problem is not with the recognition but with the export 
to PDF: I saved both readings as txt files and they are almost the same. So 
how can I make the export more like ABBYY's, with the text positioned 
precisely over the document and properly aligned?
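
For anyone who wants to repeat the txt comparison, here is a minimal sketch 
using only the Python standard library. It assumes the two OCR results have 
already been saved as plain text; the function names are made up for this 
example and are not part of Tesseract or ABBYY. It reports a word-level 
similarity ratio, lists the words that differ, and can ignore accent-only 
differences such as "the" vs "thè".

```python
import difflib
import unicodedata


def strip_accents(text: str) -> str:
    """Remove combining marks so that e.g. 'thè' compares equal to 'the'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def similarity(a: str, b: str) -> float:
    """Word-level similarity ratio in [0, 1] between two OCR outputs."""
    return difflib.SequenceMatcher(None, a.split(), b.split()).ratio()


def differing_words(a: str, b: str) -> list:
    """Return (words_in_a, words_in_b) pairs for every span that differs."""
    a_words, b_words = a.split(), b.split()
    matcher = difflib.SequenceMatcher(None, a_words, b_words)
    return [
        (" ".join(a_words[i1:i2]), " ".join(b_words[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]
```

To use it, read both files (for example with 
pathlib.Path("abbyy.txt").read_text(encoding="utf-8"), where the file name is 
a placeholder) and pass the two strings in. A ratio close to 1.0 would 
suggest that recognition is comparable and the remaining difference is in the 
PDF layout.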


On Wednesday, 25 March 2020 at 10:25:46 UTC+1, Teo wrote:
>
> OK, I think it's the PDF generation module, because the txt is almost 
> the same, with the exception of some "the" which Tesseract reads as "thè".
>
> On Wednesday, 25 March 2020 at 07:25:11 UTC+1, Essam Zaky wrote:
>>
>> You need to find out which to improve: the Tesseract engine or the PDF 
>> generation.
>>
>> So compare the text files from ABBYY and Tesseract.
>> If the results are very different, you need to improve the image quality or 
>> improve the LSTM model.
>>
>> If the Tesseract result is good, then you need to enhance the PDF 
>> generation module.
>>
>> On Wednesday, 25 March 2020 at 7:04:14 AM UTC+2, Teo wrote:
>>>
>>> The quality is already very good, but it is lower than ABBYY FineReader's. 
>>> The attachment has a comparison between ABBYY and gImageReader OCR, where 
>>> you can see the difference. How can we improve it?

