I found that the problem is not with the recognition but with the export to PDF: I saved both results as txt files and they are almost the same. So how can I make the export more like ABBYY's? I mean with the text placed precisely on the document, all aligned.
On Wednesday, 25 March 2020 at 10:25:46 UTC+1, Teo wrote:
>
> OK, I think it's the PDF generation module, because the txt is almost
> the same, with the exception of some "the" which Tesseract sees as "thè".
>
> On Wednesday, 25 March 2020 at 07:25:11 UTC+1, Essam Zaky wrote:
>>
>> You need to know which one to improve: the Tesseract engine or the PDF generation.
>>
>> So compare the text files from ABBYY and Tesseract.
>> If the results are very different, you need to improve the image quality or
>> improve the LSTM.
>>
>> If the Tesseract result is good, then you need to enhance the PDF
>> generation module.
>>
>> On Wednesday, 25 March 2020 at 7:04:14 AM UTC+2, Teo wrote:
>>>
>>> The quality is already very good, but it is lower than ABBYY FineReader. Attached
>>> is a comparison between ABBYY and gImageReader OCR, and you
>>> can see the difference. How can we improve it?
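
For anyone who wants to repeat the text comparison suggested above, here is a minimal sketch using Python's standard `difflib` module. It is not part of Tesseract or ABBYY; the function name `compare_ocr_text` and the choice to compare word by word are assumptions made only for illustration.

```python
import difflib


def compare_ocr_text(reference: str, candidate: str):
    """Compare two OCR text outputs word by word.

    Returns a tuple (ratio, differences):
      ratio       - similarity between 0.0 and 1.0 (1.0 means identical words)
      differences - list of (reference_words, candidate_words) for each
                    stretch where the two outputs disagree
    """
    # Split on whitespace so line-wrapping differences are ignored.
    ref_words = reference.split()
    cand_words = candidate.split()

    # autojunk=False keeps frequent words such as "the" from being skipped
    # by the matcher's heuristic on long inputs.
    matcher = difflib.SequenceMatcher(a=ref_words, b=cand_words, autojunk=False)

    differences = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            differences.append(
                (" ".join(ref_words[i1:i2]), " ".join(cand_words[j1:j2]))
            )
    return matcher.ratio(), differences
```

To use it, read the two exported txt files into strings (for example with `open(path, encoding="utf-8").read()`) and pass them in. A ratio close to 1.0 with only a few small differences, such as "the" against "thè", points to the PDF generation rather than the recognition.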