OK, the coordinates seem correct.

On Thursday, 26 March 2020 at 19:13:52 UTC+1, Essam Zaky wrote:
>
> Read this document:
> https://tesseract-ocr.github.io/tessdoc/Command-Line-Usage
>
> The following command returns the coordinates:
>
> tesseract testing/eurotext.png testing/eurotext-eng -l eng hocr
>
> The hOCR output contains each word as text together with its coordinates.
> You can open the image in any image editor, such as MS Paint, and check
> that the returned coordinates match the word's position in the image.
>
> Best regards
>
> On Thursday, 26 March 2020 at 1:10:22 PM UTC+2, Teo wrote:
>>
>> Thanks for your help. How can I get the coordinates, and how do I check
>> whether they are correct?
>>
>> On Wednesday, 25 March 2020 at 10:41:07 UTC+1, Essam Zaky wrote:
>>>
>>> You now need to check the coordinates returned by Tesseract. Use the
>>> hOCR output and check whether the word coordinates are returned
>>> correctly. If they are, it is a bug in the PDF generation.
>>>
>>> If the coordinates are wrong, it is a bug in Tesseract.
>>>
>>> In the past I used a library called iTextSharp to generate searchable
>>> PDFs. The library is a port of the iText Java library, and it gives
>>> good PDF output.
>>>
>>> On Wednesday, 25 March 2020 at 11:25:46 AM UTC+2, Teo wrote:
>>>>
>>>> OK, I think the problem is in the PDF generation module, because the
>>>> txt is almost the same, except for some instances of "the" that
>>>> Tesseract reads as "thè".
>>>>
>>>> On Wednesday, 25 March 2020 at 07:25:11 UTC+1, Essam Zaky wrote:
>>>>>
>>>>> You need to find out which to improve: the Tesseract engine or the
>>>>> PDF generation.
>>>>>
>>>>> So compare the text files from ABBYY and Tesseract.
>>>>> If the results differ greatly, you need to improve the image quality
>>>>> or improve the LSTM model.
>>>>>
>>>>> If the Tesseract result is good, you need to improve the PDF
>>>>> generation module.
>>>>>
>>>>> On Wednesday, 25 March 2020 at 7:04:14 AM UTC+2, Teo wrote:
>>>>>>
>>>>>> The quality is already very good, but it is lower than ABBYY
>>>>>> FineReader's. The attachment has a comparison between ABBYY and
>>>>>> gImageReader OCR, and you can see the difference. How can we
>>>>>> improve it?
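For anyone who wants to check the hOCR coordinates programmatically rather than by eye in an image editor, here is a minimal sketch using only the Python standard library. It assumes the usual Tesseract hOCR layout, where each word is a `span` with class `ocrx_word` and a `title` attribute that starts with `bbox x0 y0 x1 y1`. The names `HocrWordParser`, `parse_hocr`, and `SAMPLE_HOCR` are made up for this example, and the sample markup and its numbers are illustrative, not real output from the image discussed in the thread.

```python
import re
from html.parser import HTMLParser

# Matches the "bbox x0 y0 x1 y1" part of an hOCR title attribute.
BBOX_RE = re.compile(r"bbox (\d+) (\d+) (\d+) (\d+)")


class HocrWordParser(HTMLParser):
    """Collect (text, (x0, y0, x1, y1)) for every ocrx_word span."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._bbox = None  # bbox of the word span currently open, if any
        self._text = []    # text fragments seen inside that span
        self._depth = 0    # nesting depth of tags inside the word span

    def handle_starttag(self, tag, attrs):
        if self._bbox is not None:
            # A tag nested inside a word, for example <strong> or <em>.
            self._depth += 1
            return
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "span" and "ocrx_word" in classes:
            match = BBOX_RE.search(attrs.get("title") or "")
            if match:
                self._bbox = tuple(int(v) for v in match.groups())
                self._text = []
                self._depth = 0

    def handle_data(self, data):
        if self._bbox is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._bbox is None:
            return
        if self._depth > 0:
            self._depth -= 1  # closing a tag nested inside the word
            return
        # Closing the word span itself: store the word and reset.
        self.words.append(("".join(self._text).strip(), self._bbox))
        self._bbox = None


def parse_hocr(hocr_text):
    """Return a list of (word, bbox) pairs found in an hOCR string."""
    parser = HocrWordParser()
    parser.feed(hocr_text)
    parser.close()
    return parser.words


# Illustrative hOCR fragment (assumed values, not real Tesseract output).
SAMPLE_HOCR = """
<div class='ocr_page' id='page_1' title='bbox 0 0 1024 800'>
 <span class='ocr_line' id='line_1_1' title='bbox 98 66 918 102'>
  <span class='ocrx_word' id='word_1_1' title='bbox 98 66 170 94; x_wconf 96'>The</span>
  <span class='ocrx_word' id='word_1_2' title='bbox 184 66 298 102; x_wconf 95'><strong>quick</strong></span>
 </span>
</div>
"""

for word, (x0, y0, x1, y1) in parse_hocr(SAMPLE_HOCR):
    print(f"{word}: left={x0} top={y0} right={x1} bottom={y1}")
```

To run it on real output, read the file that the command quoted above should write (with that output base it would normally be `testing/eurotext-eng.hocr`) and pass its contents to `parse_hocr`. The coordinates are pixels measured from the top-left corner of the image, so they can be compared directly with the position an image editor shows.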
The main topics of theoretical computer science are taught in most computer science and engineering curricula, but are not presented as a foundation for computer studies. Most courses, and their reference textbooks, are highly biased in their choice of topics. Very often they overemphasize traditional areas such as formal languages and automata, and pay little or no attention to other important topics such as formal semantics or computational complexity.

The organization of this book results from our strongly held belief that theoretical computer science should be viewed as the cornerstone of computer science and engineering curricula. Computer specialists, in their everyday life, must be able to translate actual problems into abstractions based on the use of formal models, to manipulate such formal descriptions, and to reason about their properties in a rigorous way. This very special attitude differentiates the computer specialist from most other technical professionals.

For these reasons, we suggest that an exposure to theoretical computer science topics should be given in the early stage of computer science education, particularly at the undergraduate level. Theoretical topics should not be viewed as options that can be added late in the curricula. Rather, they must be viewed as [...]