Thanks. I'll try upgrading and see whether that resolves the issue.

On Sat, Jan 11, 2020, 3:45 PM Zdenko Podobny <[email protected]> wrote:
> You are using an old Tesseract version....
>
> On Sat, Jan 11, 2020, 22:43, Matthew Getzin <[email protected]> wrote:
>
>> Hello,
>>
>> I created an issue (see below) on GitHub. I'm not sure whether it is a bug or
>> something for the discussion forum...
>>
>> ### Environment
>>
>> * **Tesseract Version**: tesseract 4.0.0-beta.1
>> leptonica-1.75.3
>> libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 1.5.2) : libpng 1.6.34 :
>> libtiff 4.0.9 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.0
>>
>> Found AVX2
>> Found AVX
>> Found SSE
>>
>> * **Platform**: Linux getzinmw-XPS-15-9550 4.15.0-72-generic #81-Ubuntu
>> SMP Tue Nov 26 12:20:02 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
>>
>> ### Current Behavior:
>> I am having issues with the hOCR output from tesseract compared with the
>> default .txt output. In the attached image, for example, the hOCR output
>> does not contain the majority of the numbers on the left side of the page,
>> while they do appear in the .txt output file.
>>
>> Commands tried:
>> tesseract input.png output -l eng --psm 6
>> tesseract input.png output -l eng --psm 6 hocr
>>
>> ### Expected Behavior:
>> I would expect the recognized text to be consistent between the two modes,
>> with the output format being the only difference.
>>
>> ### Suggested Fix:
>> Ensure consistent output across the various formats.
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/tesseract-ocr/62a1f36b-cfd2-4c4d-901d-337d6bbcc12d%40googlegroups.com
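
One way to check whether an upgrade resolves the discrepancy is to compare the words in the plain-text output with the words in the hOCR output for the same image. The following is only a minimal sketch, not part of Tesseract: the class and function names (`HocrWordParser`, `hocr_words`, `missing_from_hocr`) are hypothetical, and it assumes that recognized words appear in the hOCR file as `<span class="ocrx_word">` elements and that the two commands from the report wrote `output.txt` and `output.hocr`.

```python
from collections import Counter
from html.parser import HTMLParser


class HocrWordParser(HTMLParser):
    """Collect the text of every <span class="ocrx_word"> element."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._depth = 0   # > 0 while inside an ocrx_word span
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if self._depth:
            # Nested markup inside a word, e.g. <strong> or <em>.
            self._depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        if tag == "span" and "ocrx_word" in classes:
            self._depth = 1
            self._buf = []

    def handle_endtag(self, tag):
        if self._depth:
            self._depth -= 1
            if self._depth == 0:
                word = "".join(self._buf).strip()
                if word:
                    self.words.append(word)

    def handle_data(self, data):
        if self._depth:
            self._buf.append(data)


def hocr_words(hocr_text):
    """Return the list of word strings found in an hOCR document."""
    parser = HocrWordParser()
    parser.feed(hocr_text)
    parser.close()
    return parser.words


def missing_from_hocr(plain_text, hocr_text):
    """Count words present in the plain-text output but absent from the hOCR."""
    return Counter(plain_text.split()) - Counter(hocr_words(hocr_text))
```

After running both commands, something like `missing_from_hocr(open("output.txt", encoding="utf-8").read(), open("output.hocr", encoding="utf-8").read())` would list the words (for example, the numbers on the left side of the page) that only the text output contains. An empty result would suggest the two formats agree for that image.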

