<https://lh3.googleusercontent.com/--vOEZ3ZFC28/VgEloARdM9I/AAAAAAAACp4/pyyHe2mSekQ/s1600/testimages.png>
Hello All,
I am trying to extract characters from a PNG image using Tesseract. The
attached image is a screenshot of a program written in VS2012. I crop the
code editor section, save it, and then run tesseract from the command
prompt with the makebox parameter to retrieve the bounding box of each
individual character. The output I am getting is shown below.
# startcolumn startrow endcolumn endrow
However, the desired output is given below.
3 startcolumn startrow endcolumn endrow
# startcolumn startrow endcolumn endrow
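For reference, box lines like these can be read back with a short script to check which characters were found. This is only a minimal sketch: it assumes the usual Tesseract box-file layout of symbol, left, bottom, right, top, and an optional page number, with coordinates measured from the bottom-left corner of the image. The names `Box` and `parse_box_line` and the sample coordinates are hypothetical, not taken from my actual output.

```python
from typing import NamedTuple, Optional


class Box(NamedTuple):
    symbol: str
    left: int
    bottom: int
    right: int
    top: int
    page: int


def parse_box_line(line: str) -> Optional[Box]:
    """Parse one box-file line; return None for blank or malformed lines."""
    parts = line.split()
    # Expect a symbol plus four coordinates, optionally followed by a page number.
    if len(parts) not in (5, 6):
        return None
    try:
        nums = [int(p) for p in parts[1:]]
    except ValueError:
        return None
    page = nums[4] if len(nums) == 5 else 0
    return Box(parts[0], nums[0], nums[1], nums[2], nums[3], page)


# Made-up sample lines in the assumed layout (not real Tesseract output).
sample = "3 2 340 9 352 0\n# 12 340 20 352 0\n"
boxes = [b for b in map(parse_box_line, sample.splitlines()) if b is not None]

for b in boxes:
    # Print each symbol with its width and height in pixels.
    print(b.symbol, b.right - b.left, b.top - b.bottom)
```

Listing the symbols this way makes it easy to see which characters, such as the line numbers, are missing from the result.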
I have tried changing the font in VS2012 and also saving the screenshot in
TIFF format, but the problem persists. Tesseract is not able to detect the
line numbers or all of the characters correctly. Is this because cropping
the image file reduces the pixel depth? If so, how can I increase it so
that all characters are extracted correctly?