<https://lh3.googleusercontent.com/--vOEZ3ZFC28/VgEloARdM9I/AAAAAAAACp4/pyyHe2mSekQ/s1600/testimages.png>
Hello all,

I am trying to extract characters from a PNG image using Tesseract. The 
attached image is a screenshot of a program written in VS2012. I crop the 
code editor section and save it. I then run tesseract from the command 
prompt with the makebox parameter to retrieve the bounding box of each 
individual character. The output I am getting is shown below.
 
#    startcolumn      startrow  endcolumn  endrow

However, the desired output is as follows, with the line number detected 
as well.

3  startcolumn      startrow  endcolumn  endrow
#    startcolumn      startrow  endcolumn  endrow

I have tried changing the font in VS2012 and also saving the screenshot in 
TIFF format, but the problem persists: Tesseract does not detect the line 
numbers, and not all characters are recognized correctly. Is this because 
cropping the image file reduces the pixel depth? If so, how can I increase 
it so that all characters are extracted correctly?
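
For reference, the makebox step described above can be sketched roughly as 
below. This is only an illustration, not the exact commands I ran: the 
helper names (CharBox, parse_box_line, run_makebox) and any file names are 
hypothetical, and it assumes the tesseract executable is on the PATH and 
writes the usual six-field box file (character, left, bottom, right, top, 
page), with coordinates measured from the bottom-left corner of the image.

```python
import subprocess
from pathlib import Path
from typing import List, NamedTuple


class CharBox(NamedTuple):
    """One line of a Tesseract box file (hypothetical helper type)."""
    char: str
    left: int
    bottom: int
    right: int
    top: int
    page: int


def parse_box_line(line: str) -> CharBox:
    # Split from the right so the recognized character stays intact
    # even if it is an unusual glyph.
    char, left, bottom, right, top, page = line.rstrip("\n").rsplit(" ", 5)
    return CharBox(char, int(left), int(bottom), int(right), int(top), int(page))


def run_makebox(image_path: str, output_base: str) -> List[CharBox]:
    # Equivalent to typing: tesseract <image> <output_base> makebox
    # Assumes tesseract is installed; raises CalledProcessError on failure.
    subprocess.run(["tesseract", image_path, output_base, "makebox"], check=True)
    text = Path(output_base + ".box").read_text(encoding="utf-8")
    return [parse_box_line(l) for l in text.splitlines() if l.strip()]
```

Calling run_makebox on the cropped screenshot would return one CharBox per 
detected character, which makes it easy to check whether the line-number 
digits are present in the result at all.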

