Hello,

I am attempting to have Tesseract find a number on a parts diagram. I will 
then take the number found and its coordinates for further programmatic 
use. 

I have no problem getting the engine to detect strings of characters, and 
on examination it returns the correct text for everything it finds.

I have a folder structure of .\tessdata\configs, into which I copied one of 
the config files from the Tesseract install's configs folder. I edited the 
file to contain the following two lines:

tessedit_char_whitelist 0123456789-.
tessedit_char_blacklist abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ

I then reference the config file, and it is opened without complaint. 
However, the configuration seems to have no effect.

<https://lh3.googleusercontent.com/-Kmgl6SuZG9c/VXhqGUiyCCI/AAAAAAAAAKA/KG1zlBWXU5s/s1600/Tesseract%2BOutput.PNG>

<https://lh3.googleusercontent.com/-x7Z-lF0CoEg/VXhqJRfg6MI/AAAAAAAAAKI/WRNo2T24BJ0/s1600/img1.TIF>




I say this because, as you can see, I am still getting the characters in 
my blacklist returned.
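
For reference, the setup described above can be sketched roughly as 
follows. This is only a minimal illustration, not a diagnosis of why the 
settings are ignored: the config name "partnumbers" and the helper names 
are hypothetical, and the command assumes the 3.x command-line form of 
image, output base, then config file name(s). It writes only the whitelist 
line, on the assumption that a whitelist alone already excludes letters.

```python
from pathlib import Path

# Characters allowed in the output (same value as in the config above).
WHITELIST = "0123456789-."


def write_config(tessdata_dir, name="partnumbers"):
    """Write <tessdata_dir>/configs/<name> with one 'variable value' line.

    The config name 'partnumbers' is a hypothetical example.
    """
    configs = Path(tessdata_dir) / "configs"
    configs.mkdir(parents=True, exist_ok=True)
    path = configs / name
    path.write_text(f"tessedit_char_whitelist {WHITELIST}\n")
    return path


def build_command(image, output_base, config_name="partnumbers"):
    """Build the argument list; config names follow the output base."""
    return ["tesseract", str(image), str(output_base), config_name]
```

The list returned by build_command can be handed to subprocess.run once 
the config file is in the tessdata\configs folder that Tesseract actually 
reads.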


Once I get the configuration to take effect, I need to know what 
configuration is needed to find the single numbers on a parts list, as in 
this example.

<https://lh3.googleusercontent.com/-pPt4DndW9MA/VXhpIXPryLI/AAAAAAAAAJw/GFFE4oxRHVg/s1600/95bca-1.tif>
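
As a possible fallback while the whitelist is not being applied, the 
numbers and their coordinates could be filtered out of the recognized 
words after the fact. The sketch below assumes hOCR output (for example 
from the stock "hocr" config) in which each word is a span of class 
ocrx_word whose title carries "bbox x0 y0 x1 y1" in pixels, with the class 
attribute appearing before the title. The function name is hypothetical 
and the regular expression is deliberately simple, not a full HTML parser.

```python
import re

# One hOCR word span: captures the four bbox values and the inner text.
WORD_RE = re.compile(
    r"<span class=['\"]ocrx_word['\"][^>]*"
    r"title=['\"]bbox (\d+) (\d+) (\d+) (\d+)[^'\"]*['\"][^>]*>"
    r"(.*?)</span>",
    re.S,
)
# Text that starts with a digit and uses only the whitelisted characters.
NUMERIC_RE = re.compile(r"[0-9][0-9.\-]*")
# Inline markup such as <strong> that may wrap the word text.
TAG_RE = re.compile(r"<[^>]+>")


def numeric_words(hocr):
    """Return (text, (x0, y0, x1, y1)) for each purely numeric word."""
    results = []
    for x0, y0, x1, y1, inner in WORD_RE.findall(hocr):
        text = TAG_RE.sub("", inner).strip()
        if NUMERIC_RE.fullmatch(text):
            results.append((text, (int(x0), int(y0), int(x1), int(y1))))
    return results
```

Each returned tuple pairs the recognized number with its bounding box, 
which is the number-plus-coordinates form needed for further processing.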

Thanks in advance for any assistance that can be provided on this.


V/R


James


