Trying to run tesseract to output data in a compatible format for a database

Hermes Thu, 18 Apr 2013 20:05:00 -0700

I am trying to set up tesseract to scan images that contain logos, numbers, 
dates, words, etc. That pretty much works. After it has finished OCR'ing 
an image, I want it to output the data as a single line with delimiters 
so it can easily be imported into a database.


At the moment I have gotten as far as this command line (very simple, 
nothing too advanced):
tesseract oqsnoc.png -l eng -psm 7 
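
To make the goal concrete, here is a rough Python sketch of the kind of 
wrapper I have in mind around the command line above. It is only a sketch 
under assumptions: the function names ocr_image and to_delimited_line, the 
"|" delimiter, and the temporary output base name "ocr" are all made up for 
illustration, and newer tesseract releases spell the flag --psm rather than 
-psm.

```python
import csv
import io
import subprocess
import tempfile
from pathlib import Path


def ocr_image(image_path, lang="eng", psm="7"):
    """Run the tesseract CLI on one image and return the recognized text."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical output base name; tesseract appends ".txt" itself.
        out_base = Path(tmp) / "ocr"
        # Same options as my command line; newer tesseract versions
        # expect "--psm" here instead of "-psm".
        subprocess.run(
            ["tesseract", str(image_path), str(out_base), "-l", lang, "-psm", psm],
            check=True,
        )
        return out_base.with_suffix(".txt").read_text(encoding="utf-8")


def to_delimited_line(image_name, ocr_text, delimiter="|"):
    """Collapse OCR output into one delimited record (hypothetical layout)."""
    # Squeeze runs of whitespace inside each line, then drop empty lines.
    fields = [" ".join(line.split()) for line in ocr_text.splitlines()]
    fields = [field for field in fields if field]
    buf = io.StringIO()
    # csv.writer quotes any field that happens to contain the delimiter.
    writer = csv.writer(buf, delimiter=delimiter, lineterminator="\n")
    writer.writerow([image_name] + fields)
    return buf.getvalue()


if __name__ == "__main__":
    # Requires tesseract on the PATH and the image in the current directory.
    print(to_delimited_line("oqsnoc.png", ocr_image("oqsnoc.png")), end="")
```

Letting the csv module do the quoting means a stray delimiter in the OCR 
text should not break the import, but the actual column layout would depend 
on the database table.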

However, the config files are what is really tripping me up. I can't figure 
out whether there is anything special I should put in mine, since there 
are tons of "samples".

Is there any good reading material, or a good general-purpose "default" 
config file to use?

I may have some further posts as I go along with this. Once I get the 
config file sorted, I am pretty sure I can handle getting the data into the 
database.

-- 
-- 
You received this message because you are subscribed to the Google
Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/tesseract-ocr?hl=en
