1) I am not sure if this is possible. Please provide a simple test case so that we can understand clearly: the input image, the output you got, and the desired output. Maybe something can be done by post-processing the tesseract output.
2) I do not understand what you are trying to do. In your first example (tesseract oqsnoc.png -l eng -psm 7) you forgot to enter the output filename, so it does not work. Then you wrote another command that does not make sense to me: you send the OCR output to stout.txt (the second argument is the output base name), but you redirect stdout to test.txt, and there is no output on stdout... Do you understand what you are doing?

3) If there were better settings (config parameters) than the defaults, we would set them as the default ;-)

Zdenko

On Sat, Apr 20, 2013 at 12:23 AM, Hermes <[email protected]> wrote:

> 1) Yeah, the best example is trying to OCR the image (with Adobe or
> Foxit), it comes out like:
>
> line 1
>
> line2 oddstuffhere line 3
>
> So I am looking into other options, and the first one I found that might
> be able to perform this was tesseract.
>
> What I would like to import is
>
> field1, field2, field3
>
> Though, it doesn't matter what or how it is formatted, as long as it is
> bencoded or can be imported into sqlite, that is what I am looking at.
>
> 2) If I try to use that command:
>
> server:~$ tesseract oqsnoc.png stout -l eng -psm 7 > test.txt
> Tesseract Open Source OCR Engine v3.02.02 with Leptonica
> server:~$ cat test.txt
> server:~$
>
> Unless I still need the config file, even after that.
>
> 3) Thought that I might get better results with a config, versus using
> the default.
>
> On Friday, April 19, 2013 2:46:43 AM UTC-4, zdenop wrote:
>>
>> On Fri, Apr 19, 2013 at 4:52 AM, Hermes <[email protected]> wrote:
>>
>>> I am trying to set up tesseract to scan images that contain logos,
>>> numbers, dates, words, etc. That pretty much works. After it is finished
>>> OCR'ing said image, I want it to output the data into a line with
>>> delimiters and stuff so it can easily be imported into a database.
>>
>> Can you give us an example of your expected result (e.g. what you want
>> to import to db)?
>>
>>> At the moment I have gotten to this command line (very simple, nothing
>>> too advanced):
>>> tesseract oqsnoc.png -l eng -psm 7
>>
>> should be:
>> tesseract oqsnoc.png stdout -l eng -psm 7
>> right?
>>
>>> However, the config files are what is really screwing me up. I can't
>>> figure out if there is anything special I should put in mine, since well,
>>> there are tons of "samples".
>>>
>>> Is there a good reading or a general good "default" config file to use?
>>
>> Why do you want to use a config? A good general 'default' is already set
>> up...
>>
>>> I may have some further posts as I go along with this. Once I
>>> get the config file sorted, I am pretty sure I can handle getting it into
>>> the database.
>>>
>>> --
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "tesseract-ocr" group.
>>> To post to this group, send email to [email protected]
>>> To unsubscribe from this group, send email to
>>> tesseract-oc...@googlegroups.com
>>> For more options, visit this group at
>>> http://groups.google.com/group/tesseract-ocr?hl=en
>>>
>>> ---
>>> You received this message because you are subscribed to the Google
>>> Groups "tesseract-ocr" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to tesseract-oc...@googlegroups.com.
>>> For more options, visit
>>> https://groups.google.com/groups/opt_out.
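On the post-processing idea from point 1: below is a minimal Python sketch, not part of tesseract, of how OCR text could be turned into one delimited line and loaded into sqlite. It assumes that each non-empty line of the OCR output is one field and that there are exactly three of them. The function names and the ocr_rows table are hypothetical, chosen only for illustration.

```python
import csv
import io
import sqlite3


def ocr_text_to_fields(text):
    """Split raw OCR text into fields, one per non-empty line (an assumption)."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def fields_to_csv_row(fields):
    """Format the fields as a single comma-delimited line, quoting if needed."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerow(fields)
    return buf.getvalue().rstrip("\n")


def store_fields(conn, fields):
    """Insert exactly three fields as one row into the hypothetical ocr_rows table."""
    if len(fields) != 3:
        raise ValueError("expected 3 fields, got %d" % len(fields))
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ocr_rows "
        "(field1 TEXT, field2 TEXT, field3 TEXT)"
    )
    conn.execute("INSERT INTO ocr_rows VALUES (?, ?, ?)", fields)
    conn.commit()


if __name__ == "__main__":
    # Stand-in for text read from the file that tesseract wrote.
    sample = "line 1\n\nline2 oddstuffhere\nline 3\n"
    fields = ocr_text_to_fields(sample)
    print(fields_to_csv_row(fields))

    conn = sqlite3.connect(":memory:")
    store_fields(conn, fields)
    print(conn.execute("SELECT * FROM ocr_rows").fetchall())
```

With real data the splitting rule would have to match the layout of the image, which is why a sample input and the desired output are needed first.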

