So details for training are now split across separate wiki pages:
http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract
http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract2
http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3

Unfortunately, the comments (now irrelevant) stay on http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract ;-)

Zd.

On 05.09.2010 at 12:01, zdenko podobny wrote:
> Hello,
>
> Tesseract 2.04 does not use a "combined" file, so there is no combine_tessdata.
> Just copy your files to the tessdata directory.
>
> At the moment http://code.google.com/p/tesseract-ocr/wiki/TestingTesseract
> describes training for Tesseract 3.0 (with mistakes ;-) - I have started to
> check it, so there will soon be a correct version). If you want to see the
> description for Tesseract 2.04, look at the svn repository:
> http://code.google.com/p/tesseract-ocr/source/browse/wiki/TrainingTesseract.wiki?r=318
> It is in wiki syntax, but it is easily readable.
>
> BR,
>
> Zd.
>
> On Sat, Sep 4, 2010 at 5:15 AM, John Smith wrote:
>
>> Hi,
>>
>> Thank you so much for the reply.
>> I just have one more step to make. I am using Tesseract 2.04 now and I've
>> got all the files ready. I am trying to combine them, but there is no
>> combine_tessdata for 2.04, so I want to know how to combine them under
>> 2.04.
>>
>> Thank you so much!!
>>
>> On Sun, Aug 29, 2010 at 8:30 PM, Jimmy O'Regan wrote:
>>
>>> On 28 August 2010 07:45, OCR Newbie wrote:
>>>> Hi All,
>>>>
>>>> Currently I am trying to use Tesseract (2.04) to recognize my own data,
>>>> with Mac OS X Snow Leopard.
>>>> I found this:
>>>> http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract
>>>> and I am trying to follow this tutorial.
>>>> My questions are:
>>>> 1. I already have my train.tif ready, but I am not sure where I should
>>>> place the image file (under the 'tessdata' folder, or can it be anywhere?)
>>>
>>> If you're running 'tesseract train.tif ...', it just needs to be in
>>> the current directory.
>>>
>>>> 2. About running tesseract on my training image: it asks to run
>>>> 'tesseract train.tif train batch.nochop makebox'. I guess I should
>>>> use the terminal, but when I type this command into it, it keeps saying
>>>> 'tesseract command not found'. I tried to run configure in the terminal
>>>> first and then typed 'make', but it is still not working.
>>>
>>> You also need to use 'make install', or provide a path to the
>>> executable - Unix-like systems (unlike DOS, etc.) do not include the
>>> current directory in the executable search path. (You can, of course,
>>> change that, but it's A Bad Idea.)
>>>
>>> If tesseract is in /home/jim and $PWD (use 'echo $PWD') is /home/jim, I
>>> could use:
>>> ./tesseract ...
>>> ('.' means 'this directory')
>>> /home/jim/tesseract
>>> (the full path)
>>> or even
>>> ../jim/tesseract
>>> ('..' means 'one level up' - in this case, '/home')
>>> or even:
>>> $PWD/tesseract
>>>
>>> ($PWD is an environment variable, and will always be there... unless
>>> you remove it from another shell, but you probably don't need to worry
>>> about that).
>>>
>>> I think Mac OS X uses /Users rather than /home; just substitute the
>>> actual values. Using 'make install' will be more convenient, though.
>>>
>>> --
>>> <Leftmost> jimregan, that's because deep inside you, you are evil.
>>> <Leftmost> Also not-so-deep inside you.
>>>
>>> --
>>> You received this message because you are subscribed to the Google Groups
>>> "tesseract-ocr" group.
>>> To post to this group, send email to [email protected].
>>> To unsubscribe from this group, send email to [email protected].
>>> For more options, visit this group at
>>> http://groups.google.com/group/tesseract-ocr?hl=en.
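
The ways of invoking an executable described in the quoted reply can be tried out safely with a throwaway stand-in script. This is only a minimal sketch and not part of the original thread: `tess_stub` is a hypothetical placeholder for a freshly built tesseract binary, and prepending a directory to PATH merely approximates what 'make install' typically achieves by copying the binary into a directory that is already on the search path.

```shell
#!/bin/sh
# Hypothetical stand-in for a freshly built binary; the name tess_stub is an
# assumption chosen so it cannot clash with a real tesseract installation.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '#!/bin/sh\necho "stub ran"\n' > tess_stub
chmod +x tess_stub

# 1. Bare name: the shell searches only the directories listed in PATH,
#    and the current directory is normally not one of them.
if command -v tess_stub >/dev/null 2>&1; then
    bare="found"
else
    bare="not found"
fi

# 2. Relative path: '.' means 'this directory'.
via_dot=$(./tess_stub)

# 3. Full path, built from the $PWD environment variable.
via_pwd=$("$PWD/tess_stub")

# 4. Rough simulation of an install step (assumption): place a copy in a
#    directory and put that directory on PATH, so the bare name resolves.
mkdir bin
cp tess_stub bin/
PATH="$workdir/bin:$PATH"
via_path=$(tess_stub)

echo "bare name before install: $bare"
echo "./tess_stub:              $via_dot"
echo "\$PWD/tess_stub:           $via_pwd"
echo "bare name after install:  $via_path"
```

With the real binary, the same pattern would apply from the build directory, for example running './tesseract train.tif train batch.nochop makebox' before any install step.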

