On Wed, Apr 13, 2011 at 2:31 AM, caudex <[email protected]> wrote:
> After using regedit and pointing tessdata_prefix to the right place
> and running again I got an error that referred to unicharset. The
> entire contents of my tessdata subdirectory is:
>
>  Directory of C:\tesseract\Tesseract-OCR\tessdata
>
> 04/08/2011  12:50p      <DIR>          .
> 04/08/2011  12:50p      <DIR>          ..
> 04/08/2011  12:50p      <DIR>          configs
> 04/08/2011  12:21p           2,395,687 deu.traineddata
> 10/03/2010  08:01a           1,926,792 eng.traineddata
> 04/08/2011  12:24p           2,292,872 fra.traineddata
> 04/08/2011  12:27p           2,434,628 ita.traineddata
> 04/08/2011  12:29p           2,281,434 spa.traineddata
> 04/08/2011  12:50p      <DIR>          tessconfigs
>                5 File(s)     11,331,413 bytes
>                4 Dir(s)  47,724,969,984 bytes free
>
> (no unichar type files)
>
> Now the error is back to:
>
> C:\tesseract\Tesseract-OCR>tesseract ocr_107.tif beglat
> Error openning data file C:\Program Files\Tesseract-OCR\tessdata/eng.traineddata
>
> Well behaved w32 apps like emacs and gnuw32 utilities don't tell
> Windows about themselves, why does tesseract have to?
>

The installer sets the user environment variable TESSDATA_PREFIX to the
installation directory; see [1]. (On Windows XP you can view it via
My Computer -> Properties -> Advanced -> Environment Variables.)

Your posts indicate that you moved tesseract to another place, i.e. you
broke your installation. Now you blame tesseract ;-)

The error "Error openning data file" means that tesseract cannot find the
requested data file, for one of two reasons:

1. TESSDATA_PREFIX points to the "wrong" place. You can check this on the
   command line (after you receive the error) with:
   echo %TESSDATA_PREFIX%
2. TESSDATA_PREFIX points to the correct place, but the file does not
   exist. You can check this with:
   dir "%TESSDATA_PREFIX%tessdata"

There is a report that after changing or removing TESSDATA_PREFIX (via
regedit or via My Computer -> Properties -> ...) you need to restart the
computer.
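The two checks above (is TESSDATA_PREFIX set, and does the data file exist
under it) can also be sketched as a small Python script. This is only an
illustration: it assumes, based solely on the error message quoted in this
thread, that tesseract builds the path by appending "tessdata/" plus the
language file name to the prefix. The function names traineddata_path and
diagnose are made up for this example and are not part of tesseract.

```python
import os


def traineddata_path(prefix, lang="eng"):
    """Build the path the quoted error message suggests tesseract opens.

    Assumption: plain concatenation of prefix + "tessdata/" + lang file,
    so the prefix must end with a path separator.
    """
    return prefix + "tessdata/" + lang + ".traineddata"


def diagnose(lang="eng", environ=None):
    """Return a one-line diagnosis mirroring the two manual checks."""
    env = os.environ if environ is None else environ
    prefix = env.get("TESSDATA_PREFIX")
    if not prefix:
        # Check 1: the variable is missing or empty.
        return "TESSDATA_PREFIX is not set"
    path = traineddata_path(prefix, lang)
    if not os.path.isfile(path):
        # Check 2: the variable is set, but the data file is not there.
        return "missing: " + path
    return "found: " + path


if __name__ == "__main__":
    print(diagnose())
```

Run it from the same command-line session in which tesseract fails, so that
it sees the same environment.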
If you need to change it just for the currently open command-line session,
you can do it with the following command (the quotes go around the whole
assignment so that they do not become part of the value):
set "TESSDATA_PREFIX=your desired path\"

[1] http://code.google.com/p/tesseract-ocr/source/browse/trunk/vs2008/tesseract.nsi#210

--
Zdenko

>
>
> On Apr 12, 6:59 pm, caudex <[email protected]> wrote:
> > After installing tesseract-ocr 3.0 successfully and running it
> > against 3 or 4 pdfs, I now get the following error
> >
> > C:\tesseract\Tesseract-OCR>tesseract ocr_107.tif beglat
> > Error openning data file C:\Program Files\Tesseract-OCR\tessdata/eng.traineddata
> >
> > A dir on ...\tessdata shows:
> >
> > 10/03/2010 08:01a 1,926,792 eng.traineddata
> >
> > Notice the misspelling of openning and the / instead of \ in the
> > qualified path to eng.traineddata.
> >
> > Does any of you have a clue what could be going wrong here after it
> > worked correctly a few times?
> > I see that tesseract is looking for the tessdata subdirectory in the
> > wrong place (Program Files) instead of the current directory (where
> > the .tif's were created) but how did it work the first three times?
> > Under program files there is no tesseract-ocr subdirectory.
> >
> > Thanks,
> >
> > Ed
> >
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en.

