I guess you came across a limitation of my program. Thanks for your valuable feedback. Now please follow these steps.
1. Create a directory C:\tesseract_urdu\ (to avoid problems, let's use a directory name without spaces).

2. Download tesseract-2.01.exe.tar.gz <http://tesseract-ocr.googlecode.com/files/tesseract-2.01.exe.tar.gz>. Then copy the following files to the directory C:\tesseract_urdu\
   i)   tesseract.exe (C:\tesseract_urdu\tesseract.exe)
   ii)  cnTraining.exe (C:\tesseract_urdu\cnTraining.exe)
   iii) mfTraining.exe (C:\tesseract_urdu\mfTraining.exe)
   iv)  unicharset_extractor.exe (C:\tesseract_urdu\unicharset_extractor.exe)
   v)   wordlist2dawg.exe (C:\tesseract_urdu\wordlist2dawg.exe)

3. Download tesseract-2.03.tar.gz <http://tesseract-ocr.googlecode.com/files/tesseract-2.03.tar.gz> (source files). Then copy the tessdata directory to C:\tesseract_urdu\
   The result should be C:\tesseract_urdu\tessdata\*

4. Download tesseract-2.00.eng.tar.gz (language pack). Then copy all eng.* files to C:\tesseract_urdu\tessdata\
   The result should be C:\tesseract_urdu\tessdata\eng.*

5. Now open JTesseract and, under Tools->Options, set the tesseract directory to C:\tesseract_urdu\

From here onwards, follow the quick startup guide.

regards,
--
Ruwan Janapriya
http://www.janapriya.net

2008/11/3 Qurat-ul-Ain Akram <[EMAIL PROTECTED]>

> Hi
> I am new to JTesseract. I tried to train the Urdu character set. I
> followed the same procedure as described by you. The summary is as follows:
> 1. I copied the tesseract exes from the tesseract web site. This has the
> following files:
> In C:\tesseract on urdu\tesseract-2.01.exe.tar the files are
>   tessdll.dll
>   tessdll (object file library)
>   tesseract (log file)
>   tesseract.exe
>   dlltest.exe
> In C:\tesseract on urdu\tesseract-2.01.exe.tar\Training the files are
>   cnTraining.exe
>   mfTraining.exe
>   unicharset_extractor.exe
>   wordlist2dawg.exe
>   inttemp (type is file)
>   Microfeat (type is file)
>   pffmtable (type is file)
>   normproto (type is file)
>
> In the first step I gave the language code as urd. (I have also tried
> the eng language code, but it fails to generate the box file.)
> As mentioned in step 2, under Tools->Options the path is set to
> C:\tesseract on urdu\tesseract-2.01.exe.tar
> In step 3 the file folder is selected.
> In step 4, when I load the images, the following errors are reported.
> I have attached the image file as well as the txt file so that you can
> easily see where I am making a mistake.
>
> read_variables_file:Can't open C:/tesseract on urdu/tesseract-2.01.exe.tar/tessdata/configs/files\
>
> Urduread_variables_file:Can't open C:/tesseract on urdu/tesseract-2.01.exe.tar/tessdata/configs/ChracterSet.TIF
>
> read_variables_file:Can't open C:/tesseract on urdu/tesseract-2.01.exe.tar/tessdata/configs/
>
> C:\tesseractread_variables_file:Can't open C:/tesseract on urdu/tesseract-2.01.exe.tar/tessdata/configs/
>
> onread_variables_file:Can't open C:/tesseract on urdu/tesseract-2.01.exe.tar/tessdata/configs/urdu\training\
>
> Trainingread_variables_file:Can't open C:/tesseract on urdu/tesseract-2.01.exe.tar/tessdata/configs/files\Urduread_variables_file:Can't open C:/tesseract on urdu/tesseract-2.01.exe.tar/tessdata/configs/ChracterSetread_variables_file:Can't open C:/tesseract on urdu/tesseract-2.01.exe.tar/tessdata/configs/batch.nochopread_variables_file:Can't open C:/tesseract on urdu/tesseract-2.01.exe.tar/tessdata/configs/
>
> makeboxUnable to load unicharset file C:/tesseract on urdu/tesseract-2.01.exe.tar/tessdata/eng.unicharset
>
> I would be very thankful if you could give me any help with this
> problem.
>
> Anxiously waiting for your reply
>
> Ainie

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/tesseract-ocr?hl=en
-~----------~----~----~----~------~----~------~--~---
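
The directory layout described in steps 1 to 5 of the reply can be sanity-checked before opening JTesseract. The following is only a minimal sketch in Python and is not part of JTesseract or Tesseract. The helper name `check_layout` is hypothetical, and treating any `eng.*` file in tessdata as sufficient is an assumption; the file and directory names themselves are the ones listed in the reply.

```python
from pathlib import Path

# Executables listed in step 2 of the reply.
REQUIRED_FILES = [
    "tesseract.exe",
    "cnTraining.exe",
    "mfTraining.exe",
    "unicharset_extractor.exe",
    "wordlist2dawg.exe",
]


def check_layout(base_dir):
    """Return a list of problems found under base_dir (hypothetical helper)."""
    base = Path(base_dir)
    problems = []

    # Step 1: the reply recommends a directory name without spaces.
    if " " in str(base):
        problems.append(f"path contains spaces: {base}")

    # Step 2: the executables should sit directly in the base directory.
    for name in REQUIRED_FILES:
        if not (base / name).is_file():
            problems.append(f"missing file: {name}")

    # Step 3: the tessdata directory copied from the source archive.
    tessdata = base / "tessdata"
    if not tessdata.is_dir():
        problems.append("missing directory: tessdata")
    # Step 4: the eng.* files from the language pack (assumption: any one
    # eng.* file is taken as a sign that the pack was copied).
    elif not any(tessdata.glob("eng.*")):
        problems.append("no eng.* files in tessdata")

    return problems


if __name__ == "__main__":
    # Directory name taken from step 1 of the reply.
    for problem in check_layout(r"C:\tesseract_urdu"):
        print(problem)
```

An empty result means the layout matches the steps; it does not guarantee that the copied files are the right versions.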

