I guess you ran into a limitation of my program. Thanks for your valuable
feedback. Please try the following steps.

1. Create a directory C:\tesseract_urdu\ (to avoid problems, let's use a
directory name without spaces).
2. Download
tesseract-2.01.exe.tar.gz<http://tesseract-ocr.googlecode.com/files/tesseract-2.01.exe.tar.gz>.
Then copy the following files to the directory C:\tesseract_urdu\
       i) tesseract.exe (C:\tesseract_urdu\tesseract.exe)
       ii) cnTraining.exe (C:\tesseract_urdu\cnTraining.exe)
       iii) mfTraining.exe (C:\tesseract_urdu\mfTraining.exe)
       iv) unicharset_extractor.exe
(C:\tesseract_urdu\unicharset_extractor.exe)
       v) wordlist2dawg.exe (C:\tesseract_urdu\wordlist2dawg.exe)
3. Download
tesseract-2.03.tar.gz<http://tesseract-ocr.googlecode.com/files/tesseract-2.03.tar.gz>
(source files). Then copy the tessdata directory to C:\tesseract_urdu\
       The result should be C:\tesseract_urdu\tessdata\*
4. Download tesseract-2.00.eng.tar.gz (language pack). Then copy all eng.*
files to C:\tesseract_urdu\tessdata\
       The result should be C:\tesseract_urdu\tessdata\eng.*
5. Now open JTesseract and, under Tools->Options, set the tesseract
directory to C:\tesseract_urdu\
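
If it helps, the layout from steps 1 to 4 can be checked before opening
JTesseract. The following is only a rough sketch in Python, not part of
JTesseract or tesseract: the function name find_problems and the messages
it returns are made up for illustration, and it assumes the file names
listed in the steps above are all that needs to be present.

```python
from pathlib import Path

# Executables named in step 2 of this message.
REQUIRED_EXES = (
    "tesseract.exe",
    "cnTraining.exe",
    "mfTraining.exe",
    "unicharset_extractor.exe",
    "wordlist2dawg.exe",
)


def find_problems(root):
    """Return a list of problems found in the directory layout under root.

    Hypothetical helper: it only checks what steps 1 to 4 describe.
    """
    root = Path(root)
    problems = []

    # Step 1: the directory path should not contain spaces.
    if " " in str(root):
        problems.append(f"path contains spaces: {root}")

    # Step 2: each executable should sit directly in the root directory.
    for name in REQUIRED_EXES:
        if not (root / name).is_file():
            problems.append(f"missing {name}")

    # Steps 3 and 4: tessdata must exist and hold the eng.* files.
    tessdata = root / "tessdata"
    if not tessdata.is_dir():
        problems.append("missing tessdata directory")
    elif not any(tessdata.glob("eng.*")):
        problems.append("no eng.* files in tessdata")

    return problems
```

Calling find_problems(r"C:\tesseract_urdu") on the Windows machine should
return an empty list once all four steps are done; anything it returns
names the step that still needs attention.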

From here on, follow the quick startup guide.

regards,

--
*Ruwan Janapriya *
http://www.janapriya.net



2008/11/3 Qurat-ul-Ain Akram <[EMAIL PROTECTED]>

> Hi
> I am new to JTesseract. I tried to train the Urdu character set, following
> the procedure you described. A summary follows:
> 1.  I copied the tesseract executables from the tesseract web site. The
> archive contains the following files:
> In *C:\tesseract on urdu\tesseract-2.01.exe.tar* files are
>          tessdll.dll
>         tessdll (object file library)
>          tesseract (log file)
>         tesseract.exe
>         dlltest.exe
> In *C:\tesseract on urdu\tesseract-2.01.exe.tar\Training *files are
> cnTraining.exe
> mfTraining.exe
> unicharset_extractor.exe
> wordlist2dawg.exe
> inttemp (type is file)
> Microfeat (type is file)
> pffmtable (type is file)
> normproto (type is file)
>
>
> In the first step I gave the language code as *urd*. (I also tried the
> eng language code, but it fails to generate the box file.)
> As described in Step 2, under *Tools->Options* the path is set to
> *C:\tesseract on urdu\tesseract-2.01.exe.tar*
> In Step 3 the file folder is selected.
> In Step 4, when I load the images, the following errors are reported.
> I have attached the image file and the txt file so that you can easily see
> where I am making a mistake.
>
>
>
>
> read_variables_file:Can't open C:/tesseract on
> urdu/tesseract-2.01.exe.tar/tessdata/configs/files\
>
> Urduread_variables_file:Can't open C:/tesseract on
> urdu/tesseract-2.01.exe.tar/tessdata/configs/ChracterSet.TIF
>
> read_variables_file:Can't open C:/tesseract on
> urdu/tesseract-2.01.exe.tar/tessdata/configs/
>
> C:\tesseractread_variables_file:Can't open C:/tesseract on
> urdu/tesseract-2.01.exe.tar/tessdata/configs/
>
> onread_variables_file:Can't open C:/tesseract on
> urdu/tesseract-2.01.exe.tar/tessdata/configs/urdu\training\
>
> Trainingread_variables_file:Can't open C:/tesseract on
> urdu/tesseract-2.01.exe.tar/tessdata/configs/files\Urduread_variables_file:Can't
> open C:/tesseract on
> urdu/tesseract-2.01.exe.tar/tessdata/configs/ChracterSetread_variables_file:Can't
> open C:/tesseract on
> urdu/tesseract-2.01.exe.tar/tessdata/configs/batch.nochopread_variables_file:Can't
> open C:/tesseract on urdu/tesseract-2.01.exe.tar/tessdata/configs/
>
> makeboxUnable to load unicharset file C:/tesseract on
> urdu/tesseract-2.01.exe.tar/tessdata/eng.unicharset
>
>
>
> I would be very thankful if you could give me any help with this
> problem.
>
> Anxiously waiting for your reply
>
>
>
> Ainie
