Thanks for your immediate reply.
I followed the instructions given on the wiki, but the process fails at the
very first step, generating the box files. This is the main problem and the
reason I cannot proceed further. I need the developers' assistance to tell me
whether there is a problem in my procedure, or where I have to make changes in
the code so that Tesseract can generate the box file with the Urdu character
set.
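
For reference, here is a minimal sketch of how a generated box file could be checked against the characters that were expected. It assumes the classic box format, one line per glyph: the character followed by the left, bottom, right and top pixel coordinates. The function names, the sample coordinates and the three expected Urdu letters are hypothetical and only for illustration.

```python
from collections import Counter


def parse_box_lines(lines):
    """Parse box-file lines into (symbol, left, bottom, right, top) tuples."""
    boxes = []
    for raw in lines:
        parts = raw.split()
        if len(parts) < 5:
            continue  # skip blank or malformed lines
        symbol = parts[0]
        # Only the first four numeric fields are read; extra fields are ignored.
        left, bottom, right, top = (int(p) for p in parts[1:5])
        boxes.append((symbol, left, bottom, right, top))
    return boxes


def missing_symbols(boxes, expected):
    """Return the expected characters that never appear in the box file."""
    found = Counter(symbol for symbol, *_ in boxes)
    return [ch for ch in expected if ch not in found]


# Hypothetical sample mimicking the reported output (~, 7, 7, !).
# The coordinates are made up; only the symbols come from the report.
sample = ["~ 10 20 30 40", "7 35 20 50 40", "7 55 20 70 40", "! 75 20 80 40"]
boxes = parse_box_lines(sample)

# Hypothetical expected set: the first few letters of the Urdu alphabet.
print(missing_symbols(boxes, ["ا", "ب", "پ"]))
```

To run it on a real file, the lines can be read from the output of the makebox step, opened with encoding="utf-8". If every expected letter comes back as missing, the boxes were labelled with a different language's character set rather than Urdu.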



On 11/3/08, 74yrs old <[EMAIL PROTECTED]> wrote:
>
> Eight data files have to be generated. Please visit the Tesseract wiki,
> where generating the data files is explained in detail. At present
> Tesseract supports left-to-right text only. If you succeed in generating
> the data files, you will have to read the output in the opposite direction,
> i.e. left to right.
> Cheers
>
> On Mon, Nov 3, 2008 at 12:53 PM, Qurat-ul-Ain Akram <[EMAIL PROTECTED]> wrote:
>
>> Hi all,
>>
>> I am working on Urdu OCR and came to know about Tesseract. I tried to
>> train Tesseract on Urdu characters. The training instructions state that
>> it does not support right-to-left writing. I tried training on the simple
>> Urdu alphabet myself, as follows:
>>
>> 1. I made a character text file named UrduCharacters.txt with UTF-8
>>    encoding.
>> 2. From it I produced a TIF image, saved as UrduCharacters.tif.
>> 3. I ran the tesseract command to make the box file, in two variants:
>>
>>    1. tesseract UrduCharacters.tif UrduCharacters batch.nochop makebox
>>
>>    2. tesseract UrduCharacters.tif UrduCharacters -l urd batch.nochop makebox
>> I have tried both commands. With the second one, an error occurs with the
>> message "Unable to locate Urdunichaset file". With the first one, the box
>> file is generated with four characters, which are ~, 7, 7, !. If anyone
>> has any idea about this, please let me know.
>>
>>
>> Regards
>> Ainie
>>
>>
>>

