Why not try the bbt tool?
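
Whichever tool you use, it can help to look at what actually ended up in the box file before correcting it by hand. Below is a minimal sketch in Python, assuming the plain five-field box format (the character, then left, bottom, right and top coordinates). The names Box, parse_box_lines and count_non_ascii are made up for illustration and are not part of Tesseract, and the coordinates in the sample are invented.

```python
from typing import List, NamedTuple


class Box(NamedTuple):
    """One box-file entry: a character and its bounding box."""
    char: str
    left: int
    bottom: int
    right: int
    top: int


def parse_box_lines(lines: List[str]) -> List[Box]:
    """Parse lines of the assumed form '<char> <left> <bottom> <right> <top>'."""
    boxes = []
    for raw in lines:
        parts = raw.split()
        if len(parts) < 5:
            continue  # skip blank or incomplete lines
        left, bottom, right, top = (int(p) for p in parts[1:5])
        boxes.append(Box(parts[0], left, bottom, right, top))
    return boxes


def count_non_ascii(boxes: List[Box]) -> int:
    """Count entries whose character lies outside ASCII (e.g. Urdu letters)."""
    return sum(1 for b in boxes if any(ord(c) > 127 for c in b.char))


# Hypothetical sample: the four characters reported in the thread,
# with invented coordinates.
sample = ["~ 12 30 40 58", "7 45 30 60 58", "7 66 30 80 58", "! 85 30 92 58"]
boxes = parse_box_lines(sample)
print([b.char for b in boxes], count_non_ascii(boxes))
```

A count of zero non-ASCII entries, as with this sample, would suggest the image was recognised with the default language data rather than with anything Urdu-specific. To check a real file, read it as UTF-8 and pass its lines to parse_box_lines.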

On Mon, Nov 3, 2008 at 5:10 PM, Qurat-ul-Ain Akram <[EMAIL PROTECTED]> wrote:

> Thanks for your immediate reply.
> I followed the instructions given on the wiki site, but it fails at the step
> of generating the box files (the very first step). This is the main problem
> and the reason I cannot proceed further. I need the developers' assistance
> to tell me whether there is a problem in my procedure, or where I have to
> make changes in the code so that Tesseract can generate the box file with
> the Urdu character set.
>
>
>
> On 11/3/08, 74yrs old <[EMAIL PROTECTED]> wrote:
>>
>> Eight data files have to be generated. Please visit the Tesseract wiki,
>> where generating the data files is explained in detail. At
>> present Tesseract supports left-to-right text only. If you succeed in
>> generating the data files, you will have to read in the opposite direction,
>> i.e. left to right.
>> Cheers
>>
>> On Mon, Nov 3, 2008 at 12:53 PM, Qurat-ul-Ain Akram <
>> [EMAIL PROTECTED]> wrote:
>>
>>> Hi all
>>>
>>> I am working on Urdu OCR and came to know about Tesseract. I tried
>>> to train Tesseract on the Urdu characters. The training procedure's
>>> instructions say that it does not support the right-to-left writing
>>> style. I tried training on the simple alphabet of Urdu myself, as follows:
>>>
>>> 1.     I made a text file of the characters named UrduCharacters.txt, with
>>> UTF-8 encoding.
>>> 2.     From it a TIF image was produced and saved as UrduCharacters.tif.
>>>
>>> 3.     I ran the tesseract command to make the box file:
>>>               1    tesseract UrduCharacters.tif UrduCharacters batch.nochop makebox
>>>
>>>               2    tesseract UrduCharacters.tif UrduCharacters -l urd batch.nochop makebox
>>> I have tried both commands. With the second one an
>>> error occurs, with the message "Unable to locate Urdunichaset
>>> file".
>>> With the first one the box file is generated with four characters, which are
>>> ~, 7, 7, ! . If anyone has any idea about this, please let me know.
>>>
>>>
>>> Regards
>>> Ainie
