Did you ever get it to work for Urdu? I am trying the same thing and would 
appreciate some help, please.

On Monday, November 3, 2008 7:23:04 AM UTC, Ainie wrote:
>
> Hi all 
>  
> I am working on Urdu OCR and came to know about Tesseract. I tried 
> to train Tesseract on the Urdu characters. The instructions for the 
> training procedure say that it cannot support the right-to-left 
> writing style. I tried training on the simple Urdu alphabet myself, as 
> follows:
>  
> 1.     I made a character text file named UrduCharacters.txt with 
> UTF-8 encoding.
> 2.     From it I obtained a TIF image and saved it as UrduCharacters.tif.
> 3.     I ran the tesseract command to make the box file, in two ways:
>               1.   tesseract UrduCharacters.tif UrduCharacters batch.nochop makebox
>  
>               2.   tesseract UrduCharacters.tif UrduCharacters -l urd batch.nochop makebox
>  
> I have tried both commands for training. With the second one, an error 
> occurs with the message "Unable to locate Urdunichaset 
> file".
> With the first one, a box file is generated with four characters, which 
> are ~, 7, 7, !. If anyone has any idea about this, please let me know. 
>  
>  
> Regards
> Ainie
>
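
For anyone retracing the steps quoted above, here is a rough Python sketch,
not a tested training recipe. It only builds the two command lines from the
message and reads back the characters in a generated box file, so that output
like "~, 7, 7, !" can be compared against the expected Urdu characters. The
function names and the tessdata directory argument are hypothetical. The
assumption that "-l urd" needs an existing urd.unicharset file in tessdata,
and that each box line is "<char> <left> <bottom> <right> <top>", is my
reading of the Tesseract 2.x training notes and should be checked against
your version.

```python
from pathlib import Path


def makebox_command(image, lang=None):
    """Build the makebox argument list used in the quoted message.

    Hypothetical helper: it only assembles the arguments, it does not
    run tesseract.
    """
    base = Path(image).stem  # "UrduCharacters.tif" -> "UrduCharacters"
    cmd = ["tesseract", image, base]
    if lang is not None:
        cmd += ["-l", lang]
    cmd += ["batch.nochop", "makebox"]
    return cmd


def has_unicharset(tessdata_dir, lang):
    """Check for <lang>.unicharset (assumed Tesseract 2.x tessdata layout)."""
    return (Path(tessdata_dir) / f"{lang}.unicharset").is_file()


def parse_box_line(line):
    """Split one box-file line into (character, (left, bottom, right, top))."""
    parts = line.split()
    if len(parts) < 5:
        raise ValueError(f"not a box line: {line!r}")
    left, bottom, right, top = (int(p) for p in parts[1:5])
    return parts[0], (left, bottom, right, top)


def box_characters(box_text):
    """Return the characters of a box file in file order, skipping blank lines."""
    return [parse_box_line(l)[0] for l in box_text.splitlines() if l.strip()]
```

If has_unicharset() is false for "urd", the first command (without "-l urd")
is probably the one to use for the initial box file, and the characters it
reports would then need to be corrected by hand in the box file before the
later training steps.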

