Ainie, if you are done with your Urdu OCR, could you please send me your work at [email protected]? I need it as part of my Final Year Project. I would be very grateful to you.
On Monday, November 3, 2008 at 12:23:04 PM UTC+5, Ainie wrote:
> Hi all,
>
> I am working on Urdu OCR and came to know about Tesseract. I tried to
> train Tesseract for the Urdu characters. The training instructions say
> that it cannot support the right-to-left writing style. I tried training
> on the simple Urdu alphabet as follows:
>
> 1. I made a text file of the characters, named UrduCharacters.txt, with
>    UTF-8 encoding.
> 2. From it I obtained a TIF image and saved it as UrduCharacters.tif.
> 3. I ran the tesseract command to make the box file:
>
>    1. tesseract UrduCharacters.tif UrduCharacters batch.nochop makebox
>    2. tesseract UrduCharacters.tif UrduCharacters -l urd batch.nochop makebox
>
> I have tried both commands for training. With the second one, an error
> occurs with the message "Unable to locate Urdunichaset file". With the
> first one, the box file is generated with four characters, which are
> ~, 7, 7, !. If anyone has any idea about this, please let me know.
>
> Regards,
> Ainie
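
For anyone retracing the quoted steps, here is a minimal Python sketch of step 1 and of the two command lines from step 3. It makes several assumptions: the letter list is only an illustrative subset (the original file contents are not shown in the post), the helper names `write_characters_file` and `makebox_command` are hypothetical, and the argument order simply copies the commands as quoted. It does not render the TIF image (step 2) and does not run tesseract, so it says nothing about whether the `urd` language data exists on a given install.

```python
from pathlib import Path

# Assumption: an illustrative subset of Urdu letters, not the poster's actual list.
URDU_LETTERS = ["ا", "ب", "پ", "ت", "ٹ", "ث", "ج", "چ"]


def write_characters_file(path, letters):
    """Write the letters, space-separated, to a UTF-8 encoded text file."""
    text = " ".join(letters) + "\n"
    Path(path).write_text(text, encoding="utf-8")
    return text


def makebox_command(image, output_base, lang=None):
    """Build the argument list for the makebox command as quoted in the post.

    The list is only constructed here, not executed.
    """
    cmd = ["tesseract", str(image), str(output_base)]
    if lang is not None:
        cmd += ["-l", lang]
    cmd += ["batch.nochop", "makebox"]
    return cmd


if __name__ == "__main__":
    write_characters_file("UrduCharacters.txt", URDU_LETTERS)
    # Command 1: no language flag.
    print(" ".join(makebox_command("UrduCharacters.tif", "UrduCharacters")))
    # Command 2: with "-l urd", as in the post.
    print(" ".join(makebox_command("UrduCharacters.tif", "UrduCharacters", lang="urd")))
```

Building the command as a list rather than a single string keeps it ready to pass to `subprocess.run` without shell quoting issues, should you want to execute it.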

