It looks like Tesseract didn't like your image. Can you upload or attach it? Command line 1 is correct. Ray.
On Mon, Nov 3, 2008 at 4:32 AM, 74yrs old <[EMAIL PROTECTED]> wrote:
> why not try with bbt tool?
>
> On Mon, Nov 3, 2008 at 5:10 PM, Qurat-ul-Ain Akram <[EMAIL PROTECTED]> wrote:
>
>> Thanks for ur Immediate reply
>> I Followed the instructions given in the wiki site. But fail at the step
>> of generating the Box files ( in the very first step). This is the main
>> problem that why I cannot proceed further. I need the developer assistance
>> to suggest me, whether my there is problem in my procedure OR where I have
>> to make changes in the code so that Tesseract can generate the box file with
>> the Urdu character set.
>>
>> On 11/3/08, 74yrs old <[EMAIL PROTECTED]> wrote:
>>>
>>> eight datafiles have to be generated. Please visit wiki website of
>>> tesseract where how to generate datafiles are explained in detail. AT
>>> present tesseract supports for left to right. In case if you suceeded to
>>> generate datafiles, you hsve to read opposite direction i.e. left to right.
>>> cheers
>>>
>>> On Mon, Nov 3, 2008 at 12:53 PM, Qurat-ul-Ain Akram <[EMAIL PROTECTED]> wrote:
>>>
>>>> Hi all
>>>>
>>>> I am working with the Urdu OCR. I came to know about Tesseract. I tried
>>>> to train tesseract for the Urdu characters. In the training procedure's
>>>> instruction , it is written that it cannot support the right to left
>>>> writing style. I myself tried to training the simple alphabets of Urdu
>>>> as follows:
>>>>
>>>> 1 I made the characters txt file with name UrduCharacters.txt with
>>>>   utf8 encoding
>>>> 2. Then from it TIF image is obtained and saved as UrduCharacters.tif
>>>> 3 Run the tesseract command to makebox file
>>>>   *1 tesseract UrduCharacters.tif UrduCharacters batch.nochop makebox*
>>>>
>>>>   2 *tesseract UrduCharacters.tif UrduCharacters -l urd batch.nochop makebox*
>>>>
>>>> I have tried the both the commands for training . In the second one the
>>>> error occurs indicating the message that "Unable to locate Urdunichaset
>>>> file"
>>>> In the second one the boxfile is generated with four character which are
>>>> ~, 7,7,! . If anyone has any idea about it please let me know.
>>>>
>>>> Regards
>>>> Ainie

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/tesseract-ocr?hl=en
-~----------~----~----~----~------~----~------~--~---
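
For anyone following the thread who wants to check what makebox actually wrote, here is a minimal Python sketch for listing the glyphs in a box file. It assumes the usual box-file layout of one glyph followed by four integer coordinates (left, bottom, right, top) per line. The function names and the sample coordinates are made up for illustration; only the four characters (~, 7, 7, !) come from the report above.

```python
from typing import Iterable, List, NamedTuple


class Box(NamedTuple):
    char: str    # the glyph Tesseract assigned to this box
    left: int
    bottom: int
    right: int
    top: int


def parse_box_lines(lines: Iterable[str]) -> List[Box]:
    """Parse box-file lines of the form '<glyph> <left> <bottom> <right> <top>'.

    Lines with fewer than five fields are skipped; any extra trailing
    field (for example a page number) is ignored.
    """
    boxes = []
    for raw in lines:
        parts = raw.split()
        if len(parts) < 5:
            continue
        left, bottom, right, top = (int(p) for p in parts[1:5])
        boxes.append(Box(parts[0], left, bottom, right, top))
    return boxes


def code_points(boxes: Iterable[Box]) -> List[str]:
    """Return the Unicode code point of every character in every glyph."""
    return [f"U+{ord(c):04X}" for box in boxes for c in box.char]


# Hypothetical sample: the coordinates are invented, the glyphs are the
# four characters reported in the thread.
sample = [
    "~ 10 20 30 40",
    "7 35 20 50 40",
    "7 55 20 70 40",
    "! 75 20 80 40",
]

boxes = parse_box_lines(sample)
print([b.char for b in boxes])
print(code_points(boxes))
```

If every code point falls in the ASCII range, as it does for the sample, the box file holds guesses from the default language data and not Urdu characters, so the glyphs would still need to be corrected by hand before training.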

