Have you had any luck training Tesseract for Arabic letters or numbers?

On Sunday, November 25, 2018 at 9:09:33 AM UTC+1, [email protected] wrote:
>
> Hi Marwa M. Khan 
>
> Have you generated any tessdata for Arabic-Indic numbers?
>
> I'm trying to generate one, but JTessBoxEditor does not accept 
> Arabic-Indic numbers. How can I fix this?
>
> On Thursday, July 19, 2018 at 12:52:24 PM UTC+3, Marwa M. Khan wrote:
>>
>> Hello, 
>>
>>    I am trying to train Tesseract 4.0 with LSTM on Arabic/Hindi 
>> digits on Windows. I found that I need to create a box file, so I'm 
>> using JTessBoxEditor 2.0 to create the TIFF and box files. However, it fails 
>> when I use JTessBoxEditor 2.0 to generate the .traineddata file. Note that 
>> I chose combine_tessdata.exe as the Tesseract executable, ara.arial.exp0.box 
>> as the training data, and training with an existing box file as the training mode. 
>>
>>
>> The output is as follows:
>>
>> Tesseract Open Source OCR Engine v4.0.0-beta.1-108-gf291 with Leptonica
>> Page 1
>> Bad box coordinates in boxfile string! ١ ٤٥٤ ٣١٦٣ ٤٦٣ ٣١٩٠ ٠
>>
>> Bad box coordinates in boxfile string! ٢ ٤١٣ ٣١٦٣ ٤٢٨ ٣١٩٠ ٠
>>
>> Bad box coordinates in boxfile string! ٣ ٣٧٣ ٣١٦٣ ٣٩٣ ٣١٩٠ ٠
>>
>> Bad box coordinates in boxfile string! ٤ ٣٣٨ ٣١٦٣ ٣٥٠ ٣١٩٠ ٠
>>
>> Bad box coordinates in boxfile string! ٥ ٢٩٨ ٣١٦٨ ٣١٤ ٣١٨٥ ٠
>>
>> Bad box coordinates in boxfile string! ٦ ٢٥٨ ٣١٦٣ ٢٧٣ ٣١٩٠ ٠
>>
>> Bad box coordinates in boxfile string! ٧ ٢١٩ ٣١٦٣ ٢٣٨ ٣١٩٠ ٠
>>
>> Bad box coordinates in boxfile string! ٨ ١٨٠ ٣١٦٣ ٢٠٠ ٣١٩٠ ٠
>>
>> Bad box coordinates in boxfile string! ٩ ١٤٥ ٣١٦٣ ١٥٩ ٣١٩٠ ٠
>>
>> Bad box coordinates in boxfile string! ٠ ١٠٩ ٣١٦٧ ١١٧ ٣١٧٨ ٠
>>
>> Bad box coordinates in boxfile string! ١ ٤٥٤ ٣٠١٥ ٤٦٣ ٣٠٤٢ ٠
>>
>> Bad box coordinates in boxfile string! ٢ ٤١٣ ٣٠١٥ ٤٢٨ ٣٠٤٢ ٠
>>
>> Bad box coordinates in boxfile string! ٣ ٣٧٣ ٣٠١٥ ٣٩٣ ٣٠٤٢ ٠
>>
>> Bad box coordinates in boxfile string! ٤ ٣٣٨ ٣٠١٥ ٣٥٠ ٣٠٤٢ ٠
>>
>> Bad box coordinates in boxfile string! ٥ ٢٩٨ ٣٠٢٠ ٣١٤ ٣٠٣٧ ٠
>>
>> Bad box coordinates in boxfile string! ٦ ٢٥٨ ٣٠١٥ ٢٧٣ ٣٠٤٢ ٠
>>  
>>
>> Could you please tell me what I did wrong or how to fix this error? 
>>
>>
>> Best Regards, 
>> Marwa M. Khan  
>>
>
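
In the quoted log, every rejected box line has its coordinates written in Arabic-Indic digits (for example ٤٥٤ ٣١٦٣ ٤٦٣ ٣١٩٠ ٠), not only the glyph itself. One possible reading, which I have not verified against this setup, is that Tesseract expects the five numeric fields of a box line (left, bottom, right, top, page) in ASCII digits and rejects the line otherwise. Below is a minimal Python sketch that rewrites only those five fields and leaves the glyph untouched. The function names and file paths are hypothetical and not part of Tesseract or JTessBoxEditor, and it assumes the plain "glyph left bottom right top page" layout shown in the log.

```python
# Hypothetical helper: convert the numeric fields of a Tesseract box file
# from Arabic-Indic digits to ASCII digits, keeping the glyph as-is.
# Assumes each line looks like: <glyph> <left> <bottom> <right> <top> <page>

ARABIC_INDIC = "٠١٢٣٤٥٦٧٨٩"           # U+0660..U+0669
EASTERN_ARABIC_INDIC = "۰۱۲۳۴۵۶۷۸۹"   # U+06F0..U+06F9
TO_ASCII = str.maketrans(
    ARABIC_INDIC + EASTERN_ARABIC_INDIC, "0123456789" * 2
)


def normalize_box_line(line):
    """Return the line with its last five fields converted to ASCII digits."""
    text = line.rstrip("\r\n")
    # Split from the right so the glyph (first field) is never altered.
    fields = text.rsplit(" ", 5)
    if len(fields) != 6:
        return text  # not a standard box line; leave it unchanged
    glyph, coords = fields[0], fields[1:]
    return " ".join([glyph] + [c.translate(TO_ASCII) for c in coords])


def normalize_box_file(src_path, dst_path):
    """Write a copy of the box file with ASCII coordinates."""
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8", newline="\n") as dst:
        for line in src:
            dst.write(normalize_box_line(line) + "\n")
```

For example, normalize_box_line("١ ٤٥٤ ٣١٦٣ ٤٦٣ ٣١٩٠ ٠") returns "١ 454 3163 463 3190 0", so the glyph stays an Arabic-Indic digit while the coordinates become plain numbers. Whether this alone makes the training run succeed is an assumption worth testing on a copy of the box file first.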
