Please see https://github.com/Shreeshrii/tessdata_arabic

You can try the new traineddata from there, along with the PR
https://github.com/tesseract-ocr/tesseract/pull/2266
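
Regarding the "Bad box coordinates in boxfile string!" messages quoted
below: in every rejected line the coordinates themselves are written in
Arabic-Indic digits, not only the character label. A likely cause (an
assumption based on the log alone, not confirmed against the Tesseract
source) is that the box file parser expects ASCII digits for the numeric
fields. A minimal sketch of a workaround follows. The helper name
normalize_box_line is hypothetical and not part of Tesseract or
JTessBoxEditor, and it assumes the usual
"<glyph> <left> <bottom> <right> <top> <page>" layout.

```python
# Hypothetical helper; not part of Tesseract or JTessBoxEditor.
# Maps the Arabic-Indic digits U+0660..U+0669 to ASCII '0'..'9'.
ARABIC_INDIC_TO_ASCII = {0x0660 + i: str(i) for i in range(10)}


def normalize_box_line(line: str) -> str:
    """Rewrite the five numeric fields of one box-file line in ASCII digits.

    Assumes the layout: <glyph> <left> <bottom> <right> <top> <page>.
    The glyph is left untouched, so the label stays an Arabic-Indic digit.
    """
    parts = line.rstrip("\n").split(" ")
    if len(parts) < 6:
        # Not a recognisable box line; return it unchanged.
        return line.rstrip("\n")
    glyph, numbers = parts[:-5], parts[-5:]
    fixed = [field.translate(ARABIC_INDIC_TO_ASCII) for field in numbers]
    return " ".join(glyph + fixed)


if __name__ == "__main__":
    # First rejected line from the log quoted below.
    print(normalize_box_line("١ ٤٥٤ ٣١٦٣ ٤٦٣ ٣١٩٠ ٠"))
```

Applying this to each line of ara.arial.exp0.box and saving the result as
UTF-8 should leave the labels as they are while making the coordinates
plain ASCII. Whether that alone is enough for training to succeed is
untested here.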

On Mon, Feb 25, 2019 at 9:27 PM Soufiane Sabiri <[email protected]>
wrote:

> Have you had any luck training Tesseract for Arabic letters or numbers?
>
> On Sunday, November 25, 2018 at 9:09:33 AM UTC+1, [email protected] wrote:
>>
>> Hi Marwa M. Khan
>>
>> Have you generated any tessdata for Arabic-Indic numbers?
>>
>> I'm trying to generate one, but JTessBoxEditor does not accept
>> Arabic-Indic numbers. How can I fix this?
>>
>> On Thursday, July 19, 2018 at 12:52:24 PM UTC+3, Marwa M. Khan wrote:
>>>
>>> Hello,
>>>
>>>    I am trying to train Tesseract 4.0 with LSTM on Arabic/Hindi
>>> digits on Windows. I found that I need to create a box file, so I am
>>> using JTessBoxEditor 2.0 to create the TIFF and box files. However, it fails
>>> when I use JTessBoxEditor 2.0 to generate the .traineddata file. Note that
>>> I chose combine_tessdata.exe as the Tesseract executable, ara.arial.exp0.box
>>> as the training data, and training with existing box as the training mode.
>>>
>>>
>>> The output is as follows:
>>>
>>> Tesseract Open Source OCR Engine v4.0.0-beta.1-108-gf291 with Leptonica
>>> Page 1
>>> Bad box coordinates in boxfile string! ١ ٤٥٤ ٣١٦٣ ٤٦٣ ٣١٩٠ ٠
>>>
>>> Bad box coordinates in boxfile string! ٢ ٤١٣ ٣١٦٣ ٤٢٨ ٣١٩٠ ٠
>>>
>>> Bad box coordinates in boxfile string! ٣ ٣٧٣ ٣١٦٣ ٣٩٣ ٣١٩٠ ٠
>>>
>>> Bad box coordinates in boxfile string! ٤ ٣٣٨ ٣١٦٣ ٣٥٠ ٣١٩٠ ٠
>>>
>>> Bad box coordinates in boxfile string! ٥ ٢٩٨ ٣١٦٨ ٣١٤ ٣١٨٥ ٠
>>>
>>> Bad box coordinates in boxfile string! ٦ ٢٥٨ ٣١٦٣ ٢٧٣ ٣١٩٠ ٠
>>>
>>> Bad box coordinates in boxfile string! ٧ ٢١٩ ٣١٦٣ ٢٣٨ ٣١٩٠ ٠
>>>
>>> Bad box coordinates in boxfile string! ٨ ١٨٠ ٣١٦٣ ٢٠٠ ٣١٩٠ ٠
>>>
>>> Bad box coordinates in boxfile string! ٩ ١٤٥ ٣١٦٣ ١٥٩ ٣١٩٠ ٠
>>>
>>> Bad box coordinates in boxfile string! ٠ ١٠٩ ٣١٦٧ ١١٧ ٣١٧٨ ٠
>>>
>>> Bad box coordinates in boxfile string! ١ ٤٥٤ ٣٠١٥ ٤٦٣ ٣٠٤٢ ٠
>>>
>>> Bad box coordinates in boxfile string! ٢ ٤١٣ ٣٠١٥ ٤٢٨ ٣٠٤٢ ٠
>>>
>>> Bad box coordinates in boxfile string! ٣ ٣٧٣ ٣٠١٥ ٣٩٣ ٣٠٤٢ ٠
>>>
>>> Bad box coordinates in boxfile string! ٤ ٣٣٨ ٣٠١٥ ٣٥٠ ٣٠٤٢ ٠
>>>
>>> Bad box coordinates in boxfile string! ٥ ٢٩٨ ٣٠٢٠ ٣١٤ ٣٠٣٧ ٠
>>>
>>> Bad box coordinates in boxfile string! ٦ ٢٥٨ ٣٠١٥ ٢٧٣ ٣٠٤٢ ٠
>>>
>>>
>>> Could you please tell me what I did wrong or how to fix this error?
>>>
>>>
>>> Best Regards,
>>> Marwa M. Khan
>>>
>> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/tesseract-ocr.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/3d1aa31a-6af2-4b90-a1e6-b93f9b792de9%40googlegroups.com
> <https://groups.google.com/d/msgid/tesseract-ocr/3d1aa31a-6af2-4b90-a1e6-b93f9b792de9%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
> For more options, visit https://groups.google.com/d/optout.
>


-- 

____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

