As I understand it, the only difference between an RTL language and an LTR one is in the unicharset.
I created the unicharset with its tool, but how can I create the xheight file for Persian?
Here is my unicharset after converting it to RTL:
36
NULL 0 NULL 0
Joined 7 0,69,188,255,486,1218,0,30,486,1188 Latin 1 0 1 Joined # Joined [4a 6f 69 6e 65 64 ]a
|Broken|0|1 f 0,69,186,255,892,2138,0,80,892,2058 Common 2 10 2 |Broken|0|1 # Broken
س‍ 1 0,255,0,255,0,0,0,0,0,0 Arabic 3 13 3 س‍ # س‍ [633 200d ]x
‍ل‍ 1 0,255,0,255,0,0,0,0,0,0 Inherited 4 18 4 ‍ل‍ # ‍ل‍ [200d 644 200d ]x
‍ا 1 0,255,0,255,0,0,0,0,0,0 Inherited 5 18 5 ‍ا # ‍ا [200d 627 ]x
م 1 0,64,134,241,51,272,0,46,56,313 Arabic 6 13 6 م # م [645 ]x
ع‍ 1 0,255,0,255,0,0,0,0,0,0 Arabic 7 13 7 ع‍ # ع‍ [639 200d ]x
‍ی‍ 1 0,255,0,255,0,0,0,0,0,0 Inherited 8 18 8 ‍ی‍ # ‍ی‍ [200d 6cc 200d ]x
‍ک‍ 1 0,255,0,255,0,0,0,0,0,0 Inherited 9 18 9 ‍ک‍ # ‍ک‍ [200d 6a9 200d ]x
‍م 1 0,255,0,255,0,0,0,0,0,0 Inherited 10 18 10 ‍م # ‍م [200d 645 ]x
م‍ 1 0,255,0,255,0,0,0,0,0,0 Arabic 11 13 11 م‍ # م‍ [645 200d ]x
‍ه‍ 1 0,255,0,255,0,0,0,0,0,0 Inherited 12 18 12 ‍ه‍ # ‍ه‍ [200d 647 200d ]x
‍ذ 1 0,255,0,255,0,0,0,0,0,0 Inherited 13 18 13 ‍ذ # ‍ذ [200d 630 ]x
ا 1 26,117,200,255,11,181,7,82,33,222 Arabic 14 13 14 ا # ا [627 ]x
ک‍ 1 0,255,0,255,0,0,0,0,0,0 Arabic 15 13 15 ک‍ # ک‍ [6a9 200d ]x
‍ج‍ 1 0,255,0,255,0,0,0,0,0,0 Inherited 16 18 16 ‍ج‍ # ‍ج‍ [200d 62c 200d ]x
ی‍ 1 0,255,0,255,0,0,0,0,0,0 Arabic 17 13 17 ی‍ # ی‍ [6cc 200d ]x
‍ی 1 0,255,0,255,0,0,0,0,0,0 Inherited 18 18 18 ‍ی # ‍ی [200d 6cc ]x
ش‍ 1 0,255,0,255,0,0,0,0,0,0 Arabic 19 13 19 ش‍ # ش‍ [634 200d ]x
‍م‍ 1 0,255,0,255,0,0,0,0,0,0 Inherited 20 18 20 ‍م‍ # ‍م‍ [200d 645 200d ]x
ل‍ 1 0,255,0,255,0,0,0,0,0,0 Arabic 21 13 21 ل‍ # ل‍ [644 200d ]x
‍ن 1 0,255,0,255,0,0,0,0,0,0 Inherited 22 18 22 ‍ن # ‍ن [200d 646 ]x
‍ب‍ 1 0,255,0,255,0,0,0,0,0,0 Inherited 23 18 23 ‍ب‍ # ‍ب‍ [200d 628 200d ]x
‍ز 1 0,255,0,255,0,0,0,0,0,0 Inherited 24 18 24 ‍ز # ‍ز [200d 632 ]x
‍ت 1 0,255,0,255,0,0,0,0,0,0 Inherited 25 18 25 ‍ت # ‍ت [200d 62a ]x
. 10 12,108,64,140,18,52,9,77,52,193 Common 26 6 26 . # . [2e ]p
و 1 0,68,137,238,65,290,0,27,62,256 Arabic 27 13 27 و # و [648 ]x
ن‍ 1 0,255,0,255,0,0,0,0,0,0 Arabic 28 13 28 ن‍ # ن‍ [646 200d ]x
‍س‍ 1 0,255,0,255,0,0,0,0,0,0 Inherited 29 18 29 ‍س‍ # ‍س‍ [200d 633 200d ]x
ن 1 0,88,163,255,68,321,0,52,76,354 Arabic 30 13 30 ن # ن [646 ]x
ب‍ 1 0,255,0,255,0,0,0,0,0,0 Arabic 31 13 31 ب‍ # ب‍ [628 200d ]x
‍و 1 0,255,0,255,0,0,0,0,0,0 Inherited 32 18 32 ‍و # ‍و [200d 648 ]x
پ‍ 1 0,255,0,255,0,0,0,0,0,0 Arabic 33 13 33 پ‍ # پ‍ [67e 200d ]x
‍ر 1 0,255,0,255,0,0,0,0,0,0 Inherited 34 18 34 ‍ر # ‍ر [200d 631 ]x
ی 1 0,71,148,225,95,253,0,45,103,279 Arabic 35 13 35 ی # ی [6cc ]x
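Most entries in the listing above carry the metrics string 0,255,0,255,0,0,0,0,0,0, which looks like a default rather than a measured value. A minimal Python sketch for listing those entries follows. It assumes the field order is unichar, properties, metrics, script, other_case, direction, mirror, normed form, and that everything after '#' is a comment; the names parse_entry, missing_metrics and SAMPLE are made up for this illustration.

```python
# Metrics string that appears to be a default (assumption: no glyph was measured).
PLACEHOLDER_METRICS = "0,255,0,255,0,0,0,0,0,0"


def parse_entry(line):
    """Split one unicharset line into named fields.

    Assumed layout: unichar, properties, metrics, script, other_case,
    direction, mirror, normed form. Text after '#' is treated as a comment.
    """
    fields = line.split("#", 1)[0].split()
    if len(fields) < 7:
        return None  # count line or the short "NULL 0 NULL 0" entry
    return {
        "unichar": fields[0],
        "properties": fields[1],
        "metrics": fields[2],
        "script": fields[3],
        "direction": int(fields[5]),
    }


def missing_metrics(lines):
    """Return the unichars whose metrics are still the placeholder."""
    result = []
    for line in lines:
        entry = parse_entry(line)
        if entry is not None and entry["metrics"] == PLACEHOLDER_METRICS:
            result.append(entry["unichar"])
    return result


# Three lines taken from the listing, with ZWJ written as an explicit escape.
SAMPLE = [
    "36",
    "NULL 0 NULL 0",
    "\u200d\u0644\u200d 1 0,255,0,255,0,0,0,0,0,0 Inherited 4 18 4 \u200d\u0644\u200d",
    "\u0645 1 0,64,134,241,51,272,0,46,56,313 Arabic 6 13 6 \u0645",
]

if __name__ == "__main__":
    for unichar in missing_metrics(SAMPLE):
        # Print code points so the invisible joiner is easy to spot.
        print(" ".join(f"{ord(c):x}" for c in unichar))
```

To check a real file, read it with encoding="utf-8" and pass the lines to missing_metrics.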
But there is no unicharset for "Inherited" in langdata, and without one the training
results are not very good.
For example, I fine-tuned for "لا", which falls under "Inherited".
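One possible explanation for the "Inherited" rows, offered as an assumption rather than something confirmed in this thread: each of them begins with U+200D ZERO WIDTH JOINER, whose Unicode script is Inherited and whose bidi class is BN, and the script and direction columns seem to follow the first code point. The Python sketch below checks only the direction part. The ICU_DIRECTION table is assumed to match ICU's UCharDirection numbering and covers just the values seen in the listing above; leading_direction is a hypothetical helper.

```python
import unicodedata

# Assumed ICU UCharDirection numbers for the bidi classes seen in the listing.
ICU_DIRECTION = {"L": 0, "CS": 6, "ON": 10, "AL": 13, "BN": 18}


def leading_direction(unichar):
    """Return (bidi class, assumed ICU number) of a unichar's first code point."""
    bidi = unicodedata.bidirectional(unichar[0])
    return bidi, ICU_DIRECTION.get(bidi)


if __name__ == "__main__":
    samples = {
        "lam, joined on both sides": "\u200d\u0644\u200d",
        "lam, joined on one side": "\u0644\u200d",
        "isolated meem": "\u0645",
        "full stop": ".",
    }
    for label, unichar in samples.items():
        print(label, leading_direction(unichar))
```

If this reading is right, entries that start with the joiner get direction 18 and entries that start with the letter get 13, which matches the two groups in the listing.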
Can you please tell me how to create the xheight file for a Persian font, what to do
about "Inherited", and which RTL flags are appropriate for the Persian language?
Thanks
On Thursday, August 31, 2017 at 5:32:59 PM UTC+4:30, shree wrote:
>
> Use tesstrain.sh for training.
>
> It should apply the appropriate RTL flags for persian language.
>
> ShreeDevi
> ____________________________________________________________
> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>
> On Thu, Aug 31, 2017 at 2:39 PM, Ava Nimaee <[email protected]> wrote:
>
>> Hi, I need your help.
>> I need to create a box file and a unicharset for the Persian language. I used the
>> same syntax that I used for Latin, but the results are reversed. Could you please
>> tell me how to do this?
>> Thanks

-- 
You received this message because you are subscribed to the Google Groups 
"tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/tesseract-ocr/d9dc6ac8-9803-4596-adf4-79fcf6fb5559%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
