As I understand it, the only difference between an RTL language and an LTR language is in the unicharset. I created the unicharset with its tool, but how can I create the xheight for Persian?

Here is my unicharset after converting it to RTL:

36
NULL 0 NULL 0
Joined 7 0,69,188,255,486,1218,0,30,486,1188 Latin 1 0 1 Joined # Joined [4a 6f 69 6e 65 64 ]a
|Broken|0|1 f 0,69,186,255,892,2138,0,80,892,2058 Common 2 10 2 |Broken|0|1 # Broken
س 1 0,255,0,255,0,0,0,0,0,0 Arabic 3 13 3 س # س [633 200d ]x
ل 1 0,255,0,255,0,0,0,0,0,0 Inherited 4 18 4 ل # ل [200d 644 200d ]x
ا 1 0,255,0,255,0,0,0,0,0,0 Inherited 5 18 5 ا # ا [200d 627 ]x
م 1 0,64,134,241,51,272,0,46,56,313 Arabic 6 13 6 م # م [645 ]x
ع 1 0,255,0,255,0,0,0,0,0,0 Arabic 7 13 7 ع # ع [639 200d ]x
ی 1 0,255,0,255,0,0,0,0,0,0 Inherited 8 18 8 ی # ی [200d 6cc 200d ]x
ک 1 0,255,0,255,0,0,0,0,0,0 Inherited 9 18 9 ک # ک [200d 6a9 200d ]x
م 1 0,255,0,255,0,0,0,0,0,0 Inherited 10 18 10 م # م [200d 645 ]x
م 1 0,255,0,255,0,0,0,0,0,0 Arabic 11 13 11 م # م [645 200d ]x
ه 1 0,255,0,255,0,0,0,0,0,0 Inherited 12 18 12 ه # ه [200d 647 200d ]x
ذ 1 0,255,0,255,0,0,0,0,0,0 Inherited 13 18 13 ذ # ذ [200d 630 ]x
ا 1 26,117,200,255,11,181,7,82,33,222 Arabic 14 13 14 ا # ا [627 ]x
ک 1 0,255,0,255,0,0,0,0,0,0 Arabic 15 13 15 ک # ک [6a9 200d ]x
ج 1 0,255,0,255,0,0,0,0,0,0 Inherited 16 18 16 ج # ج [200d 62c 200d ]x
ی 1 0,255,0,255,0,0,0,0,0,0 Arabic 17 13 17 ی # ی [6cc 200d ]x
ی 1 0,255,0,255,0,0,0,0,0,0 Inherited 18 18 18 ی # ی [200d 6cc ]x
ش 1 0,255,0,255,0,0,0,0,0,0 Arabic 19 13 19 ش # ش [634 200d ]x
م 1 0,255,0,255,0,0,0,0,0,0 Inherited 20 18 20 م # م [200d 645 200d ]x
ل 1 0,255,0,255,0,0,0,0,0,0 Arabic 21 13 21 ل # ل [644 200d ]x
ن 1 0,255,0,255,0,0,0,0,0,0 Inherited 22 18 22 ن # ن [200d 646 ]x
ب 1 0,255,0,255,0,0,0,0,0,0 Inherited 23 18 23 ب # ب [200d 628 200d ]x
ز 1 0,255,0,255,0,0,0,0,0,0 Inherited 24 18 24 ز # ز [200d 632 ]x
ت 1 0,255,0,255,0,0,0,0,0,0 Inherited 25 18 25 ت # ت [200d 62a ]x
. 10 12,108,64,140,18,52,9,77,52,193 Common 26 6 26 . # . [2e ]p
و 1 0,68,137,238,65,290,0,27,62,256 Arabic 27 13 27 و # و [648 ]x
ن 1 0,255,0,255,0,0,0,0,0,0 Arabic 28 13 28 ن # ن [646 200d ]x
س 1 0,255,0,255,0,0,0,0,0,0 Inherited 29 18 29 س # س [200d 633 200d ]x
ن 1 0,88,163,255,68,321,0,52,76,354 Arabic 30 13 30 ن # ن [646 ]x
ب 1 0,255,0,255,0,0,0,0,0,0 Arabic 31 13 31 ب # ب [628 200d ]x
و 1 0,255,0,255,0,0,0,0,0,0 Inherited 32 18 32 و # و [200d 648 ]x
پ 1 0,255,0,255,0,0,0,0,0,0 Arabic 33 13 33 پ # پ [67e 200d ]x
ر 1 0,255,0,255,0,0,0,0,0,0 Inherited 34 18 34 ر # ر [200d 631 ]x
ی 1 0,71,148,225,95,253,0,45,103,279 Arabic 35 13 35 ی # ی [6cc ]x

However, "Inherited" does not have any unicharset in langdata, and without it the training results are poor. For example, I am fine-tuning for "لا", which falls under "Inherited".

Could you please tell me:
- how to create the xheight for a Persian font,
- what to do about "Inherited", and
- which RTL flags are appropriate for the Persian language?

Thanks

On Thursday, August 31, 2017 at 5:32:59 PM UTC+4:30, shree wrote:
>
> Use tesstrain.sh for training.
>
> It should apply the appropriate RTL flags for the Persian language.
>
> ShreeDevi
> ____________________________________________________________
> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>
> On Thu, Aug 31, 2017 at 2:39 PM, Ava Nimaee <[email protected]> wrote:
>
>> Hi, I need your help.
>> I need to create a box file and a unicharset for the Persian language. I used the
>> syntax that I used for Latin, but the results are reversed. Could you please
>> tell me how to do this?
>> Thanks
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To post to this group, send email to [email protected].
>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/tesseract-ocr/42bf0393-8b56-43c2-b88d-af68b4967c71%40googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.
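
For anyone who wants to inspect a unicharset like the one posted at the top of this message, here is a rough Python sketch that groups the entries by their script column and reports whether each entry starts with U+200D (ZERO WIDTH JOINER). In the listing, every entry tagged "Inherited" has 200d as its first code point, which is what the sketch makes visible. The field layout is assumed from that listing only (entry count on the first line, then unichar, properties, an optional comma-separated metrics field, script), and the function name group_by_script is made up for illustration; it is not part of Tesseract.

```python
from collections import defaultdict

ZWJ = "\u200d"  # ZERO WIDTH JOINER


def group_by_script(unicharset_text):
    """Group unicharset entries by script name.

    Assumed layout (taken from the listing in this thread, not from any
    official spec): the first line is the entry count, and each following
    line is '<unichar> <properties> [<metrics>] <script> ...'.
    Returns {script: [(unichar, starts_with_zwj), ...]}.
    """
    groups = defaultdict(list)
    lines = unicharset_text.strip().splitlines()
    for line in lines[1:]:  # skip the leading entry count
        fields = line.split()
        if len(fields) < 3:
            continue  # not enough columns to contain a script name
        unichar = fields[0]
        # Entries such as NULL have no comma-separated metrics field,
        # so the script name sits one column earlier.
        if "," in fields[2] and len(fields) > 3:
            script = fields[3]
        else:
            script = fields[2]
        groups[script].append((unichar, unichar.startswith(ZWJ)))
    return dict(groups)


# A few entries from the listing, written with explicit escapes so that
# the invisible joiner characters are not lost when copying the text.
sample = "\n".join([
    "4",
    "NULL 0 NULL 0",
    "\u0633\u200d 1 0,255,0,255,0,0,0,0,0,0 Arabic 3 13 3 \u0633\u200d",
    "\u200d\u0644\u200d 1 0,255,0,255,0,0,0,0,0,0 Inherited 4 18 4 \u200d\u0644\u200d",
    "\u0645 1 0,64,134,241,51,272,0,46,56,313 Arabic 6 13 6 \u0645",
])

groups = group_by_script(sample)
for script, entries in groups.items():
    for unichar, leading_zwj in entries:
        # Print code points in hex instead of the glyphs themselves.
        codepoints = " ".join(f"{ord(ch):x}" for ch in unichar)
        print(f"{script}: [{codepoints}] leading ZWJ: {leading_zwj}")
```

To run it on a real file, read the file as UTF-8 and pass its contents to group_by_script.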

