Thank you for looking into this matter. Your work is really important
and much appreciated.

2017-03-29 23:34 GMT+01:00 Ray Smith <[email protected]>:

> Thanks for spotting this!
> I understand why it makes this error, but it will take some thought to fix
> it properly!
> It is using a sort by x-position to re-order the boxes for RTL language
> training, but that doesn't work in the case of heavily kerned characters
> like ل in your example.
> It needs to simply reverse the RTL characters, but has to avoid messing up
> the order of the common script, which is why I was using a sort to begin
> with.
> https://github.com/tesseract-ocr/tesseract/blob/master/training/boxchar.cpp#L202
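
A minimal sketch of the two strategies described above, written in Python rather than the C++ of boxchar.cpp. It assumes the boxes arrive in logical (reading) order and have to be emitted left to right. The `Box` type, the x-coordinates, and both function names are hypothetical and are not taken from Tesseract. The run-based version is a simplification, not the full Unicode bidirectional algorithm.

```python
import unicodedata
from dataclasses import dataclass


@dataclass
class Box:
    ch: str    # one character
    left: int  # hypothetical x-coordinate of the box's left edge


def keeps_ltr_order(ch):
    # Strong left-to-right letters and digits keep their own order
    # even inside right-to-left text.
    return unicodedata.bidirectional(ch) in ("L", "EN", "AN")


def sorted_by_x(boxes):
    # Strategy 1: sort by x-position. Breaks when kerning pulls the box
    # of one character past the box of its neighbour.
    return sorted(boxes, key=lambda b: b.left)


def reversed_by_runs(boxes):
    # Strategy 2: ignore geometry. Reverse the whole line, then restore
    # the internal order of every run of left-to-right characters.
    out = list(reversed(boxes))
    i = 0
    while i < len(out):
        if not keeps_ltr_order(out[i].ch):
            i += 1
            continue
        j = i
        while j < len(out) and keeps_ltr_order(out[j].ch):
            j += 1
        out[i:j] = out[i:j][::-1]
        i = j
    return out


# Logical order: lam, alef, space, "1", "2". The lam is kerned so far
# left that its box starts before the alef's box (28 < 30).
line = [Box("\u0644", 28), Box("\u0627", 30), Box(" ", 20),
        Box("1", 0), Box("2", 10)]

print([b.ch for b in sorted_by_x(line)])       # lam lands left of alef
print([b.ch for b in reversed_by_runs(line)])  # alef stays left of lam
```

With these made-up coordinates the x-sort swaps the lam and the alef, while the run-based reversal keeps the digits in order and places the alef to the left of the lam, as expected for right-to-left text.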
>
> On Thursday, March 9, 2017 at 5:11:49 AM UTC-8, El Fakir Zakaria wrote:
>>
>> I noticed that Tesseract 4 reads الأ as األ, which is pretty close: we only
>> need to switch the position of the last 2 letters to get ا ل أ. This also
>> happens with similar word forms, for example لا is read as ال and should be
>> ل ا, and I would like to correct it.
>> Can someone show me how to fix this, or maybe update the Arabic data?
>> Thank you for your time.
>>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/tesseract-ocr.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/d993e1d4-1978-40f8-9917-331613925457%40googlegroups.com.
>
> For more options, visit https://groups.google.com/d/optout.
>

