Dear Daniel,

Is there an easier way to do this? I normally work with GUI tools, and 
this is my problem:

I'm trying to train Tesseract for Kurdish; the result should also be useful 
for Persian. Kurdish has a few additional letters, but it is written the same 
way as Arabic or Farsi. The problem is that the final OCR output runs from 
left to right instead of from right to left, so the text is unreadable even 
though the individual letters are correct. I use qt-box-editor to edit the 
box files, then Serak Tesseract Trainer V0.4 to train the OCR, and finally I 
put the traineddata file in the Tesseract directory. Everything goes well 
except that the right-to-left writing direction of Arabic script is missing.


So is there any way to change the unicharset file with a GUI instead of the 
command line?

Thanks a lot in advance,
Karo

On Monday, 15 July 2013 01:02:59 UTC+2, Daniel wrote:
>
> Thanks WHITE N. & sdk.
>
> Both of you helped me so much! Thank you!
>
> For anybody else who is looking for a solution to this problem (an 
> incorrect unicharset file generated by unicharset_extractor): I also 
> ported the Python script that corrects the unicharset file to PHP, so if 
> anyone needs the code, send me an email and I will send it to you.
>
>
> On Sunday, July 7, 2013 12:45:18 PM UTC+3, Daniel wrote:
>>
>> Hi everyone,
>>
>> I am working on a project that requires training for RTL languages 
>> (Hebrew and Arabic).
>> After the training process everything works great, except that the 
>> text is output as LTR text.
>> Is there a flag to set during the training process that tells Tesseract 
>> to treat the trained file as an RTL language file so that it outputs the 
>> text in the right order?
>>
>> Thanks for helping!
>> Daniel
>>
>
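
The Python script mentioned in the quoted message is not included in this 
thread. As a rough sketch of what such a correction might look like, the 
code below derives a direction code for each character from Python's 
unicodedata module and writes it into one field of a unicharset line. It 
assumes that the field holds a value from ICU's UCharDirection enum and that 
it is the sixth space-separated field (index 5); both should be checked 
against the unicharset documentation for the Tesseract version in use. The 
names direction_code and fix_line are made up for this illustration.

```python
import unicodedata

# Bidi class names returned by unicodedata.bidirectional(), mapped to the
# numeric values of ICU's UCharDirection enum (e.g. R = 1, AL = 13).
ICU_DIRECTION = {
    "L": 0, "R": 1, "EN": 2, "ES": 3, "ET": 4, "AN": 5, "CS": 6,
    "B": 7, "S": 8, "WS": 9, "ON": 10, "LRE": 11, "LRO": 12,
    "AL": 13, "RLE": 14, "RLO": 15, "PDF": 16, "NSM": 17, "BN": 18,
}


def direction_code(text):
    """Return the direction code of the first character of `text`.

    Classes missing from the table fall back to 0 (left to right);
    that fallback is a choice made for this sketch.
    """
    return ICU_DIRECTION.get(unicodedata.bidirectional(text[0]), 0)


def fix_line(line, direction_index=5):
    """Rewrite the assumed direction field of one unicharset line.

    Lines with too few fields (the count header, or short entries such
    as "NULL 0 NULL 0") are returned unchanged.
    """
    fields = line.split(" ")
    if len(fields) <= direction_index or not fields[0]:
        return line
    fields[direction_index] = str(direction_code(fields[0]))
    return " ".join(fields)
```

To process a whole file, one would read it with UTF-8 encoding, apply 
fix_line to each line with the trailing newline stripped, and write the 
result to a new file, so that the original unicharset stays available for 
comparison.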

-- 
-- 
You received this message because you are subscribed to the Google
Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/tesseract-ocr?hl=en
