There is no need to change the "Tesseract executable" setting. You need an 
entry in the .font_properties file for the arialunicodems font.

I strongly suggest you re-read the training wiki before continuing.

https://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3
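
For reference, here is a minimal sketch of what such an entry looks like, assuming the format described on the training wiki: one line per font, with the font name followed by five 0/1 flags (italic, bold, fixed, serif, fraktur). The helper names and the example file name ara.arialunicodems.exp0.box are hypothetical and only for illustration; the font name has to match the one used in your own tif/box file names.

```python
# Sketch only: the helper names and the example file name are hypothetical.

def font_properties_line(font, italic=False, bold=False, fixed=False,
                         serif=False, fraktur=False):
    """Build one line: <font> <italic> <bold> <fixed> <serif> <fraktur>."""
    flags = (italic, bold, fixed, serif, fraktur)
    return " ".join([font] + [str(int(f)) for f in flags])


def font_from_training_name(filename):
    """Extract the font part from a name shaped like <lang>.<font>.exp<N>.<ext>."""
    parts = filename.split(".")
    if len(parts) < 4 or not parts[2].startswith("exp"):
        raise ValueError("expected <lang>.<font>.exp<N>.<ext>: " + filename)
    return parts[1]


if __name__ == "__main__":
    font = font_from_training_name("ara.arialunicodems.exp0.box")
    # Assumption: Arial Unicode MS is a plain sans-serif face, so every flag is 0.
    print(font_properties_line(font))
```

Running it prints "arialunicodems 0 0 0 0 0", which would go on its own line in the .font_properties file.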

On Thursday, November 20, 2014 8:07:35 AM UTC-6, iram akbar wrote:
>
> It seems it's a known issue with Serak. I have created an "ara" folder with 
> the same files as the "vie" folder in jTessBoxEditor, as you can see in the 
> attachment. After that I set the box file path in jTessBoxEditor's 
> "Tesseract executable" and "Training data" fields for "ara", as attached. 
> When I click the "Run" button I get the attached error. I don't know what is 
> going wrong here.
> Question: Am I giving the wrong file in the "Tesseract executable" and 
> "Training data" paths, i.e. the ara box file? Or what else is going wrong?
> Note: I have not put any data in the words_list, frequent_words, or 
> font_properties files.
>
>
> On 20 November 2014 17:32, ShreeDevi Kumar <shree...@gmail.com> wrote:
>
>> I have not used Serak - but the issues page there indicates problems with 
>> RTL languages - see 
>> https://code.google.com/p/serak-tesseract-trainer/issues/detail?id=6
>>
>> Why are you not using jTessBoxEditor's trainer or the command-line 
>> programs? I think the binaries are bundled with JTess...
>>
>>
>>
>> ShreeDevi
>> ____________________________________________________________
>> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>>
>> On Thu, Nov 20, 2014 at 4:26 PM, iram akbar <irama...@gmail.com> wrote:
>>
>>> Hello Shree,
>>>
>>> I am having an issue while training Arabic in Serak (for box file 
>>> generation I am using jTessBoxEditor). I am doing some testing: I have 
>>> assigned English letters to a single Arabic word and created the box 
>>> file as attached (jtessbox file). After following the whole training 
>>> process in Serak I got the OCR result as attached. As you can see, the box 
>>> file contains 4 letters, "A,B,C,D", so I was expecting the OCR result to be 
>>> ABCD, but the result is BDBBAABBBBA, as attached (serak result).
>>> Question: Why am I getting that result? Did something go wrong while making 
>>> the box file in jTessBoxEditor, or while training in Serak?
>>>
>>> On Monday, 10 November 2014 15:30:21 UTC+5, shree wrote:
>>>>
>>>> Look under jtessboxeditor/samples/vie folder
>>>>
>>>> and create similar files for your language
>>>>
>>>> ShreeDevi
>>>>
>>>> On Mon, Nov 10, 2014 at 1:10 PM, iram akbar <irama...@gmail.com> wrote:
>>>>
>>>>> Quan,
>>>>> I am able to generate some files with jTessBoxEditor, but I am having 
>>>>> an issue: when I select "Train with existing box" or "Train from Scratch" 
>>>>> under the *Trainer* tab I get the attached message.
>>>>> Question: How can I generate the Arabic.font_properties, 
>>>>> Arabic.frequent_word_list and Arabic.words_list files using 
>>>>> jTessBoxEditor?
>>>>>
>>>>> On Friday, 7 November 2014 19:42:37 UTC+5, Quan Nguyen wrote:
>>>>>>
>>>>>> Look in the samples folder for a working example. You can start out 
>>>>>> from a UTF-8 text file about two pages long, generate TIFF/Box from it, 
>>>>>> and prepare the other necessary input files for training. You can train 
>>>>>> entirely in jTessBoxEditor.
>>>>>>
>>>>>> On Thursday, November 6, 2014 6:19:53 AM UTC-6, iram akbar wrote:
>>>>>>>
>>>>>>> Thank you for your help, but my issue still exists. If I need to 
>>>>>>> generate the TIFF from an image of text, I am unable to, as it only 
>>>>>>> asks to load a text file, not an image file. Second, if I have lots 
>>>>>>> of documents, I need to copy and paste first and then generate the 
>>>>>>> TIFF. Can anyone else help me with this?
>>>>>>> Question: How can I input the Arabic text image in jTessBoxEditor 
>>>>>>> to generate a TIFF (as attached)?
>>>>>>>
>>>>>>> On Thursday, 6 November 2014 16:38:25 UTC+5, shree wrote:
>>>>>>>>
>>>>>>>> Click on the 'generate' box. With some Devanagari fonts I have 
>>>>>>>> found that the text does not display but the tiff/box files are 
>>>>>>>> still generated. It may be the same for the Arabic font you are 
>>>>>>>> using. Give it a try.
>>>>>>>>
>>>>>>>> You can also try to copy and paste the text; sometimes that works.
>>>>>>>>
>>>>>>>>
>>>>>>>> ShreeDevi
>>>>>>>>
>>>>>>>>
>>>>>>>>  -- 
>>>>> You received this message because you are subscribed to the Google 
>>>>> Groups "tesseract-ocr" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>>> an email to tesseract-oc...@googlegroups.com.
>>>>> To post to this group, send email to tesser...@googlegroups.com.
>>>>> Visit this group at http://groups.google.com/group/tesseract-ocr.
>>>>> To view this discussion on the web visit https://groups.google.com/d/
>>>>> msgid/tesseract-ocr/d7396d3d-c4d1-4fcc-a58d-6cc02927989c%
>>>>> 40googlegroups.com 
>>>>> <https://groups.google.com/d/msgid/tesseract-ocr/d7396d3d-c4d1-4fcc-a58d-6cc02927989c%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>> .
>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>
>>>>
>>>
>>
>>
>
>

