Please see https://github.com/Shreeshrii/tesstrain-xsa
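
Two lines in the log quoted below point at input files that could not be opened: ./langdata/Latin.unicharset and lstm.train. Here is a minimal sketch of a pre-flight check you could run before training. The function name is hypothetical, and the configs/ subdirectory for lstm.train is my assumption about the usual tessdata layout, so adjust it to your install.

```python
from pathlib import Path


def missing_training_inputs(langdata_dir, tessdata_dir, script="Latin"):
    """Return the expected input files that are not present on disk.

    Assumption: lstm.train sits under <tessdata_dir>/configs/.
    The script name defaults to Latin because that is the file the
    quoted log failed to load.
    """
    expected = [
        Path(langdata_dir) / f"{script}.unicharset",
        Path(tessdata_dir) / "configs" / "lstm.train",
    ]
    return [str(p) for p in expected if not p.is_file()]


if __name__ == "__main__":
    # Directories taken from the quoted log (--script_dir and TESSDATA_PREFIX).
    for path in missing_training_inputs("./langdata", "./tessdata"):
        print("missing:", path)
```

If it prints nothing, both files are where this sketch expects them, and the cause is probably elsewhere.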

On Sat, Mar 7, 2020 at 6:54 PM Shree Devi Kumar <[email protected]>
wrote:

> I have created an example traineddata for xsa. I will upload it later today.
> You can then modify it with a larger training text and run training.
>
> On Sat, Mar 7, 2020, 02:58 aby tesh <[email protected]> wrote:
>
>> I think it is most likely right-to-left. It has passed that error now
>> using eng, since I only have the traineddata for it. The other issue I am
>> encountering is:
>>
>> === Starting training for language 'eng'
>> [Sat 07 Mar 2020 12:26:06 AM EAT] /usr/bin/text2image
>> --fonts_dir=./sabaean_fonts/ --ptsize 12 --font=Sabaean
>> --outputbase=/tmp/fc-cache/sample_text.txt
>> --text=/tmp/fc-cache/sample_text.txt --fontconfig_tmpdir=/tmp/fc-cache
>> Fontconfig warning: "/tmp/fc-cache/fonts.conf", line 4: Use of ambiguous
>> path in <dir> element. please add prefix="cwd" if current behavior is
>> desired.
>> Stripped 1 unrenderable words
>> Rendered page 0 to file /tmp/fc-cache/sample_text.txt.tif
>>
>> === Phase I: Generating training images ===
>> Rendering using Sabaean
>> [Sat 07 Mar 2020 12:26:08 AM EAT] /usr/bin/text2image
>> --fontconfig_tmpdir=/tmp/fc-cache --fonts_dir=./sabaean_fonts/
>> --strip_unrenderable_words --leading=32 --xsize=3600 --char_spacing=0.0
>> --exposure=0 --outputbase=/tmp/eng-2020-03-07.lif/eng.Sabaean.exp0
>> --max_pages=0 --font=Sabaean --ptsize 12
>> --text=./tesslang/eng/eng.training_text
>> Fontconfig warning: "/tmp/fc-cache/fonts.conf", line 4: Use of ambiguous
>> path in <dir> element. please add prefix="cwd" if current behavior is
>> desired.
>> Stripped 2 unrenderable words
>> Rendered page 0 to file /tmp/eng-2020-03-07.lif/eng.Sabaean.exp0.tif
>>
>> === Phase UP: Generating unicharset and unichar properties files ===
>> [Sat 07 Mar 2020 12:26:08 AM EAT] /usr/bin/unicharset_extractor
>> --output_unicharset /tmp/eng-2020-03-07.lif/eng.unicharset --norm_mode 1
>> /tmp/eng-2020-03-07.lif/eng.Sabaean.exp0.box
>> Failed to read data from: /tmp/eng-2020-03-07.lif/eng.Sabaean.exp0.box
>> Wrote unicharset file /tmp/eng-2020-03-07.lif/eng.unicharset
>> [Sat 07 Mar 2020 12:26:08 AM EAT] /usr/bin/set_unicharset_properties -U
>> /tmp/eng-2020-03-07.lif/eng.unicharset -O
>> /tmp/eng-2020-03-07.lif/eng.unicharset -X
>> /tmp/eng-2020-03-07.lif/eng.xheights --script_dir=./langdata
>> Loaded unicharset of size 3 from file
>> /tmp/eng-2020-03-07.lif/eng.unicharset
>> Setting unichar properties
>> Setting script properties
>> Failed to load script unicharset from:./langdata/Latin.unicharset
>> Writing unicharset to file /tmp/eng-2020-03-07.lif/eng.unicharset
>>
>> === Phase E: Generating lstmf files ===
>> Using TESSDATA_PREFIX=./tessdata/
>> [Sat 07 Mar 2020 12:26:08 AM EAT] /usr/bin/tesseract
>> /tmp/eng-2020-03-07.lif/eng.Sabaean.exp0.tif
>> /tmp/eng-2020-03-07.lif/eng.Sabaean.exp0 --psm 6 lstm.train
>> read_params_file: Can't open lstm.train
>> Tesseract Open Source OCR Engine v4.1.1 with Leptonica
>> Page 1
>> ERROR: /tmp/eng-2020-03-07.lif/eng.Sabaean.exp0.lstmf does not exist or
>> is not readable
>>
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/tesseract-ocr/ee9d5e16-328e-480d-ab2c-4ca4de708381%40googlegroups.com
>>
>
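
The quoted log reports several failures in a row, and the earliest one is usually the one worth chasing first, since later steps may only be failing because of it. Here is a rough sketch that pulls those lines out of a saved log. The three patterns are simply the ones that appear in the log above, not a complete list of Tesseract messages, and the function name is hypothetical.

```python
import re

# Patterns copied from the failure lines in the quoted log (not exhaustive).
FAILURE = re.compile(r"Failed to|Can't open|ERROR:")


def failure_lines(log_text):
    """Return the lines that report a failure, in the order they occur.

    Leading '>' quote markers and spaces are stripped so the function
    also works on a log pasted from an email reply.
    """
    hits = []
    for raw in log_text.splitlines():
        line = raw.lstrip("> ").strip()
        if FAILURE.search(line):
            hits.append(line)
    return hits
```

Calling failure_lines(open("train.log").read())[0] on a saved log (train.log is a placeholder name) gives the first reported failure.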

-- 

____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

