Also see the san.config file in the langdata directory.
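
To see exactly which entries account for the difference (1645 vs 1165), the two unicharset files can be compared directly. Below is a minimal Python sketch. It assumes the usual unicharset layout: the first line is the entry count, and each following line begins with the unichar followed by space-separated properties. The function names are my own, for illustration only, and are not part of Tesseract.

```python
def read_unichars(text):
    """Return the unichar column from the contents of a unicharset file.

    Assumption: line 1 is the entry count; every later non-empty line
    starts with the unichar, then a space, then its properties.
    """
    lines = text.splitlines()
    # Skip the header line and keep only the first space-separated field.
    return [line.split(" ", 1)[0] for line in lines[1:] if line.strip()]


def diff_unicharsets(text_a, text_b):
    """Return (entries only in A, entries only in B), each sorted."""
    a = set(read_unichars(text_a))
    b = set(read_unichars(text_b))
    return sorted(a - b), sorted(b - a)
```

Read each file with `open(path, encoding="utf-8").read()` and pass the two strings in. The entries that appear only in the larger file should mostly be the extra conjunct forms.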

ShreeDevi
____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

On Wed, Sep 21, 2016 at 2:28 PM, ShreeDevi Kumar <shreesh...@gmail.com>
wrote:

> I used the new text2image program with the tesstrain.sh utility to
> generate the box/tiff pairs, and trained on a large set of fonts.
>
> See
> https://github.com/Shreeshrii/imagessan/blob/master/san95-langdata/language-specific.sh
> and
> https://github.com/Shreeshrii/imagessan/blob/master/san95-langdata/san.font_properties
>
> The additional unicharset entries are probably there because different
> Devanagari fonts have different numbers of glyphs for conjuncts (especially
> Siddhanta, Sanskrit2003, Chandas and Uttara).
>
> I do not currently have that setup (cygwin/msys2 with tesseract) to test.
>
> ShreeDevi
> ____________________________________________________________
> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>
> On Wed, Sep 21, 2016 at 12:54 PM, RKVS Raman <rkvsra...@gmail.com> wrote:
>
>> Hello ShreeDevi,
>>
>> Thanks for clarifying the current status of cube. I worked with
>> Tesseract a long time ago, so I will leave the cube module for the time
>> being and focus on using the old adaptive classifier.
>>
>> BTW, I tried training with
>> https://github.com/Shreeshrii/imagessan/blob/master/san95-langdata/san.training_text
>> on Noto Sans Devanagari and could get only 1165 entries in the unicharset.
>>
>> How did you manage to get 1645 entries with it?
>>
>>
>>
>> Best Regards
>> -Raman
>>
>>
>>
>> On Wed, Sep 21, 2016 at 9:21 AM, ShreeDevi Kumar <shreesh...@gmail.com>
>> wrote:
>>
>>> For Sanskrit, please see https://github.com/Shreeshrii/imagessan
>>>
>>> where I have added the training sources as well as the traineddata for two
>>> versions of training. In the testing I did on a small sample of images, they
>>> seemed to perform better than the 3.04 san.traineddata.
>>>
>>> You are welcome to try using them and provide feedback.
>>>
>>> ShreeDevi
>>> ____________________________________________________________
>>> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>>>
>>> On Tue, Sep 20, 2016 at 7:05 PM, rkvsraman <rkvsra...@gmail.com> wrote:
>>>
>>>> Hello,
>>>>
>>>> In tessdata, I see cube models only for Hindi and not for Marathi and
>>>> Sanskrit, though they use the same script.
>>>>
>>>> Any particular reason for this?
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "tesseract-ocr" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to tesseract-ocr+unsubscr...@googlegroups.com.
>>>> To post to this group, send email to tesseract-ocr@googlegroups.com.
>>>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>>>> To view this discussion on the web visit
>>>> https://groups.google.com/d/msgid/tesseract-ocr/83d5c408-c869-4c16-8847-78ba2d250763%40googlegroups.com.
>>>> For more options, visit https://groups.google.com/d/optout.
>>>>
>>>
>>
>
>
