Maybe http://dasi.cnr.it has something usable?
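
For the legacy-font question in the quoted message below, a converter can be as small as a lookup table. The following is only a minimal Python sketch. The key-to-letter assignments are hypothetical placeholders (the real ones have to be read off each legacy font), and `LEGACY_TO_UNICODE` and `legacy_to_unicode` are made-up names for illustration. The only thing taken as given is that Old South Arabian letters live in the Unicode block U+10A60 to U+10A7F.

```python
# Hypothetical mapping from the Latin keys of a legacy font to code points
# in the Old South Arabian Unicode block (U+10A60..U+10A7F).
# These three pairs are placeholders, NOT the mapping of any real font.
LEGACY_TO_UNICODE = {
    "h": "\U00010A60",
    "l": "\U00010A61",
    "m": "\U00010A63",
}


def legacy_to_unicode(text, table=LEGACY_TO_UNICODE):
    """Replace every mapped legacy character; leave all others unchanged."""
    return "".join(table.get(ch, ch) for ch in text)


if __name__ == "__main__":
    converted = legacy_to_unicode("hl m")
    # Print the code points so the result is readable without a suitable font.
    print([f"U+{ord(ch):04X}" for ch in converted])
```

The converted text would then be saved as UTF-8 and could serve as training text. Legacy text may be stored in visual rather than logical order, which this sketch does not handle.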

Shree Devi Kumar <[email protected]> wrote on Sun, 15 Mar 2020, 16:55:

> There is no online corpus for xsa that I could find.
>
> Two of the fonts you sent are legacy fonts, that is, they map English
> letters to ancient Arabic characters.
>
> Are there any converters that convert from the legacy mapping to Unicode?
>
> If there is existing text in legacy fonts, it can be converted to Unicode
> and that can be used for training.
>
> On Sun, Mar 15, 2020, 17:57 aby tesh <[email protected]> wrote:
>
>> Where can I get the training text, or can I create a new one? I have a
>> problem writing with the fonts, some of which are included in the
>> attachment I sent you.
>>
>> On Sunday, March 15, 2020 at 4:32:08 AM UTC+3, shree wrote:
>>>
>>> I had used the findfonts feature of text2image and found only two fonts
>>> that rendered the xsa text. I will check the fonts that you sent. What
>>> about training text? Unless you have some more text, it will be difficult
>>> to do training.
>>>
>>> Quivira
>>> Segoe UI Historic
>>>
>>> On Sun, Mar 15, 2020, 04:01 aby tesh <[email protected]> wrote:
>>>
>>>> That is what I am not getting. I don't think they are all Unicode
>>>> fonts; I couldn't get one. Some render on my machine (Linux), some don't.
>>>>
>>>> On Saturday, March 14, 2020 at 8:45:46 PM UTC+3, shree wrote:
>>>>>
>>>>> Are all these Unicode fonts?
>>>>>
>>>>> What about training text in utf-8 Unicode encoding?
>>>>>
>>>>> On Sat, Mar 14, 2020, 22:37 aby tesh <[email protected]> wrote:
>>>>>
>>>>>> Hey shree, I have compiled all relevant fonts and attached them
>>>>>> below. I am not sure how I can generate text data with them.
>>>>>>
>>>>>> On Tuesday, March 10, 2020 at 5:35:26 AM UTC+3, shree wrote:
>>>>>>>
>>>>>>> If you can share a large enough training text and fonts, I can rerun
>>>>>>> the training.
>>>>>>>
>>>>>>> On Tue, Mar 10, 2020, 03:41 aby tesh <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hey,
>>>>>>>>
>>>>>>>> I followed the steps in the readme file and started the
>>>>>>>> lstmtraining, but it seems my current computer's processor can't
>>>>>>>> handle the training for a longer period of time.
>>>>>>>>
>>>>>>>> What can I do about it? When should I abort the training to get a
>>>>>>>> good traineddata file? Or is there an accurate one that you can
>>>>>>>> share?
>>>>>>>>
>>>>>>>> Thanks

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/CANuFvMcdEir5VQr0RJCkBKaS-0C%3DE2EaPUpezxtqyKwaRcTAUw%40mail.gmail.com.

