[Corpora-List] Re: Church Slavonic resources

Ales Horak via Corpora Wed, 29 Mar 2023 07:42:47 -0700

Dear Alexander,

within the Czech AHISTO project we have OCRed about 300,000
pages from Czech medieval sources FONTES related to the Hussite
era.


The current corpus data contain more than 3 million sentences
(84 million tokens) mostly in Old Czech (36 million tokens),
German and Latin. The corpus is available for download at
https://nlp.fi.muni.cz/trac/ahisto/wiki/NerDataset#Corpus

kind regards,
-- 
Ales Horak
Natural Language Processing Centre (NLP Centre)
Faculty of Informatics
Masaryk University
Brno, Czech Republic



Alexander Osherenko via Corpora wrote on Mar 29, 2023:
> Hi,
> 
> I'm looking for digital Old Church Slavonic resources such as corpora,
> treebanks, wordnets or raw texts. I am aware of the GORAZD: The Old Church
> Slavonic Digital Hub <http://www.gorazd.org/?q=en/node/21> and the TOROT
> treebank at https://universaldependencies.org, but maybe I am missing something.
> Thanks, Alexander
> --
> Alexander Osherenko, Dr. rer. nat.
> Research Associate
> Bavarian Academy of Sciences and Humanities <http://badw.de/>
> Profile: Socioware Development <http://www.socioware.de/osherenko_page.html>
> Profile: Humboldt-Universität zu Berlin
> <https://wirsindhumboldt.de/de/VKkZNyFaeu>
> Profile: ResearchGate
> <https://www.researchgate.net/profile/Alexander_Osherenko>
> Channel: Youtube <https://www.youtube.com/user/MrOsherenko>

> _______________________________________________
> Corpora mailing list -- [email protected]
> https://list.elra.info/mailman3/postorius/lists/corpora.list.elra.info/
> To unsubscribe send an email to [email protected]