[Corpora-List] DANTE resurrected: public release of the DANTE English lexical database

Miloš Jakubíček via Corpora Wed, 09 Oct 2024 05:47:46 -0700

We are pleased to announce the public release of the DANTE lexical database,
originally developed by Sue Atkins, Adam Kilgarriff and Michael Rundell in
2010 for Foras na Gaeilge, which recently decided to release the dataset
under the CC-BY licence.


"DANTE – the Database of ANalysed Texts of English – is a lexical database
which provides a corpus-based description of the core vocabulary of
English. It records the semantic, grammatical, combinatorial, and text-type
characteristics of over 42,000 single-word lemmas and 23,000 compounds and
phrasal verbs, and it also includes over 27,000 idioms and phrases."
(Rundell & Atkins, 2010)

The dataset is available through a Lexonomy instance at
https://dantedictionary.com/ (including an API) and as raw data at
https://github.com/lexicalcomputing/dante.

Regards,

Miloš Jakubíček
Lexical Computing

References:
* https://dantedictionary.com/
* https://github.com/lexicalcomputing/dante
* Convery, C., Ó Mianáin, P., Ó Raghallaigh, M., Atkins, S., Kilgarriff,
A., & Rundell, M. (2010). The DANTE Database (Database of ANalysed Texts of
English) [Conference paper]. Proceedings of the XIV EURALEX International
Conference.

_______________________________________________
Corpora mailing list -- [email protected]
https://list.elra.info/mailman3/postorius/lists/corpora.list.elra.info/
To unsubscribe send an email to [email protected]
