We are proud to announce the public availability of the DANTE lexical database,
originally developed in 2010 by Sue Atkins, Adam Kilgarriff and Michael Rundell
for Foras na Gaeilge, which recently decided to release the dataset under the
CC-BY licence.

"DANTE – the Database of ANalysed Texts of English – is a lexical database
which provides a corpus-based description of the core vocabulary of
English. It records the semantic, grammatical, combinatorial, and text-type
characteristics of over 42,000 single-word lemmas and 23,000 compounds and
phrasal verbs, and it also includes over 27,000 idioms and phrases."
(Rundell & Atkins, 2010)

The dataset is available through a Lexonomy instance running at
https://dantedictionary.com/ (including an API) and as raw data at
https://github.com/lexicalcomputing/dante.

Regards,

Miloš Jakubíček
Lexical Computing

References:
* https://dantedictionary.com/
* https://github.com/lexicalcomputing/dante
* Convery, C., Ó Mianáin, P., Ó Raghallaigh, M., Atkins, S., Kilgarriff,
A., & Rundell, M. (2010). The DANTE Database (Database of ANalysed Texts of
English) [Conference paper]. Proceedings of the XIV EURALEX International
Congress.