We are excited to announce the release of the first parsed corpus of spoken 
Dutch dialects, the Gesproken Corpus van de zuidelijk-Nederlandse Dialecten 
(GCND). This resource offers extensive data for linguistic research and is now 
accessible online.

Corpus Highlights:
* Speakers: 1,206 individuals, with the eldest born in 1871.
* Geographical Coverage: 639 distinct locations.
* Audio Data: Over 430 hours of recordings across 650 sessions.
* Transcriptions: Over 600 time-aligned, highly detailed transcriptions.
* Total Tokens: Approximately 4.77 million.
* GrETEL Treebank: 50,111 verified sentences and 452,459 verified tokens.

These figures represent the corpus as of its initial release. Ongoing efforts, 
supported by additional funding (GCND+), aim to expand the corpus with more 
transcriptions, including northern dialects from the Meertens Institute 
collection, and to enhance grammatical annotations. The latest updates are 
available through the corpus application.

Access Information:
The GCND is available online:
* GCND corpus application (requires CLARIN login): https://gcnd.ivdnt.org/
* GCND project website: https://www.gcnd.ugent.be/

Acknowledgments:
This project was made possible by funding from the Research Foundation 
Flanders and by the dedicated efforts of numerous student assistants, 
volunteers, and our project partners.

The GCND team (at Ghent University):
Anne Breitbarth ([email protected])
Anne-Sophie Ghyselen ([email protected])
Melissa Farasyn ([email protected])
Lien Hellebaut ([email protected])
