Dear all,
 
(Our apologies if you have received multiple copies of this announcement)
 
We are happy to announce that the new corpus of the European Parliament is now 
available.
 
The Digital Corpus of the European Parliament (DCEP) contains the majority of 
the documents published on the European Parliament's official website. It 
comprises a variety of document types, from press releases to session and 
legislative documents related to the European Parliament's activities and 
bodies.
 
The current version covers a wide range of subject domains. With a total of 
1.37 billion words in 23 languages (253 language pairs), gathered over the 
course of ten years, this is the largest single release of documents by a 
European Union institution. It includes documents produced between 2001 and 
2012, excluding only those already present in the Europarl corpus, to avoid 
overlap.
 
To download the corpus and for more information, see: 
https://ec.europa.eu/jrc/en/language-technologies/dcep
 
For a more detailed description of DCEP, and when citing DCEP in scientific 
publications, please refer to:

Hajlaoui Najeh, Kolovratnik David, Väyrynen Jaakko, Steinberger Ralf, and Varga 
Dániel (2014). DCEP - Digital Corpus of the European Parliament. Proc. LREC 2014 
(Language Resources and Evaluation Conference). Reykjavik, Iceland. May 26-31, 
2014. pp. 3164-3171 
(URL: http://www.lrec-conf.org/proceedings/lrec2014/pdf/943_Paper.pdf).

Best regards,
Najeh Hajlaoui

Dr. Najeh HAJLAOUI
Project Manager for Machine Translation
DG TRAD – European Parliament
L-2929 Luxembourg
E-mail: [email protected]
Please consider the environment before printing this email
 
 
 
 