Dear all,

(Our apologies if you have received multiple copies of this announcement.)

We are happy to announce that the new corpus of the European Parliament is now available.

The Digital Corpus of the European Parliament (DCEP) contains the majority of the documents published on the European Parliament's official website. It comprises a variety of document types, from press releases to session and legislative documents related to the European Parliament's activities and bodies, and covers a wide range of subject domains. With a total of 1.37 billion words in 23 languages (253 language pairs), gathered in the course of ten years, this is the largest single release of documents by a European Union institution. It includes documents produced between 2001 and 2012, excluding only those already in the Europarl corpus, to avoid overlap.

To download the corpus and for more information, see:
https://ec.europa.eu/jrc/en/language-technologies/dcep

For a more detailed description of DCEP, and when referring to DCEP in scientific publications, please refer to:
Hajlaoui Najeh, Kolovratnik David, Väyrynen Jaakko, Steinberger Ralf, and Varga Dániel (2014). DCEP - Digital Corpus of the European Parliament. Proc. LREC 2014 (Language Resources and Evaluation Conference). Reykjavik, Iceland, May 26-31, 2014, pp. 3164-3171 (URL: http://www.lrec-conf.org/proceedings/lrec2014/pdf/943_Paper.pdf).

Best regards,
Najeh Hajlaoui

Dr. Najeh HAJLAOUI
Project Manager for Machine Translation
DG TRAD – European Parliament
L-2929 Luxembourg
E-mail: [email protected]
